Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems

Appeared in Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST '09).

Abstract

The scale of today's storage systems has made it increasingly difficult to find and manage files. To address this, we have developed Spyglass, a file metadata search system that is specially designed for large-scale storage systems. Using an optimized design, guided by an analysis of real-world metadata traces and a user study, Spyglass allows fast, complex searches over file metadata to help users and administrators better understand and manage their files. Spyglass achieves fast, scalable performance through the use of several novel metadata search techniques that exploit metadata search properties. Flexible index control is provided by an index partitioning mechanism that leverages namespace locality. Signature files are used to significantly reduce a query's search space, improving performance and scalability. Snapshot-based metadata collection allows incremental crawling of only modified files. A novel index versioning mechanism provides both fast index updates and "back-in-time" search of metadata. An evaluation of our Spyglass prototype using our real-world, large-scale metadata traces shows search performance that is 1-4 orders of magnitude faster than existing solutions. The Spyglass index can quickly be updated and typically requires less than 0.1% of disk space. Additionally, metadata collection is up to 10x faster than existing approaches.
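As a rough illustration of how index partitioning and signature files might work together to shrink a query's search space, the sketch below groups files by namespace subtree and keeps a small bit-array signature per partition. It is a minimal sketch under stated assumptions, not the Spyglass implementation: the `Partition` class, the 64-bit signature width, the SHA-1 based hash, and the choice of file extension as the indexed attribute are all hypothetical and chosen only for illustration.

```python
import hashlib

SIG_BITS = 64  # hypothetical signature width, not taken from the paper


def _bit(value: str) -> int:
    """Map an attribute value to one bit position in the signature."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SIG_BITS


class Partition:
    """Hypothetical index partition covering one namespace subtree."""

    def __init__(self, subtree: str):
        self.subtree = subtree
        self.files = []          # (path, extension) pairs in this subtree
        self.ext_signature = 0   # bit array summarizing extensions seen

    def add(self, path: str, ext: str) -> None:
        self.files.append((path, ext))
        self.ext_signature |= 1 << _bit(ext)

    def may_contain(self, ext: str) -> bool:
        # A clear bit proves absence; a set bit may be a false positive.
        return bool((self.ext_signature >> _bit(ext)) & 1)


def search_by_ext(partitions, ext):
    """Return matching paths and the number of partitions actually scanned."""
    hits, scanned = [], 0
    for part in partitions:
        if not part.may_contain(ext):
            continue  # signature rules this partition out without reading it
        scanned += 1
        hits.extend(path for path, e in part.files if e == ext)
    return hits, scanned
```

In this sketch a query consults each partition's signature first and scans only the partitions whose signature bit is set. Signatures can produce false positives, which cost an unnecessary scan, but never false negatives, so results stay correct.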

Publication date:
February 2009

Authors:
Andrew Leung
Minglong Shao
Timothy Bisson
Shankar Pasupathy
Ethan L. Miller

Projects:
Scalable File System Indexing
Ultra-Large Scale Storage

Bibtex entry

@inproceedings{leung-fast09,
  author       = {Andrew Leung and Minglong Shao and Timothy Bisson and Shankar Pasupathy and Ethan L. Miller},
  title        = {Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems},
  booktitle    = {Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST '09)},
  month        = feb,
  year         = {2009},
}
Last modified 28 May 2019