Quasar: A Scalable Naming Language for Very Large File Collections

Published as Storage Systems Research Center Technical Report UCSC-SSRC-08-04.

Abstract

As storage capacities increase, managing petabytes of data becomes increasingly challenging. One reason is the POSIX file system interface, which was originally designed in the 1970s for file collections many orders of magnitude smaller than those found in today’s petabyte-scale storage systems. We show the scalability problems of the naming language imposed by POSIX, that is, the language used to identify an individual file or a group of files. We identify three features common to popular applications that manage large file collections: search, attributes, and relationships. The increasing size of file collections has already motivated file system designers to include support for these features so that highly optimized implementations can be shared across all applications. Existing approaches treat these features as add-ons to the POSIX naming language. One consequence of this lack of integration is that searches cannot be scoped to a fragment of a file system name space, which makes search hard to scale to very large file collections. We present a naming language (Quasar) that offers operators for search and view specification within file systems. Quasar supports scope limiting by subtrees and by link distance. A Quasar name expands into a collection of Quasar names that represent a connected graph. We evaluate Quasar by contrasting its use with SQL and XPath in scenarios that are typical for very large file collections.

Publication date:
October 2008

Authors:
Sasha Ames
Carlos Maltzahn
Ethan L. Miller

Projects:
Scalable File System Indexing

Available media

Full paper text: PDF

Bibtex entry

@techreport{ames-ssrctr0804,
  author       = {Sasha Ames and Carlos Maltzahn and Ethan L. Miller},
  title        = {Quasar: A Scalable Naming Language for Very Large File Collections},
  institution  = {University of California, Santa Cruz},
  number       = {UCSC-SSRC-08-04},
  month        = oct,
  year         = {2008},
}
Last modified 5 Aug 2020