CRSS publication: Efficiently Identifying Working Sets in Block I/O Streams

Appeared in Proceedings of the 4th Annual International Systems and Storage Conference (SYSTOR 2011).

Abstract

Identifying groups of blocks that tend to be read or written together in a given environment is the first step towards powerful techniques for device failure isolation and power management. For example, identified groups can be placed together on a single disk, avoiding excess drive activity across an exascale storage system. Unlike previous grouping work, we focus on identifying groupings in data that can be gathered from real, running systems with minimal impact. Using temporal, spatial, and access ordering information from an enterprise data set, we identified a set of groupings that consistently appear, indicating that these are working sets that are likely to be accessed together. We present several techniques to obtain groupings, along with a discussion of which techniques best apply to particular types of real systems. We intend to use these preliminary results to inform our search for new types of workloads. Our goals are to identify properties of easily separable workloads across different systems and to dynamically move groups in these workloads to reduce disk activity in large storage systems.

Publication date:
May 2011

Authors:
Avani Wildani
Lee Ward
Ethan L. Miller

Projects:
Reliable Storage
Prediction and Grouping

Available media

Full paper text: PDF

Bibtex entry

@inproceedings{wildani-systor11,
  author       = {Avani Wildani and Lee Ward and Ethan L. Miller},
  title        = {Efficiently Identifying Working Sets in Block I/O Streams},
  booktitle    = {Proceedings of the 4th Annual International Systems and Storage Conference (SYSTOR 2011)},
  month        = may,
  year         = {2011},
}

Last modified 5 Aug 2020