Metadata Traces and Workload Models for Evaluating Big Storage Systems
UCC(2012)
Abstract
Efficient namespace metadata management is increasingly important as next-generation file systems are designed for petascale and exascale. New schemes have been proposed; however, their evaluation has been insufficient due to a lack of appropriate namespace metadata traces. Specifically, no Big Data storage system metadata trace is publicly available, and existing traces are a poor replacement. We studied publicly available traces and one Big Data trace from Yahoo!, and we note some of the differences and their implications for metadata management studies. We discuss the insufficiency of existing evaluation approaches and present a first step towards a statistical metadata workload model that can capture the relevant characteristics of a workload and is suitable for synthetic workload generation. We describe Mimesis, a synthetic workload generator, and evaluate its usefulness through a case study of a least recently used (LRU) metadata cache for the Hadoop Distributed File System. Simulation results show that the traces generated by Mimesis mimic the original workload and can be used in place of the real trace while still providing accurate results.
Keywords
big storage systems,available trace,synthetic workload generation,synthetic workload generator,metadata cache,appropriate namespace metadata trace,statistical metadata workload model,workload models,metadata traces,metadata trace,big data trace,original workload,efficient namespace metadata management,big data,storage system,meta data,statistical analysis,internet,public domain software,metadata