On brewing fresh espresso: LinkedIn's distributed data serving platform.

Lin Qiao, Kapil Surlaker, Shirshanka Das, Tom Quiggle, Bob Schulman, Bhaskar Ghosh, Antony Curtis, Oliver Seeliger, Zhen Zhang, Aditya Auradkar, Chris Beaver, Gregory Brandt, Mihir Gandhi, Kishore Gopalakrishna, Wai Ip, Swaroop Jagadish, Shi Lu, Alexander Pachev, Aditya Ramesh, Abraham Sebastian, Rupa Shanbhag, Subbu Subramaniam, Yun Sun, Sajid Topiwala, Cuong Tran, Jemiah Westerman, David Zhang

SIGMOD/PODS'13: International Conference on Management of Data, New York, New York, USA, June 2013


Abstract

Espresso is a document-oriented distributed data serving platform built to address LinkedIn's requirements for a scalable, performant, source-of-truth primary store. It provides a hierarchical document model, transactional support for modifications to related documents, real-time secondary indexing, on-the-fly schema evolution, and a timeline-consistent change capture stream. This paper describes the motivation and design principles involved in building Espresso, the data model and capabilities exposed to clients, and details of the replication and secondary indexing implementation, and presents a set of experimental results that characterize the performance of the system along various dimensions. When we set out to build Espresso, we chose to apply best practices in industry, already published works in research, and our own internal experience with different consistency models. Along the way, we built a novel generic distributed cluster management framework, a partition-aware change-capture pipeline, and a high-performance inverted index implementation.
