Data Infrastructure at LinkedIn

Data Engineering (2012)

Abstract

LinkedIn is among the largest social networking sites in the world. As the company has grown, our core data sets and request processing requirements have grown as well. In this paper, we describe a few selected data infrastructure projects at LinkedIn that have helped us accommodate this increasing scale. Most of these projects build on existing open source projects and are themselves available as open source. The projects covered in this paper include: (1) Voldemort, a scalable and fault-tolerant key-value store; (2) Databus, a framework for delivering database changes to downstream applications; (3) Espresso, a distributed data store that supports flexible schemas and secondary indexing; and (4) Kafka, a scalable and efficient messaging system for collecting user activity events and log data.

Keywords

downstream application, open source project, selected data infrastructure project, data bus, data infrastructure, log data, database change, core data set, data store, open source, fault tolerant key-value store, servers, distributed databases, public domain software, software fault tolerance, database indexing, espresso, indexes, routing, indexation, pipelines, fault tolerant, distributed data store
