Architecture, operation, and dependability of large-scale Internet services: three case studies

IEEE Internet Computing (2002)

Abstract
We describe the architecture and operational practices of three representative large-scale Internet services, and the causes of failure in two of them. We find convergence on a common architecture: division of nodes into service front-ends and back-ends, multiple levels of redundancy and load-balancing, and use of custom-written software for both production services and administrative tools. Operationally, we find a thin line between service developers and operators, and a need to coordinate problem detection and repair across administrative domains. Networking problems and operator error are the most significant contributors to failures in the systems we examined.
Keywords
service architecture, system architecture, reliability, maintainability, dependability, internet service, internet, availability, load balance, front end