Trends and outlook for the massive-scale analytics stack

IBM Journal of Research and Development (2013)

Abstract
Massive-scale analytics (MSA) applications are characterized by the large amount of data that they process and the complexity of algorithms used to process the data. The ideal MSA system will not only support processing of large amounts of data but also offer a high degree of parallelism and support scheduling and resource allocation of complex workloads. Designers of MSA systems must provide three necessities: programming abstractions, runtime systems, and hardware. Historically, two communities have undertaken the task of designing MSA systems: the database community, which has argued for an SQL (Structured Query Language)-influenced processing paradigm, and the high-performance computing community, which has focused on developing infrastructures for highly efficient, but complex, parallel implementations. These two communities have developed disparate technologies to meet the necessities of MSA systems, and the solutions provided by the individual communities are not completely satisfactory. In this paper, we attempt to characterize the strengths and weaknesses of the approaches of these two communities at all levels of the MSA stack, characterize implications with respect to resource management within the MSA system, and define how an MSA system should be designed.
Keywords
database community, ideal MSA system, massive-scale analytics, resource allocation, high-performance computing community, large amount, MSA system, complex workloads, runtime system, resource management, individual community