Analytics at Scale: Evolution at Infrastructure and Algorithmic Levels

Mohammed Al-Kateb,Mohamed Y. Eltabakh, Awny Al-Omari, Paul G. Brown

2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022)(2022)

引用 0|浏览6
暂无评分
摘要
Data Analytics is at the core of almost all modern ap-plications ranging from science and finance to healthcare and web applications. The evolution of data analytics over the last decade has been dramatic - new methods, new tools and new platforms - with no slowdown in sight. This rapid evolution has pushed the boundaries of data analytics along several axis including scalability especially with the rise of distributed infrastructures and the Big Data era, and interoperability with diverse data management systems such as relational databases, Hadoop and Spark. However, many analytic application developers struggle with the challenge of production deployment. Recent experience suggests that it is difficult to deliver modern data analytics with the level of reliability, security and manageability that has been a feature of traditional SQL DBMSs. In this tutorial, we discuss the advances and innovations introduced at both the infrastructure and algorithmic levels, directed at making analytic workloads scale, while paying close attention to the kind of quality of service guarantees different technology provide. We start with an overview of the classical centralized analytical techniques, describing the shift towards distributed analytics over non-SQL infrastructures. We contrast such approaches with systems that integrate analytic functionality inside, above or adjacent to SQL engines. We also explore how Cloud platforms' virtualization capabilities make it easier - and cheaper - for end users to apply these new analytic techniques to their data. Finally, we conclude with the learned lessons and a vision for the near future.
更多
查看译文
关键词
component, formatting, style, styling, insert
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要