Mini-Apps For High Performance Data Analysis

2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)(2016)

Abstract
Scaling up scientific data analysis and machine learning algorithms for data-driven discovery is a grand challenge today. Despite the growing need for analysis in science domains that generate 'Big Data' from instruments and simulations, building high-performance analytical workflows of data-intensive algorithms has been daunting because: (i) the 'Big Data' hardware and software architecture landscape is constantly evolving, (ii) newer architectures impose new programming models, and (iii) the data-parallel kernels of analysis algorithms and their performance facets on different architectures are poorly understood. To address these problems, we have: (i) identified scalable data-parallel kernels of popular data analysis algorithms, (ii) implemented 'Mini-Apps' of those kernels using different programming models (e.g., MapReduce, MPI), and (iii) benchmarked and validated the performance of the kernels on diverse architectures. In this paper, we discuss two of those Mini-Apps and show the execution of principal component analysis built as a workflow of the Mini-Apps. We show that Mini-Apps (i) enable scientists to write domain-specific data analysis code that scales on most HPC hardware and (ii) offer the ability (most times with over a 10x speed-up) to analyze data sizes 100 times what today's off-the-shelf desktops and workstations can handle.
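To make the idea of principal component analysis as a workflow of data-parallel kernels concrete, here is a minimal sketch in plain Python. The abstract does not say which two Mini-Apps the paper discusses, so the decomposition below is an assumption: a map/reduce kernel that accumulates sufficient statistics per data partition, followed by a small eigen-solver kernel (power iteration) on the resulting covariance matrix. All function names (`partial_stats`, `merge_stats`, `covariance`, `top_component`) are hypothetical and are not taken from the paper.

```python
import math
from functools import reduce


def partial_stats(chunk):
    """Map kernel (assumed): per-partition row count, column sums,
    and sum of outer products. Each partition is processed independently."""
    d = len(chunk[0])
    sums = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for row in chunk:
        for i in range(d):
            sums[i] += row[i]
            for j in range(d):
                outer[i][j] += row[i] * row[j]
    return len(chunk), sums, outer


def merge_stats(a, b):
    """Reduce kernel (assumed): element-wise sum of two partial results."""
    n = a[0] + b[0]
    sums = [x + y for x, y in zip(a[1], b[1])]
    outer = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return n, sums, outer


def covariance(stats):
    """Sample covariance matrix from the merged sufficient statistics."""
    n, sums, outer = stats
    d = len(sums)
    mean = [s / n for s in sums]
    return [[(outer[i][j] - n * mean[i] * mean[j]) / (n - 1)
             for j in range(d)] for i in range(d)]


def top_component(cov, iters=200):
    """Power iteration for the leading eigenvector. Starts from the all-ones
    vector, so it assumes that vector is not orthogonal to the answer."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy data lying on the line y = 2x, split into two partitions.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
chunks = [data[:2], data[2:]]

stats = reduce(merge_stats, map(partial_stats, chunks))
pc = top_component(covariance(stats))
print(pc)  # direction proportional to (1, 2)
```

Because the map step touches each partition independently and the reduce step is an associative sum, the same two kernels could in principle be expressed as a MapReduce job or an MPI reduction. The single-pass sum-of-outer-products formula is chosen here for brevity; it is less numerically robust than a two-pass, mean-centered computation.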
Keywords
Big Data, high performance data analytics, mini-apps, data analysis kernels, analytical motifs