Cloud based Real-Time and Low Latency Scientific Event Analysis

2018 IEEE International Conference on Big Data (Big Data), 2018

Abstract
Astronomy is widely recognized as a big-data-driven science. As new observation infrastructures are developed, sky survey cycles have shortened from a few days to a few seconds, shifting data processing pressure from offline to online. However, existing scientific databases focus on offline analysis of long-term historical data rather than real-time, low-latency analysis of large-scale newly arriving data. In this paper, a cloud-based method is proposed to efficiently analyze scientific events in large-scale newly arriving data. The solution is implemented as a highly efficient system named Aserv. A set of compact data store and index structures is proposed to describe scientific events, and a typical analysis pattern is formalized as a set of query operations. Four key methods optimize the performance of Aserv: a domain-aware filter, accuracy-aware data partitioning, a highly efficient index, and designs for frequently used statistical data. Experimental results in a typical cloud environment show that the presented optimization mechanisms meet the low-latency demand for both large data insertion and scientific event analysis. Aserv can insert 3.5 million rows of data within 3 seconds and perform the heaviest query on 6.7 billion rows of data, also within 3 seconds. Furthermore, a performance model is given to help Aserv choose the right cloud resource setup to meet a guaranteed real-time performance requirement.
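As a rough reading of the reported figures, the short sketch below converts the two "within 3 seconds" results into implied minimum rates. It only restates arithmetic on numbers from the abstract. The function and constant names are illustrative assumptions, and the query figure is an effective rows-covered rate, not a claim about how many rows Aserv actually scans.

```python
# Figures quoted in the abstract. Both latencies are upper bounds
# ("within 3 seconds"), so the derived rates are lower bounds.
INSERT_ROWS = 3_500_000        # rows inserted per batch
QUERY_ROWS = 6_700_000_000     # rows covered by the heaviest query
LATENCY_BOUND_S = 3.0          # reported latency bound, in seconds


def min_rows_per_second(rows: int, seconds: float) -> float:
    """Lower bound on rows handled per second for a given latency bound."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return rows / seconds


# Hypothetical names, used only for this illustration.
insert_rate = min_rows_per_second(INSERT_ROWS, LATENCY_BOUND_S)
query_rate = min_rows_per_second(QUERY_ROWS, LATENCY_BOUND_S)

print(f"insert: at least {insert_rate:,.0f} rows/s")
print(f"query:  at least {query_rate:,.0f} rows covered per second")
```

The insertion figure therefore corresponds to a sustained ingest rate of roughly 1.2 million rows per second or better, assuming the 3.5 million rows arrive as one batch.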
Keywords
cloud resource setup, real-time performance requirement, low latency scientific event, novel observation infrastructures, sky survey cycles, data processing pressure, scientific databases focus, offline analysis, long-term historical data, low latency analysis, cloud based method, scientific events, highly efficient system, Aserv, compact data store, index structures, typical analysis pattern, domain aware filter, accuracy aware data partition, statistical data designs, typical cloud environment show, presented optimization mechanism, low latency demand, data insertion, scientific event analysis, Big Data