Engineering Scalable Distributed Services for Real-Time Big Data Analytics

2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService) (2017)

Abstract
There is high demand for tools that analyze large sets of streaming data in both industrial and academic settings. While existing work has examined a wide range of issues, we focus on query support. In particular, we focus on giving analysts flexibility in the types of queries they can make on large data sets, both in real time and over historical data. We have designed and implemented a lightweight service-based framework, EPIC Real-Time, that manages a set of queries that can be applied to user-initiated data analysis events (such as studying tweets generated during a disaster). Our prototype combines stream processing and batch processing techniques inspired by the Lambda Architecture. We investigate a core set of query types that can answer a wide range of questions asked by analysts who study crisis events. In this paper, we present a prototype implementation of EPIC Real-Time that makes use of event-driven and reactive programming techniques. We also evaluate how efficiently the real-time and batch-oriented queries perform and how well they meet the needs of our analysts, and we provide insight into how EPIC Real-Time performs along a number of dimensions, including performance, usability, scalability, and reliability.
Keywords
social media analysis, lambda architecture, query support, crisis informatics