Impatience Is a Virtue: Revisiting Disorder in High-Performance Log Analytics

2018 IEEE 34th International Conference on Data Engineering (ICDE)(2018)

引用 6|浏览75
暂无评分
摘要
There is a growing interest in processing real-time queries over out-of-order streams in this big data era. This paper presents a comprehensive solution to meet this requirement. Our solution is based on Impatience sort, an online sorting technique that is based on an old technique called Patience sort. Impatience sort is tailored for incrementally sorting streaming datasets that present themselves as almost sorted, usually due to network delays and machine failures. With several optimizations, our solution can adapt to both input streams and query logic. Further, we develop a new Impatience framework that leverages Impatience sort to reduce the latency and memory usage of query execution, and supports a range of user latency requirements, without compromising on query completeness and throughput, while leveraging existing efficient in-order streaming engines and operators. We evaluate our proposed solution in Trill, a high-performance streaming engine, and demonstrate that our techniques significantly improve sorting performance and reduce memory usage - in some cases, by over an order of magnitude.
更多
查看译文
关键词
Sorting,streaming,out of order,disorder,performance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要