StreamBox: Modern Stream Processing on a Multicore Machine

2017 USENIX Annual Technical Conference (USENIX ATC '17), 2017

Abstract
Stream analytics on real-time events demands ever higher throughput and ever lower latency. Its performance on a single machine is central to meeting this demand, even in a distributed system. This paper presents a novel stream processing engine called StreamBox that exploits the parallelism and memory hierarchy of modern multicore hardware. StreamBox executes a pipeline of transforms over records that may arrive out of order. As records arrive, it groups them into ordered epochs delineated by watermarks. A watermark guarantees that no subsequent record's event timestamp will precede it.

Our contribution is to produce and manage abundant parallelism by generalizing out-of-order record processing within each epoch to out-of-order epoch processing, and by dynamically prioritizing epochs to optimize latency. We introduce a data structure called cascading containers, which dynamically manages concurrency and dependences among epochs in the transform pipeline. StreamBox creates a sequential memory layout of records in epochs and steers them to optimize NUMA locality. On a 56-core machine, StreamBox processes records at up to 38 GB/sec (38M records/sec) with 50 ms latency.
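The watermark guarantee described in the abstract can be illustrated with a minimal single-threaded sketch. It is not StreamBox's implementation and does not model cascading containers, parallelism, or NUMA placement. The names `Record`, `Watermark`, and `group_into_epochs` are hypothetical, and the rule "an epoch holds all buffered records whose event time precedes the closing watermark" is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple, Union


@dataclass(frozen=True)
class Record:
    event_time: int  # event timestamp, arbitrary units (hypothetical type)
    payload: str


@dataclass(frozen=True)
class Watermark:
    time: int  # promise: no record arriving later has event_time < time


def group_into_epochs(
    stream: Iterable[Union[Record, Watermark]],
) -> Iterator[Tuple[int, List[Record]]]:
    """Yield (watermark_time, records) pairs, one per watermark seen.

    Records may arrive out of order. A watermark closes an epoch that
    contains every buffered record with an earlier event time; records
    at or after the watermark stay buffered for a later epoch.
    """
    pending: List[Record] = []
    for item in stream:
        if isinstance(item, Watermark):
            closed = [r for r in pending if r.event_time < item.time]
            pending = [r for r in pending if r.event_time >= item.time]
            yield item.time, closed
        else:
            pending.append(item)


# Record "c" arrives before watermark 10 but belongs to the next epoch.
stream = [
    Record(3, "a"), Record(1, "b"), Record(12, "c"),
    Watermark(10),
    Record(11, "d"), Record(15, "e"),
    Watermark(20),
]
epochs = list(group_into_epochs(stream))
```

Within each epoch the records are left in arrival order, which reflects the idea that processing inside an epoch need not respect event-time order. Only the epoch boundaries depend on watermarks.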