谷歌浏览器插件
订阅小程序
在清言上使用

Resource Aware Scheduler for Distributed Stream Processing in Cloud Native Environments

Madushi Sarathchandra, Chulani Karandana, Winma Heenatigala,Miyuru Dayarathna,Sanath Jayasena

Concurrency and computation(2021)

引用 0|浏览15
暂无评分
摘要
Recently distributed stream processors are increasingly being deployed in cloud computing infrastructures. In this article, we study performance characteristics of distributed stream processing applications in Google Compute Engine which is based on Kubernetes. We identify performance gaps in terms of throughput which appear in such environments when using a round robin (RR) scheduling algorithm. As a solution, we propose resource aware stream processing scheduler called resource aware scheduler for stream processing applications in cloud native environments (RaspaCN). We implement RaspaCN's job scheduler using two-step process. First, we use machine learning to identify the optimal number of worker nodes. Second, we use RR and multiple Knapsack algorithms to produce performance optimal stream processing job schedules. With three application benchmarks called HTTP Log Processor, Nexmark, and Email Processor representing real world stream processing scenarios we evaluate the performance benefits obtained via RaspaCN's scheduling algorithm. RaspaCN could produce percentage increase of average throughput values by at least 37%, 38%, and 10%, respectively, for HTTP Log Processor, Nexmark, and Email Processor benchmarks for fixed input data rates. Furthermore, we conduct experiments with varying input data rates as well and show 7% improved average throughput for HTTP Log Processor. These experiments show the effectiveness of our proposed stream processor job scheduler for producing improved performance.
更多
查看译文
关键词
autoscaling,cloud computing,event&#8208,based systems,IaaS,Knapsack,Kubernetes,machine learning,scalability,software performance engineering,stream processing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要