Efficient Execution of Scientific Workflows in the Cloud Through Adaptive Caching.

Trans. Large Scale Data Knowl. Centered Syst.(2020)

Abstract
Many scientific experiments are now carried out using scientific workflows, which are becoming increasingly data-intensive and complex. We consider the efficient execution of such workflows in the cloud. Since workflow users commonly reuse other workflows or data generated by other workflows, a promising approach for efficient workflow execution is to cache intermediate data and exploit it to avoid task re-execution. In this paper, we propose an adaptive caching solution for data-intensive workflows in the cloud. Our solution is based on a new scientific workflow management architecture that automatically manages the storage and reuse of intermediate data and adapts to variations in task execution times and output data sizes. We evaluated our solution by implementing it in the OpenAlea system and performing extensive experiments on real data with a data-intensive application in plant phenotyping. The results show that adaptive caching can yield major performance gains, e.g., up to a factor of 3.5 with 6 workflow re-executions.
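To make the idea of caching intermediate data concrete, the following is a minimal Python sketch, not the paper's algorithm or the OpenAlea API. It assumes a hypothetical policy in which a task's output is kept only when the execution time saved per stored byte exceeds a threshold; the names `AdaptiveCache`, `cache_key`, and `min_seconds_per_byte` are invented for illustration.

```python
import hashlib
import json
import pickle
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


def cache_key(task_name: str, inputs: Dict[str, Any]) -> str:
    """Derive a stable key from the task name and its (JSON-serializable) inputs."""
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class AdaptiveCache:
    # Hypothetical threshold: seconds of compute saved per byte stored.
    min_seconds_per_byte: float = 1e-6
    store: Dict[str, Any] = field(default_factory=dict)

    def should_cache(self, exec_seconds: float, output_bytes: int) -> bool:
        """Assumed policy: cache outputs that are costly to recompute relative to their size."""
        if output_bytes == 0:
            return True
        return exec_seconds / output_bytes >= self.min_seconds_per_byte

    def run(self, task_name: str, func: Callable[..., Any], inputs: Dict[str, Any]) -> Any:
        """Return a cached result if present; otherwise execute and maybe cache."""
        key = cache_key(task_name, inputs)
        if key in self.store:
            return self.store[key]  # reuse intermediate data, skip re-execution
        start = time.perf_counter()
        result = func(**inputs)
        elapsed = time.perf_counter() - start
        size = len(pickle.dumps(result))  # rough proxy for the output data size
        if self.should_cache(elapsed, size):
            self.store[key] = result
        return result
```

In this sketch the decision adapts to each task's measured execution time and output size, which mirrors the two quantities the abstract names; a real system would also have to account for storage and data transfer costs in the cloud.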
Keywords
scientific workflows, cloud through adaptive caching, efficient execution