Fast, Transparent, and High-Fidelity Memoization Cache-Keys for Computational Workflows

2022 IEEE International Conference on Services Computing (SCC)(2022)

引用 3|浏览5
暂无评分
摘要
Computational workflows are important methods for automating complex data-generation and analysis pipelines. Workflows are composed of sub-graphs that perform specific tasks. Certain sub-graphs may appear in multiple workflows. This implies that the same task, with the same input, may execute multiple times thereby wasting computational resources. Additionally, hybrid cloud environments spanning across on-premises and public cloud deployments are increasingly popular. Identical tasks may not only execute multiple times but also on multiple heterogeneous environments. Memoization, a technique to reduce the number of redundant task executions, can tackle this problem. There is a significant body of work for the memoization of workflows. However, prior studies have only been able to satisfy at most two of the three desired features of workflow memoization: transparency (leveraged automatically), performance, and fidelity (identify desired redundant tasks). This paper first introduces a taxonomy for understanding memoization. Next, we describe a novel algorithm that uses a walk in the workflow graph to transparently generate fast and high-fidelity memoization cache-keys, overcoming limitations in prior approaches. We evaluate our scheme using three industry motivated quantum chemistry workflows and two geo-distributed OpenShift clusters. Our experimental evaluation results show the implemented scheme is fast (O(ms)) and can remove nontrivial sources of false positives and negatives that other methods may generate. Finally, we illustrate the proposed workflow memoization scheme can reduce the duration of our benchmarks by up to 10.55x nearly matching the theoretical maximum speedup of memoization for these workloads (10.69x).
更多
查看译文
关键词
computational workflows,memoization,hybrid cloud,workflow execution
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要