Towards A Scalable Distributed Fitness Evaluation Service

PARALLEL PROCESSING AND APPLIED MATHEMATICS, PPAM 2015, PT I(2016)

Abstract
Organizations across the globe gather more and more data. Large datasets require new approaches to analysis and processing, including methods based on machine learning. In particular, symbolic regression can provide many useful insights. Unfortunately, its high resource requirements can make it infeasible for large datasets. In this paper, we analyze a bottleneck in an open-source implementation of this method, which we call hubert. We identify the evaluation of individuals as the most costly operation. As a solution, we propose a new evaluation service based on the Apache Spark framework, which attempts to speed up computations by distributing them across a cluster of machines. We compare the performance of the service with that of the original implementation by measuring the execution time for a number of samples using both. We then discuss how the computation time improves as more resources are added. Finally, we draw conclusions and outline plans for further research.
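To make the idea of distributing the evaluation of individuals concrete, the following is a minimal sketch and not the service described in the paper. It assumes that an individual is a candidate expression f(x) and that fitness is the mean squared error over the dataset. A local thread pool stands in for the Spark cluster; the names `individuals`, `partition`, `partial_error`, and `evaluate` are hypothetical and chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical individuals: candidate expressions f(x) as Python callables.
individuals = {
    "x*x": lambda x: x * x,
    "2*x": lambda x: 2 * x,
    "x+1": lambda x: x + 1,
}


def partition(rows, n_parts):
    """Split rows into n_parts chunks (a stand-in for cluster partitions)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_error(individual, rows):
    """Map step: sum of squared errors of one individual on one partition."""
    return sum((individual(x) - y) ** 2 for x, y in rows)


def evaluate(individual, partitions, pool):
    """Reduce step: combine per-partition errors into a mean squared error."""
    partials = pool.map(lambda part: partial_error(individual, part), partitions)
    n_rows = sum(len(part) for part in partitions)
    return sum(partials) / n_rows


# Toy dataset sampled from y = x^2 (an assumption for this example).
data = [(x, x * x) for x in range(10)]
parts = partition(data, 4)

# The thread pool plays the role of the workers; a real deployment would
# hand the partitions to a cluster framework instead.
with ThreadPoolExecutor(max_workers=4) as pool:
    fitness = {name: evaluate(f, parts, pool) for name, f in individuals.items()}

# Lower error means a fitter individual.
best = min(fitness, key=fitness.get)
```

Because each partition contributes an independent partial sum, the per-individual cost can be spread over as many workers as there are partitions, which is the property a cluster-based evaluation service would rely on.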
Keywords
Distributed systems, Evolutionary programming, Symbolic regression, Scaling, Apache Spark