We find that monthly Volunteer Computing project costs range from 5K to 12K, and startup costs range from 4K to 43K.
Cost-benefit analysis of Cloud Computing versus desktop grids
IPDPS, pp.1-12, (2009)
Cloud Computing has taken commercial computing by storm. However, adoption of cloud computing platforms and services by the scientific community is in its infancy as the performance and monetary cost-benefits for scientific applications are not perfectly clear. This is especially true for desktop grids (aka volunteer computing) applicatio...
- Computational platforms have traditionally included clusters and computational Grids.
- Two cost-efficient and powerful platforms have emerged, namely cloud and volunteer computing.
- Cloud computing platforms provide easy access to a company’s high-performance computing and storage infrastructure through web services.
- Cloud computing platforms provide massive scalability, 99.999% reliability, high performance, and specifiable configurability.
- These capabilities are provided at relatively low costs compared to dedicated infrastructures
- We examine and answer the following questions:
- We find that for a relatively small project such as XtremLab, one must have at least ∼1404 volunteer nodes and wait at least ∼4 days before the Volunteer Computing system becomes cheaper per FLOP than EC2
- We find that at least ∼1404 volunteer nodes are needed before Volunteer Computing becomes more cost effective in terms of cents per FLOP
- We examined the size of a cloud platform sustainable by Volunteer Computing costs
- The authors assume a replication factor of 3, which is quite conservative, as projects such as World Community Grid use replication levels 50% lower.
- The authors determined the cost-benefits of cloud computing versus volunteer computing applications.
- The authors calculated VC overheads for platform construction, application deployment, compute rates, and completion times.
- The authors found that in the best-case scenario, hosts register at a rate of 124 cloud nodes per day.
- The authors found that the ratio of volunteer nodes needed to achieve the compute power of a small EC2 instance is about 2.83 active volunteer hosts to 1.
- The authors detailed the specific costs of a large and small VC project.
- If cloud computing systems are to replace VC platforms, pay-per-use costs would have to decrease by at least an order of magnitude.
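The figures quoted in the findings above can be related with some simple arithmetic. The following is a minimal sketch, not the authors' model: it takes the 2.83 hosts-per-instance ratio, the best-case rate of 124 cloud nodes per day, and the replication factor of 3 directly from the summary, and it assumes a constant registration rate. The function names are hypothetical, and the sketch does not assume anything about whether the 2.83 ratio already accounts for replication.

```python
# Figures quoted in the summary above.
HOSTS_PER_EC2_SMALL = 2.83   # active volunteer hosts per small EC2 instance
CLOUD_NODES_PER_DAY = 124    # best-case registration rate, in cloud nodes per day
REPLICATION_FACTOR = 3       # replication factor assumed by the authors


def ec2_small_equivalents(active_hosts: float) -> float:
    """Convert a number of active volunteer hosts into small-EC2-instance equivalents."""
    return active_hosts / HOSTS_PER_EC2_SMALL


def cloud_nodes_after(days: float, rate_per_day: float = CLOUD_NODES_PER_DAY) -> float:
    """Cloud-node equivalents gained after `days`, assuming a constant rate (an assumption)."""
    return rate_per_day * days


def days_to_reach(active_hosts: float, rate_per_day: float = CLOUD_NODES_PER_DAY) -> float:
    """Days needed to accumulate the cloud-node equivalent of `active_hosts`."""
    return ec2_small_equivalents(active_hosts) / rate_per_day


def useful_fraction(replication: float = REPLICATION_FACTOR) -> float:
    """Fraction of raw compute that yields distinct results when each task runs `replication` times."""
    return 1.0 / replication


if __name__ == "__main__":
    print(f"1404 hosts ~ {ec2_small_equivalents(1404):.1f} small EC2 instances")
    print(f"days to reach that at best-case rate: {days_to_reach(1404):.2f}")
    print(f"useful fraction at replication 3:   {useful_fraction(3):.3f}")
    print(f"useful fraction at replication 1.5: {useful_fraction(1.5):.3f}")
```

Under this reading, 1404 volunteer nodes correspond to roughly 496 small-instance equivalents, which at 124 cloud nodes per day takes about 4 days to accumulate. That is consistent with the "∼1404 volunteer nodes" and "∼4 days" figures quoted for XtremLab. A replication level 50% lower than 3 (that is, 1.5) would raise the useful fraction from one third to two thirds.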
- Table 1: Pricing for EC2 Instances
- Table 2: Pricing for EC2 Data Transfer
- Table 3: Pricing for EBS
- Table 4: Project Costs (monthly)
- Table 5: Project Resource Usage
- ... described in Section 5) is as follows. We assume the Scheduler and File Upload Handler execute over EC2. We assume the BOINC database is hosted on EBS. We assume the uploads, downloads, and science results are stored on S3.
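The hosting assumptions above (server processes on EC2, the BOINC database on EBS, files on S3) suggest how a monthly bill might be composed. The sketch below is illustrative only: the class and function names are hypothetical, every price is a caller-supplied parameter because the values of Tables 1 to 3 are not reproduced here, and per-request charges for EBS and S3 are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class CloudPrices:
    """Unit prices; all values must be supplied by the caller (none come from the paper)."""
    ec2_per_instance_hour: float
    ebs_per_gb_month: float
    s3_per_gb_month: float
    transfer_in_per_gb: float
    transfer_out_per_gb: float


@dataclass
class ServerUsage:
    """Monthly resource usage of a hypothetical VC project server hosted on the cloud."""
    ec2_instances: int       # instances running the Scheduler and File Upload Handler
    hours: float             # hours each instance runs during the month
    ebs_gb: float            # size of the BOINC database volume
    s3_gb: float             # uploads, downloads, and science results
    transfer_in_gb: float
    transfer_out_gb: float


def monthly_hosting_cost(prices: CloudPrices, usage: ServerUsage) -> dict:
    """Break a monthly bill into EC2, EBS, S3, and data-transfer components."""
    cost = {
        "ec2": usage.ec2_instances * usage.hours * prices.ec2_per_instance_hour,
        "ebs": usage.ebs_gb * prices.ebs_per_gb_month,
        "s3": usage.s3_gb * prices.s3_per_gb_month,
        "transfer": (usage.transfer_in_gb * prices.transfer_in_per_gb
                     + usage.transfer_out_gb * prices.transfer_out_per_gb),
    }
    cost["total"] = sum(cost.values())
    return cost


if __name__ == "__main__":
    # Placeholder numbers chosen only to exercise the function; they are not real prices.
    prices = CloudPrices(0.1, 0.1, 0.2, 0.1, 0.2)
    usage = ServerUsage(ec2_instances=2, hours=720, ebs_gb=50, s3_gb=500,
                        transfer_in_gb=100, transfer_out_gb=300)
    for item, amount in monthly_hosting_cost(prices, usage).items():
        print(f"{item:>8}: {amount:.2f}")
```

Keeping the components separate makes it easy to see which service dominates the bill for a given project profile.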
- In , the authors consider the Amazon data storage service S3 for scientific data-intensive applications. They conclude that monetary costs are high because the storage service bundles availability, durability, and access performance together, whereas data-intensive applications often do not need all three features at once. In , the authors determine the performance of MPI applications over Amazon’s EC2. They find that MPI distributed-memory parallel programs and OpenMP shared-memory parallel programs perform significantly worse over the cloud than in "out-of-cloud" clusters. In , the author conducts a general cost-benefit analysis of clouds, but considers no specific type of scientific application. In , the authors determine the cost of running a scientific workflow over a cloud. They find that computational costs outweighed storage costs for their Montage application. By contrast, we consider a different type of application (namely batches of embarrassingly parallel, compute-intensive tasks) and a cost-effective platform consisting of volunteered resources.
- The project is funded by NSF and is based at the UC Berkeley Space Sciences Laboratory
- D. Anderson. Boinc: A system for public-resource computing and storage. In Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, Pittsburgh, USA, 2004.
- Artur Andrzejak, Derrick Kondo, and David P. Anderson. Ensuring collective availability in volatile resource pools via forecasting. In DSOM, pages 149–161, 2008.
- Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Timothy L. Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the art of virtualization. In SOSP, pages 164–177, 2003.
- J. Bezos. Amazon.com: Amazon EC2, Amazon Elastic Compute Cloud, Virtual Grid Computing: Amazon Web Services. http://www.amazon.com/gp/browse.html?node=201590011.
- BOINC Papers. http://boinc.berkeley.edu/trac/wiki/BoincPapers.
- Catalog of BOINC projects. http://boinc-wiki.ath.cx/index.php?title=Catalog_of_BOINC_Powered_Projects.
- Attila Csaba Marosi, Peter Kacsuk, Gilles Fedak, and Oleg Lodygensky. Using virtual machines in desktop grid clients for application sandboxing. Technical Report TR-0140, Institute on Architectural Issues: Scalability, Dependability, Adaptability, CoreGRID Network of Excellence, August 2008.
- Cycle Computing Inc.
- E. Deelman, S. Gurmeet, M. Livny, J. Good, and B. Berriman. The Cost of Doing Science in the Cloud: The Montage Example. In Proc. of Supercomputing’08, Austin, 2008.
- IO statistics fields. http://devresources.linux-foundation.org/dev/robustmutexes/src/fusyn.hg/Documentation/iostats.txt.
- How to Deploy a BOINC server on the Amazon Elastic Compute Cloud. http://boinc.berkeley.edu/trac/wiki/CloudServer.
- Excel file for EC2 costs. http://mescal.imag.fr/membres/derrick.kondo/cloud_calc.xlsx.
- Amazon EC2 pricing. http://aws.amazon.com/ec2/faqs.
- T. Estrada, D. Flores, M. Taufer, P. Teller, A. Kerstens, and D. Anderson. The Effectiveness of Threshold-based Scheduling Policies in BOINC Projects. In Proceedings of the 2nd IEEE International Conference on e-Science and Grid Technologies (eScience 2006), December 2006.
- Folding@home Papers. http://folding.stanford.edu/English/Papers.
- Arijit Ganguly, Abhishek Agrawal, P. Oscar Boykin, and Renato J. O. Figueiredo. Wow: Self-organizing wide area overlay networks of virtual workstations. J. Grid Comput., 5(2):151–172, 2007.
- Simson Garfinkel. Commodity grid computing with Amazon's S3 and EC2. In login, 2007.
- Eric Martin Heien, David P. Anderson, and Kenichi Hagihara. Computing Low Latency Batches with Unreliable Workers in Volunteer Computing Environments. 2008. under submission.
- Eric Martin Heien, Noriyuki Fujimoto, and Kenichi Hagihara. Computing low latency batches with unreliable workers in volunteer computing environments. In Workshop on Volunteer Computing and Desktop Grids (PCGrid), pages 1–8, 2008.
- D. Kondo, M. Taufer, C. Brooks, H. Casanova, and A. Chien. Characterizing and Evaluating Desktop Grids: An Empirical Study. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS’04), April 2004.
- Derrick Kondo, David P. Anderson, and John McLeod. Performance evaluation of scheduling policies for volunteer computing. In eScience, pages 415–422, 2007.
- P. Malecot, D. Kondo, and G. Fedak. Xtremlab: A system for characterizing internet desktop grids (abstract). In Proceedings of the 6th IEEE Symposium on High-Performance Distributed Computing, 2006.
- Mayur Palankar, Adriana Iamnitchi, Matei Ripeanu, and Simson Garfinkel. Amazon S3 for Science Grids: a Viable Solution? In Data-Aware Distributed Computing Workshop (DADC), 2008.
- K. Reed. Personal communication, 2008.
- Ben Segal. Personal communication, June 2008.
- BOINC stats for SETI. http://boincstats.com/stats/project_graph.php?pr=sah&view=hosts.
- Vijay Pande. Private communication, 2004.
- Edward Walker. Benchmarking Amazon EC2 for high-performance scientific computing. In USENIX LOGIN, 2008.
- World community grid.
- Cloud Computing. http://en.wikipedia.org/wiki/Cloud_computing.
Derrick Kondo is a research scientist at INRIA Rhône-Alpes, Grenoble. He received his Bachelor’s in Computer Science at Stanford University, and his Master’s and Ph.D. in Computer Science from the University of California at San Diego. He was an INRIA post-doctoral fellow at LRI (Computer Research Laboratory) at the University of Paris-Sud. He founded and serves as Chair/Co-Chair of the Workshop on Volunteer Computing and Desktop Grids (PCGrid). He is co-guest editor of a special issue of the Journal of Grid Computing on desktop grids. His research interests include the characterization, modelling, and simulation of large-scale distributed computing systems, and scheduling mechanisms and algorithms for volatile resources.