Hidden Technical Debt in Machine Learning Systems
Annual Conference on Neural Information Processing Systems, 2015.
EI
Weibo:
Abstract:
Machine learning offers a fantastically powerful toolkit for building useful complex prediction systems quickly. This paper argues it is dangerous to think of these quick wins as coming for free. Using the software engineering framework of technical debt, we find it is common to incur massive ongoing maintenance costs in real-world ML sys...More
Code:
Data:
Introduction
- As the machine learning (ML) community continues to accumulate years of experience with live systems, a wide-spread and uncomfortable trend has emerged: developing and deploying ML systems is relatively fast and cheap, but maintaining them over time is difficult and expensive
- This dichotomy can be understood through the lens of technical debt, a metaphor introduced by Ward Cunningham in 1992 to help reason about the long term costs incurred by moving quickly in software engineering.
Highlights
- As the machine learning (ML) community continues to accumulate years of experience with live systems, a wide-spread and uncomfortable trend has emerged: developing and deploying ML systems is relatively fast and cheap, but maintaining them over time is difficult and expensive. This dichotomy can be understood through the lens of technical debt, a metaphor introduced by Ward Cunningham in 1992 to help reason about the long term costs incurred by moving quickly in software engineering
- We argue that ML systems have a special capacity for incurring technical debt, because they have all of the maintenance problems of traditional code plus an additional set of ML-specific issues
- We have found that data dependencies in ML systems carry a similar capacity for building debt, but may be more difficult to detect
- Technical debt is a useful metaphor, but it does not provide a strict metric that can be tracked over time
- We examine several ways that the resulting erosion of boundaries may significantly increase technical debt in ML systems
- Perhaps the most important insight to be gained is that technical debt is an issue that engineers and researchers both need to be aware of
Results
- The authors examine several ways that the resulting erosion of boundaries may significantly increase technical debt in ML systems.
Conclusion
- Measuring Debt and Paying it Off
Technical debt is a useful metaphor, but it does not provide a strict metric that can be tracked over time. - The authors hope that this paper may serve to encourage additional development in the areas of maintainable ML, including better abstractions, testing methodologies, and design patterns.
- Perhaps the most important insight to be gained is that technical debt is an issue that engineers and researchers both need to be aware of.
- Research solutions that provide a tiny accuracy benefit at the cost of massive increases in system complexity are rarely wise practice.
- Even the addition of one or two seemingly innocuous data dependencies can slow further progress
Summary
Introduction:
As the machine learning (ML) community continues to accumulate years of experience with live systems, a wide-spread and uncomfortable trend has emerged: developing and deploying ML systems is relatively fast and cheap, but maintaining them over time is difficult and expensive- This dichotomy can be understood through the lens of technical debt, a metaphor introduced by Ward Cunningham in 1992 to help reason about the long term costs incurred by moving quickly in software engineering.
Results:
The authors examine several ways that the resulting erosion of boundaries may significantly increase technical debt in ML systems.Conclusion:
Measuring Debt and Paying it Off
Technical debt is a useful metaphor, but it does not provide a strict metric that can be tracked over time.- The authors hope that this paper may serve to encourage additional development in the areas of maintainable ML, including better abstractions, testing methodologies, and design patterns.
- Perhaps the most important insight to be gained is that technical debt is an issue that engineers and researchers both need to be aware of.
- Research solutions that provide a tiny accuracy benefit at the cost of massive increases in system complexity are rarely wise practice.
- Even the addition of one or two seemingly innocuous data dependencies can slow further progress
Funding
- We examine several ways that the resulting erosion of boundaries may significantly increase technical debt in ML systems
Reference
- R. Ananthanarayanan, V. Basker, S. Das, A. Gupta, H. Jiang, T. Qiu, A. Reznichenko, D. Ryabkov, M. Singh, and S. Venkataraman. Photon: Fault-tolerant and scalable joining of continuous data streams. In SIGMOD ’13: Proceedings of the 2013 international conference on Management of data, pages 577– 588, New York, NY, USA, 2013.
- A. Anonymous. Machine learning: The high-interest credit card of technical debt. SE4ML: Software Engineering for Machine Learning (NIPS 2014 Workshop).
- L. Bottou, J. Peters, J. Quinonero Candela, D. X. Charles, D. M. Chickering, E. Portugaly, D. Ray, P. Simard, and E. Snelson. Counterfactual reasoning and learning systems: The example of computational advertising. Journal of Machine Learning Research, 14(Nov), 2013.
- W. J. Brown, H. W. McCormick, T. J. Mowbray, and R. C. Malveau. Antipatterns: refactoring software, architectures, and projects in crisis. 1998.
- T. M. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman. Project adam: Building an efficient and scalable deep learning training system. In 11th USENIX Symposium on Operating Systems Design and Implementation, OSDI ’14, Broomfield, CO, USA, October 6-8, 2014., pages 571–582, 2014.
- B. Dalessandro, D. Chen, T. Raeder, C. Perlich, M. Han Williams, and F. Provost. Scalable handsfree transfer learning for online advertising. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1573–1582. ACM, 2014.
- M. Fowler. Code smells. http://http://martinfowler.com/bliki/CodeSmell.html.
- M. Fowler. Refactoring: improving the design of existing code. Pearson Education India, 1999.
- J. Langford and T. Zhang. The epoch-greedy algorithm for multi-armed bandits with side information. In Advances in neural information processing systems, pages 817–824, 2008.
- M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B. Su. Scaling distributed machine learning with the parameter server. In 11th USENIX Symposium on Operating Systems Design and Implementation, OSDI ’14, Broomfield, CO, USA, October 6-8, 2014., pages 583–598, 2014.
- J. Lin and D. Ryaboy. Scaling big data mining infrastructure: the twitter experience. ACM SIGKDD Explorations Newsletter, 14(2):6–19, 2013.
- H. B. McMahan, G. Holt, D. Sculley, M. Young, D. Ebner, J. Grady, L. Nie, T. Phillips, E. Davydov, D. Golovin, S. Chikkerur, D. Liu, M. Wattenberg, A. M. Hrafnkelsson, T. Boulos, and J. Kubica. Ad click prediction: a view from the trenches. In The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, Chicago, IL, USA, August 11-14, 2013, 2013.
- J. D. Morgenthaler, M. Gridnev, R. Sauciuc, and S. Bhansali. Searching for build debt: Experiences managing technical debt at google. In Proceedings of the Third International Workshop on Managing Technical Debt, 2012.
- D. Sculley, M. E. Otey, M. Pohl, B. Spitznagel, J. Hainsworth, and Y. Zhou. Detecting adversarial advertisements in the wild. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, August 21-24, 2011, 2011.
- Securities and E. Commission. SEC Charges Knight Capital With Violations of Market Access Rule, 2013.
- A. Spector, P. Norvig, and S. Petrov. Google’s hybrid approach to research. Communications of the ACM, 55 Issue 7, 2012.
- A. Zheng. The challenges of building machine learning tools for the masses. SE4ML: Software Engineering for Machine Learning (NIPS 2014 Workshop).
Full Text
Tags
Comments