Enhancing reproducibility for computational methods

Science, Volume 354, Issue 6317, 2016, Pages 1240-1241.


Abstract:

Over the past two decades, computational methods have radically changed the ability of researchers from all areas of scholarship to process and analyze data and to simulate complex systems. But with these advances come challenges that are contributing to broader concerns over irreproducibility in the scholarly literature, among them the lack of transparency in disclosure of computational methods.

Introduction
  • By Victoria Stodden, Marcia McNutt, David H. Bailey, et al.
  • Access to the data and code that underlie discoveries can enable downstream scientific contributions, such as meta-analyses, reuse, and other efforts that include results from multiple studies.
  • RECOMMENDATIONS Share data, software, workflows, and details of the computational environment that generate published findings in open trusted repositories.
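The recommendation to share details of the computational environment can be made concrete with a short script. The sketch below, in Python, records an environment snapshot for deposit alongside data and code; the field names in the snapshot are illustrative choices, not a published standard.

```python
import json
import platform
import sys

def capture_environment():
    """Record details of the computational environment that produced
    a set of findings, for deposit alongside the data and code."""
    return {
        "python_version": sys.version,
        "platform": platform.platform(),
        "machine": platform.machine(),
    }

if __name__ == "__main__":
    # Write the snapshot next to the results so a repository deposit
    # carries the environment description with it.
    print(json.dumps(capture_environment(), indent=2))
```

In practice one would also record the versions of every library used; the point here is only that the snapshot is generated mechanically, at the moment the findings are produced, rather than reconstructed later from memory.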
Highlights
  • With these advances come challenges that are contributing to broader concerns over irreproducibility in the scholarly literature, among them the lack of transparency in disclosure of computational methods
  • We present a novel set of Reproducibility Enhancement Principles (REP) targeting disclosure challenges involving computation
  • These recommendations, which build upon more general proposals from the Transparency and Openness Promotion (TOP) guidelines [1] and recommendations for field data [2], emerged from workshop discussions among funding agencies, publishers and journal editors, industry participants, and researchers representing a broad range of domains
  • As author-generated code and workflows fall under copyright, and data may as well, we recommend using the Reproducible Research Standard (RRS) to maximize utility to the community and to enable verification of findings [11]
  • Journals should conduct a reproducibility check as part of the publication process and should enact the Transparency and Openness Promotion standards at level 2 or 3. Such a check asks whether the data, code, and computational steps upon which findings depend are available in an open trusted repository in a discoverable and persistent way, with links provided in the publication
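The reproducibility check described above could be partly automated. The Python sketch below shows one possible shape for such a check; the manifest schema (fields "data", "code", "workflow", each carrying a persistent identifier) is a hypothetical illustration, not a standard any journal has adopted.

```python
# Artifacts a deposit must disclose before the findings can be
# independently regenerated (per the recommendations above).
REQUIRED_ARTIFACTS = ("data", "code", "workflow")

def check_deposit(manifest):
    """Return a list of problems with a deposit manifest; an empty
    list means the minimal disclosure requirements are met."""
    problems = []
    for artifact in REQUIRED_ARTIFACTS:
        entry = manifest.get(artifact)
        if entry is None:
            problems.append(f"missing {artifact}")
            continue
        identifier = entry.get("identifier", "")
        # A DOI gives a permanent, discoverable link; a plain URL does not.
        if not identifier.startswith("doi:"):
            problems.append(f"{artifact} has no persistent identifier (DOI)")
    return problems

example = {
    "data": {"identifier": "doi:10.0000/example-data"},   # hypothetical DOI
    "code": {"identifier": "doi:10.0000/example-code"},   # hypothetical DOI
    "workflow": {"identifier": "https://example.org/wf"}, # not persistent
}
print(check_deposit(example))
```

Run on the example manifest, the check flags only the workflow entry, because its link is a plain URL rather than a permanent identifier.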
Results
  • The minimal components that enable independent regeneration of computational results are the data, the computational steps that produced the findings, and the workflow describing how to generate the results using the data and code, including parameter settings, random number seeds, make files, or function invocation sequences [8, 9].
  • Persistent links should appear in the published article and include a permanent identifier for data, code, and digital artifacts upon which the results depend.
  • The authors recommend digital object identifiers (DOIs) so that it is possible to discover related data sets and code through the DOI structure itself, for example, using a hierarchical schema.
  • Code and workflows, including software written by the authors, should be cited in the references section [10].
  • As author-generated code and workflows fall under copyright, and data may as well, the authors recommend using the Reproducible Research Standard (RRS) to maximize utility to the community and to enable verification of findings [11].
  • The RRS and principles of open licensing should be clearly explained to authors by journals, to ensure long-term open access to digital scholarly artifacts.
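The minimal components listed above (data, computational steps, and a workflow with parameter settings and random number seeds) can be packaged in a few lines. The Python sketch below pins a seed, keeps every parameter in one place, and emits a run record an independent party could replay; the analysis itself and the "analysis.py" entry point are toy placeholders.

```python
import json
import random

# All parameters, including the random seed, live in one place.
PARAMS = {"seed": 20161209, "n_samples": 1000, "threshold": 0.5}

def run_analysis(params):
    """A toy computation: with the seed pinned, the finding is the
    same on every run, on any machine with the same interpreter."""
    random.seed(params["seed"])
    draws = [random.random() for _ in range(params["n_samples"])]
    return sum(d > params["threshold"] for d in draws) / params["n_samples"]

def run_record(params, finding):
    # Everything needed to regenerate the result, in one artifact:
    # parameters (with seed), the finding, and the invocation sequence.
    return {
        "parameters": params,
        "finding": finding,
        "invocation": "python analysis.py",  # hypothetical entry point
    }

first = run_analysis(PARAMS)
second = run_analysis(PARAMS)
assert first == second  # same seed and parameters give the same finding
print(json.dumps(run_record(PARAMS, first), indent=2))
```

Depositing the run record alongside the code and data is one way to satisfy the "workflow describing how to generate the results" component without any special tooling.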
Conclusion
  • Could the published computational findings be reproduced on an independent system by using the data and code provided?
  • Topics might include methods for verifying queries on confidential data; extending validation, verification, and uncertainty quantification to encompass reproducibility; numerical reproducibility and sensitivity to small variations in computation [14]; testing standards for code, including closed or proprietary codes; cyberinfrastructure that supports reproducibility, as well as innovative computational work; pilot efforts to create “instruction manuals” for manuscript submission; policy research on intellectual property law and software patenting; costs and benefits to reproducibility in different settings, for example, in industry collaboration; provenance and workflow repositories; and exploring how to make investments regarding the preservation of various digital artifacts.
  • The authors believe that as these efforts become commonplace, practices and tools will continue to emerge that reduce the amount of time and resource investment necessary to facilitate reproducibility and support increasingly ambitious computational research.
Funding
  • These recommendations emerged from a workshop held at the American Association for the Advancement of Science (AAAS), Washington, DC, 16 and 17 February 2016, funded by the Laura and John Arnold Foundation (http://bit.ly/AAAS2016Arnold).
References
  1. B. A. Nosek et al., Science 348, 1422 (2015).
  2. M. McNutt et al., Science 351, 1024 (2016).
  3. A. A. Alsheikh-Ali et al., PLOS ONE 6, e24357 (2011).
  4. D. Donoho et al., Comput. Sci. Eng. 11, 8 (2009).
  5. V. Stodden, IMS Bull. Online, 17 November (2013); http://bit.ly/BullIMStat2013.
  6. D. H. Bailey, J. M. Borwein, V. Stodden, Notices Amer. Math. Soc. 60 (6), 679 (2013).
  7. D. Garijo et al., PLOS ONE 8, e80278 (2013).
  8. D. Donoho, V. Stodden, in The Princeton Companion to Applied Mathematics, N. J. Higham, Ed. (Princeton Univ. Press, Princeton, NJ, 2016), pp. 916–925.
  9. R. Gentleman, D. Temple Lang, J. Comput. Graph. Stat. 16, 1 (2007).
  10. V. Stodden, S. Miguez, J. Open Res. Softw. 2, e21 (2014).
  11. V. Stodden, Comput. Sci. Eng. 11, 35 (2009).
  12. V. Stodden, P. Guo, Z. Ma, PLOS ONE 8, e67111 (2013).
  13. M. Heroux, ACM Trans. Math. Softw. 41 (3), art. 13 (2015).
  14. D. H. Bailey, J. M. Borwein, V. Stodden, in Reproducibility: Principles, Problems, Practices, H. Atmanspacher and S. Maasen, Eds. (Wiley, New York, 2015), pp. 205–232.
  15. M. Fuentes, AMSTAT News, July 2016; http://bit.ly/JASA2gb.
  16. R. J. LeVeque, SIAM News 46, April 2013.