Some experiences and opportunities for big data in translational research.

GENETICS IN MEDICINE, no. 10 (2013): 802-809


Abstract

Health care has become increasingly information intensive. The advent of genomic data, integrated into patient care, significantly accelerates the complexity and amount of clinical data. Translational research in the present day increasingly embraces new biomedical discovery in this data-intensive world, thus entering the domain of "big d...

Introduction
  • Although the notion of big data currently captures the imagination, it defies simple characterization.
  • Big data sources include genetic/genomic data collected during the course of patient care, public health reporting, and research.
Highlights
  • Although the notion of big data currently captures our imagination, it defies simple characterization
  • A recent textbook on the topic asserts that “big data refers to things one can do at a larger scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationship between citizens and governments, and more.”[1] More disturbingly, that same text goes on to challenge conventional notions of inference and scientific proof, suggesting that “society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what.”
  • We provide some context for big data, outline some relevant experiences from our Electronic Medical Records and Genomics (eMERGE) consortium,[2,3] address some general aspects of translational data such as those used within eMERGE, highlight a promising development in computer science (Hadoop),[4] and propose a general strategy around comparable and consistent data to mitigate heterogeneity and misclassification
  • Big data sources include genetic/genomic data collected during the course of patient care, public health reporting, and research
  • The juxtaposition of health care and computing can only result in their joint application, which promises to dramatically enhance the effectiveness and efficiency of health care
  • There is convergence in both the genomic and the clinical phenotyping worlds, driven by the application of genomics to clinical practice and by Meaningful Use. Together, these developments stand to deliver higher-quality, lower-cost health care in an information-driven industry
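
The highlights name Hadoop as a promising development; its core programming model is MapReduce. The sketch below illustrates that map, shuffle, and reduce pattern in plain Python on a single machine. It is not Hadoop code and not the authors' method; the records, the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`), and the variant strings are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical input records: (patient_id, gene, variant).
# These are made-up illustration values, not data from the paper.
RECORDS = [
    ("p1", "GENE_A", "c.100A>G"),
    ("p2", "GENE_A", "c.100A>G"),
    ("p3", "GENE_B", "c.200C>T"),
    ("p1", "GENE_B", "c.200C>T"),
    ("p4", "GENE_A", "c.300G>A"),
]


def map_phase(record):
    """Emit one ((gene, variant), 1) pair per input record."""
    _patient_id, gene, variant = record
    yield (gene, variant), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    for (gene, variant), count in sorted(run_job(RECORDS).items()):
        print(gene, variant, count)
```

In a real cluster the map and reduce functions run in parallel on separate nodes and the framework performs the shuffle; keeping the two functions free of shared state is what makes that distribution possible.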
Results
  • ... reporting provide a general format for narrative molecular reports;[21,22,23] reporting checklists (e.g., College of American Pathologists checklists) provide a more robust standard for synoptic reporting.[24] The HL7 Clinical Genomics Workgroup has extended established laboratory reporting standards, commonly used for other clinical laboratory tests, to include structured genetic findings, references, and interpretations, as well as the narrative report.[25,26] In the very early stages of adoption, as reporting systems advance more quickly than receiving systems, a common first step is inclusion of these vocabulary guidelines in molecular reports.
  • There remain several important gaps within clinical genomics standards for health care, including (i) coding of important cancer biomarkers and association with causal DNA variants and (ii) standards for transmission of genomics data within the health-care environment.[15] As the field moves to using high-throughput NGS to perform genetic-based clinical tests, robust concept mapping from observed DNA sequence variation to reported biomarkers is critical.
  • Standardization of genomic data representation was identified as a gap by the Centers for Disease Control and Prevention-sponsored Next Generation Sequencing: Standardization of Clinical Testing Working Group.[32] In follow-up, a federally mediated workgroup including representatives from the Centers for Disease Control and Prevention, the National Center for Biotechnology Information (NCBI), the National Institute of Standards and Technology, and the Food and Drug Administration, as well as experts from the clinical laboratory, bioinformatics, and health-care IT standards community, has been formed to develop a clinical grade VCF/gVCF file format in preparation for broad adoption of NGS.
  • Specifications for this vocabulary are currently detailed in the HL7 Version 2 Implementation Guide for Genetic Variation.[25] These vocabularies are in the process of being extended for whole-genome or exome sequencing within the context of a broad set of clinical workflow scenarios, under the “Clinical Sequencing” project.
  • The implementation guide describes how to construct a data message for genetic test results using many of the standards previously mentioned (LOINC, Human Gene Nomenclature Committee, Human Genome Variation Society, etc.).
  • The guide is titled HL7 Version 2 Implementation Guide: Clinical Genomics; Fully LOINC-Qualified Genetic Variation Model, Release 2 (US Realm).[25] Example messages in the guide include genetic disease analysis, pharmacogenomic-based drug metabolism, and drug efficacy.
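
To make the idea of a LOINC-qualified data message more concrete, the following is a minimal sketch of how pipe-delimited HL7 Version 2 OBX (observation) segments might be assembled. It is a simplification under stated assumptions, not a conformant implementation of the guide: only OBX-1 through OBX-5 are filled, the gene identifier and variant are placeholders, and the LOINC codes shown (48018-6 for the gene studied, 48004-6 for the DNA change in HGVS notation) are given for illustration and should be checked against the guide and the LOINC database. The `Observation` class and `obx_segment` function are hypothetical names.

```python
from dataclasses import dataclass

FIELD_SEP = "|"      # HL7 v2 field separator
COMPONENT_SEP = "^"  # HL7 v2 component separator


@dataclass(frozen=True)
class Observation:
    loinc_code: str   # LOINC code identifying what was observed
    loinc_name: str   # display name for the code
    value_type: str   # HL7 v2 data type, e.g. "CWE" (coded) or "ST" (string)
    value: str        # observation value, already component-encoded


def obx_segment(set_id: int, obs: Observation) -> str:
    """Build a simplified OBX segment (fields OBX-1 to OBX-5 only)."""
    # OBX-3: code^display name^coding system ("LN" denotes LOINC)
    identifier = COMPONENT_SEP.join([obs.loinc_code, obs.loinc_name, "LN"])
    # OBX-4 (observation sub-ID) is left empty in this sketch.
    fields = ["OBX", str(set_id), obs.value_type, identifier, "", obs.value]
    return FIELD_SEP.join(fields)


if __name__ == "__main__":
    # Placeholder gene and variant; not taken from the guide's examples.
    observations = [
        Observation("48018-6", "Gene studied", "CWE", "HGNC:0000^GENE1^HGNC"),
        Observation("48004-6", "DNA change (c.HGVS)", "ST", "c.123A>G"),
    ]
    for i, obs in enumerate(observations, start=1):
        print(obx_segment(i, obs))
```

A real message would also carry header, patient, and order segments plus the remaining OBX fields such as result status; the point here is only that each finding is qualified by a standard code and a standard nomenclature value.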
Conclusion
  • The safest and most promising application of big data in health care will be driven by clinical and genomic data that are generated with or transformed into standards-based representations, to ensure comparability and consistency.
  • These developments stand to deliver higher-quality, lower-cost health care in an information-driven industry
Funding
  • We are grateful for grant support, in part from the National Human Genome Research Institute as eMERGE consortium members, specifically U01-HG06379 (Mayo Clinic) and U01-HG006389 (Marshfield Clinic).
Study subjects and analysis
academic medical centers: 5
However, the recognition that the rate-limiting step for genotype-to-phenotype associations was resoundingly on the phenotyping side prompted the National Human Genome Research Institute to propose the eMERGE consortium.[2,3] Initially a cooperative agreement among five academic medical centers, the eMERGE consortium balanced biobanking and genotyping with the development of a scalable capacity to execute “high-throughput phenotyping” of patient cohorts using electronic medical records.[7] To ensure reproducibility and portability of these algorithms across medical centers, the consortium embraced their mapping to health information technology (HIT) standards,[8] conformant with Meaningful Use specifications.[9]
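
As a rough illustration of what a portable, rule-based phenotyping algorithm over electronic medical records can look like, the sketch below classifies patients as case, control, or excluded from coded diagnoses and medications. It is not an eMERGE algorithm: the code sets, the threshold of two diagnosis codes, and the names `PatientRecord` and `classify` are all assumptions made for this example. The rule works across sites only if each site first maps its local codes to the same standard vocabulary, which is the motivation for the HIT standards mapping described above.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical standardized code sets; real algorithms would use
# codes from controlled terminologies agreed on across sites.
CASE_DIAGNOSES: Set[str] = {"DX-001", "DX-002"}
CASE_MEDICATIONS: Set[str] = {"RX-100"}


@dataclass
class PatientRecord:
    patient_id: str
    diagnosis_codes: List[str] = field(default_factory=list)  # one per encounter
    medication_codes: Set[str] = field(default_factory=set)


def classify(record: PatientRecord, min_dx: int = 2) -> str:
    """Return 'case', 'control', or 'excluded' for one patient."""
    # Count encounters carrying a qualifying diagnosis code.
    dx_hits = sum(1 for code in record.diagnosis_codes if code in CASE_DIAGNOSES)
    on_medication = bool(record.medication_codes & CASE_MEDICATIONS)
    if dx_hits >= min_dx and on_medication:
        return "case"
    if dx_hits == 0 and not on_medication:
        return "control"
    # Partial evidence: excluded rather than risk misclassification.
    return "excluded"


if __name__ == "__main__":
    cohort = [
        PatientRecord("a", ["DX-001", "DX-002"], {"RX-100"}),
        PatientRecord("b", [], set()),
        PatientRecord("c", ["DX-001"], set()),
    ]
    for patient in cohort:
        print(patient.patient_id, classify(patient))
```

Excluding ambiguous patients instead of forcing them into a group is one simple way to limit misclassification, at the cost of a smaller cohort.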

References
  • 1. Mayer-Schönberger V, Cukier K. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Houghton Mifflin Harcourt: Boston, MA, 2013:242.
  • 2. The eMERGE Network. 2010. https://www.mc.vanderbilt.edu/victr/dcc/projects/acc/index.php/Main_Page. Accessed 20 January 2010.
  • 3. McCarty CA, Chisholm RL, Chute CG, et al.; eMERGE Team. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics 2011;4:13.
  • 4. Apache Foundation. Hadoop. http://hadoop.apache.org/. Accessed 9 April 2013.
  • 5. Sloan Digital Sky Survey. The Scope of the Ninth SDSS Data Release (DR9). http://www.sdss3.org/dr9/scope.php. Accessed 2 April 2013.
  • 6. Geiger H, Marsden E. On a diffuse reflection of the α-particles. Proc Royal Soc Series A 1909;82:495–500.
  • 7. Kullo IJ, Fan J, Pathak J, Savova GK, Ali Z, Chute CG. Leveraging informatics for genetic studies: use of the electronic medical record to enable a genomewide association study of peripheral arterial disease. J Am Med Inform Assoc 2010;17:568–574.
  • 8. Pathak J, Wang J, Kashyap S, et al. Mapping clinical phenotype data elements to standardized metadata repositories and controlled terminologies: the eMERGE Network experience. J Am Med Inform Assoc 2011;18:376–386.
  • 9. Blumenthal D, Tavenner M. The “meaningful use” regulation for electronic health records. N Engl J Med 2010;363:501–504.
  • 10. Kho AN, Pacheco JA, Peissig PL, et al. Electronic medical records for genetic research: results of the eMERGE consortium. Sci Transl Med 2011;3:79re1.
  • 11. Zuvich RL, Armstrong LL, Bielinski SJ, et al. Pitfalls of merging GWAS data: lessons learned in the eMERGE network and quality control procedures to maintain high data quality. Genet Epidemiol 2011;35:887–898.
  • 12. National Center for Biotechnology Information. Database of Genotypes and Phenotypes (dbGaP). 2013. http://www.ncbi.nlm.nih.gov/gap.
  • 13. Manolio TA, Chisholm RL, Ozenberger B, et al. Implementing genomic medicine in the clinic: the future is here. Genet Med 2013;15:258–267.
  • 14. NIH Pharmacogenomics Research Network. http://www.nigms.nih.gov/Research/FeaturedPrograms/PGRN. Accessed April 2012.
  • 15. Chute CG, Kohane IS. Genomic medicine, health information technology, and patient care. JAMA 2013;309:1467–1468.
  • 16. Starren J, Williams MS, Bottinger EP. Crossing the omic chasm: a time for omic ancillary systems. JAMA 2013;309:1237–1238.
  • 17. Shah NH. Translational bioinformatics embraces big data. Yearb Med Inform 2012;7:130–134.
  • 18. Computing Community Consortium. Challenges and Opportunities with Big Data, 2012. http://www.cra.org/ccc/files/docs/init/bigdatawhitepaper.pdf.
  • 19. NIH National Heart Lung and Blood Institute. PFINDR: Phenotype Finder IN Data Resources: A tool to support cross-study data discovery among NHLBI genomic studies RFA-HL-11–020 2011. http://grants.nih.gov/grants/guide/rfa-files/RFAHL-11-020.html. Accessed 9 April 2013.
  • 20. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA 2013;309:1351–1352.
  • 21. Gulley ML, Braziel RM, Halling KC, et al.; Molecular Pathology Resource Committee, College of American Pathologists. Clinical laboratory reports in molecular pathology. Arch Pathol Lab Med 2007;131:852–863.
  • 22. Lubin IM, Caggana M, Constantin C, et al. Ordering molecular genetic tests and reporting results: practices in laboratory and clinical settings. J Mol Diagn 2008;10:459–468.
  • 23. Scheuner MT, Hilborne L, Brown J, Lubin IM; members of the RAND Molecular Genetic Test Report Advisory Board. A report template for molecular genetic tests designed to improve communication between the clinician and laboratory. Genet Test Mol Biomarkers 2012;16:761–769.
  • 24. Baskovich BW, Allan RW. Web-based synoptic reporting for cancer checklists. J Pathol Inform 2011;2:16.
  • 25. HL7 Version 2 Implementation Guide: Clinical Genomics; Fully LOINC-Qualified Genetic Variation Model, Release 2, 2013, Health Level Seven International: Ann Arbor, MI.
  • 26. HL7 Implementation Guide for CDA® Release 2: Genetic Testing Report (GTR), DSTU Release 1, 2013, Health Level Seven International: Ann Arbor, MI.
  • 27. Bradley CA, Rolka H, Walker D, Loonsk J. BioSense: implementation of a National Early Event Detection and Situational Awareness System. MMWR Morb Mortal Wkly Rep 2005;54(suppl):11–19.
  • 28. Gichoya J, Gamache RE, Vreeman DJ, Dixon BE, Finnell JT, Grannis S. An evaluation of the rates of repeat notifiable disease reporting and patient crossover using a health information exchange-based automated electronic laboratory reporting system. AMIA Annu Symp Proc 2012;2012:1229–1236.
  • 29. Rea S, Pathak J, Savova G, et al. Building a robust, scalable and standards-driven infrastructure for secondary use of EHR data: the SHARPn project. J Biomed Inform 2012;45:763–771.
  • 30. North American Association of Central Cancer Registries Inc., Standards for Cancer Registries Volume V: Pathology Laboratory Electronic Reporting, 2011:310. http://www.naaccr.org/LinkClick.aspx?fileticket=Po1eQNqGQF8%3D&tabid=136&mid=476.
  • 31. Houser SH, Colquitt S, Clements K, Hart-Hester S. The impact of electronic health record usage on cancer registry systems in Alabama. Perspect Health Inf Manag 2012;9:1f.
  • 32. Next Generation Sequencing: Standardization of Clinical Testing (Nex-StoCT) Working Groups. http://www.cdc.gov/osels/lspppo/Genetic_Testing_Quality_Practices/Nex-StoCT.html. Accessed 13 April 2013.
  • 33. Pandey A. A piece of my mind. Preparing for the 21st-century patient. JAMA 2013;309:1471–1472.
  • 34. Langmead B, Hansen KD, Leek JT. Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol 2010;11:R83.
  • 35. Abecasis GR, Altshuler D, Auton A, et al. A map of human genome variation from population-scale sequencing. Nature 2010;467(7319):1061–1073.
  • 36. Clarke L, Zheng-Bradley X, Smith R, et al.; 1000 Genomes Project Consortium. The 1000 Genomes Project: data management and community access. Nat Methods 2012;9:459–462.
  • 37. Rehm HL. Disease-targeted sequencing: a cornerstone in the clinic. Nat Rev Genet 2013;14:295–300.
  • 38. Jourdren L, Bernard M, Dillies MA, Le Crom S. Eoulsan: a cloud computing-based framework facilitating high throughput sequencing analyses. Bioinformatics 2012;28:1542–1543.
  • 39. Stonebraker M. The case for shared nothing. Database Eng. 1986;9:4–9.
  • 40. Dean J, Ghemawat S. MapReduce: Simplified Data Processing on Large Clusters, in OSDI’04: Sixth Symposium on Operating System Design and Implementation, 2004. San Francisco, CA, USA.
  • 41. Ghemawat S, Gobioff H, Leung S-T. The Google File System, in 19th Symposium on Operating Systems Principles: SOSP’03, 2003. Lake George, Bolton Landing, NY, USA.
  • 42. Apache Foundation. Mahout. http://mahout.apache.org/. Accessed 9 Apr 2013.
  • 43. Schatz MC. CloudBurst: highly sensitive read mapping with MapReduce. Bioinformatics 2009;25:1363–1369.
  • 44. Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL. Searching for SNPs with cloud computing. Genome Biol 2009;10:R134.
  • 45. Schatz M, Chambers J, Gupta A, et al. Contrail: Assembly of Large Genomes using Cloud Computing. http://sourceforge.net/apps/mediawiki/contrail-bio/index.php?title=Contrail.
  • 46. CRS4 - Center for Advanced Studies Research and Development: Sardinia. Seal. http://biodoop-seal.sourceforge.net/index.html. Accessed 9 April 2013.
  • 47. Niemenmaa M, Kallio A, Schumacher A, Klemelä P, Korpelainen E, Heljanko K. Hadoop-BAM: directly manipulating next generation sequencing data in the cloud. Bioinformatics 2012;28:876–877.
  • 48. SeqPig. http://sourceforge.net/projects/seqpig/. Accessed 9 April 2013.
  • 49. Apache Foundation. Pig. http://pig.apache.org/. Accessed 19 June 2013.
  • 50. NCBI Reference Sequence Database (RefSeq). http://www.ncbi.nlm.nih.gov/refseq/.
  • 51. Nomenclature for the description of sequence variants. http://www.hgvs.org/mutnomen/.
  • 52. Database of Single Nucleotide Polymorphisms (dbSNP). http://www.ncbi.nlm.nih.gov/projects/SNP/.
  • 53. Catalogue of Somatic Mutations in Cancer (COSMIC). http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/.
  • 54. CG Working Group Meeting Minutes. http://wiki.hl7.org/index.php?title=CG_Working_Group_Meeting_Minutes.
  • 55. MedGen: Human Medical Genetics database. http://www.ncbi.nlm.nih.gov/medgen.
  • 56. dbVAR: database of genomic structural variation. http://www.ncbi.nlm.nih.gov/dbvar/.
  • 57. ClinVar: Clinical Variant database. http://www.ncbi.nlm.nih.gov/clinvar/.
  • 58. NCBI’s Genetic Test Repository (GTR). http://www.ncbi.nlm.nih.gov/gtr/.
  • 59. RxNORM: Normalized names for clinical drugs. http://www.nlm.nih.gov/research/umls/rxnorm/.
  • 60. SNOMED Clinical Terms® (SNOMED CT®). http://www.nlm.nih.gov/research/
  • 61. Online Mendelian Inheritance in Man (OMIM). http://www.ncbi.nlm.nih.gov/omim.
  • 62. PubMed. http://www.ncbi.nlm.nih.gov/pubmed.
  • 63. The Pharmacogenomics Knowledge Base (PharmGKB). http://www.pharmgkb.org/.
  • 64. ClinicalTrials.gov. http://clinicaltrials.gov/.
  • 65. HL7 Version 2 Implementation Guide: Clinical Genomics; Fully LOINC-Qualified Cytogenetics Model, Release 1, Health Level Seven International: Ann Arbor, MI.
  • 66. Logical Observation Identifiers Names and Codes (LOINC®). http://loinc.org/. Accessed 15 April 2013.
  • 67. Halevy A, Norvig P, Pereira F. The unreasonable effectiveness of data. IEEE Intelligent Syst 2009;24(2):8–12.
  • 68. Manolio TA, Collins FS, Cox NJ, et al. Finding the missing heritability of complex diseases. Nature 2009;461:747–753.