A BERT-based Universal Model for Both Within- and Cross-sentence Clinical Temporal Relation Extraction

Proceedings of the 2nd Clinical Natural Language Processing Workshop, pp. 65–71, 2019

Cited by: 36 | Views: 174

Abstract

Classic methods for clinical temporal relation extraction focus on relational candidates within a sentence. On the other hand, breakthrough Bidirectional Encoder Representations from Transformers (BERT) are trained on large quantities of arbitrary spans of contiguous text instead of sentences. In this study, we aim to build a sentence-agnostic …

Introduction
Highlights
  • The release of Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018) has substantially advanced the state of the art in several sentence-level, inter-sentence-level, and token-level tasks
  • BERT is trained on very large unlabeled corpora to achieve good generalizability
  • BERT is able to make predictions that go beyond natural sentence boundaries, because it is trained on fragments of contiguous text that typically span multiple sentences
  • BERT can be further pre-trained for specific domains (Lee et al, 2019) or serve as a backbone model to be fine-tuned with one output layer for a wide range of tasks
  • For the task of clinical temporal relation extraction, recent years have seen the rise of neural approaches – structured perceptrons (Leeuwenberg and Moens, 2017), convolutional neural networks (CNNs) (Dligach et al., 2017; Lin et al., 2017), and Long Short-Term Memory (LSTM) networks (Tourille et al., 2017; Dligach et al., 2017; Lin et al., 2018) – where minimally-engineered inputs have been adopted over heavily feature-engineered techniques (Sun et al., 2013)
  • The BERT results we present in Table 1 are derived using a 60-token window
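To make the fine-tuning recipe above concrete, here is a minimal sketch, not the authors' released code: BERT with a single 3-way classification layer, fed an entity-tagged token window. The checkpoint name, the <e>/<t> marker tokens, and the hyperparameters are illustrative assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["NONE", "CONTAINS", "CONTAINED-BY"]

# Hypothetical checkpoint; the paper also experiments with a MIMIC-pretrained BERT.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

# Register the XML-style entity markers so they survive as single tokens.
tokenizer.add_tokens(["<e>", "</e>", "<t>", "</t>"])
model.resize_token_embeddings(len(tokenizer))

# One illustrative window (cf. the Figure 1 example in the Methods section).
text = "The <e> surgery </e> is <e> scheduled </e> for <t> March 11 , 2014 </t> ."
enc = tokenizer(text, truncation=True, max_length=128, return_tensors="pt")
label = torch.tensor([LABELS.index("CONTAINED-BY")])  # surgery CONTAINED-BY the date

# One fine-tuning step: cross-entropy over the 3 relation labels.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
loss = model(**enc, labels=label).loss
loss.backward()
optimizer.step()
```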
Methods
  • 3.1 Task definition

    The authors process the THYME corpus using the segmentation and tokenization modules of Apache cTAKES.
  • The authors consume gold standard event annotations, gold time expressions and their classes (Styler IV et al, 2014) for generating instances of containment relation candidates.
  • Depending on the order of the entities, each instance takes one of three gold standard relational labels: CONTAINS, CONTAINED-BY, or NONE (a candidate-generation sketch follows this list).
  • The first line of Figure 1 is the token sequence for three gold standard entities, of which two are events, “surgery” and “scheduled”, and one is a time expression, “March 11, 2014”, whose time class is “date”.
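As a concrete illustration of the candidate generation described in this list, the sketch below slides a fixed-size token window and pairs the gold entities inside it, assigning a label by entity order. The entity dictionaries, their "id"/"begin"/"end" fields, the gold_contains set, and the stride value are hypothetical stand-ins, not cTAKES or THYME data structures.

```python
from itertools import combinations

def relation_candidates(tokens, entities, gold_contains, window=60, stride=30):
    """Yield (window_tokens, first, second, label) for each ordered pair of
    gold entities falling entirely inside a sliding token window.

    entities: dicts with hypothetical "id", "begin", "end" token offsets.
    gold_contains: set of (container_id, containee_id) gold CONTAINS pairs.
    """
    for start in range(0, max(1, len(tokens) - window + 1), stride):
        end = start + window
        inside = sorted(
            (e for e in entities if start <= e["begin"] and e["end"] <= end),
            key=lambda e: e["begin"])
        for first, second in combinations(inside, 2):
            if (first["id"], second["id"]) in gold_contains:
                label = "CONTAINS"      # earlier entity contains the later one
            elif (second["id"], first["id"]) in gold_contains:
                label = "CONTAINED-BY"  # earlier entity is contained by the later one
            else:
                label = "NONE"
            yield tokens[start:end], first, second, label
```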
Results
  • All models are evaluated with the standard Clinical TempEval evaluation script so that their performance can be compared directly to published results (a simplified scoring sketch follows this list).
  • Table 1 shows performance on the Clinical TempEval colon cancer test set for the previous best systems, Lin et al (2018) and Galvan et al (2018), and window-based universal models.
  • [Flattened Table 1 rows; scores not recovered here: Lin et al. (2018), Galvan et al. (2018), and the numbered window-based BERT variants, including BERT-TS and BERT-MIMIC-TS.]
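For reference when reading the tables, the snippet below shows a plain precision/recall/F1 computation over predicted versus gold CONTAINS tuples. It is a simplification: the official Clinical TempEval scorer also handles details such as temporal closure, which are omitted here.

```python
def precision_recall_f1(gold: set, pred: set):
    """P/R/F1 over (source, target, "CONTAINS") tuples."""
    tp = len(gold & pred)                      # relations found in both sets
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```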
Conclusion
  • The window-based BERT-fine-tuned model, even with the XML tags (Table 1(2)), works for both within- and cross-sentence relations. Its perfor…
  • Example sentences: #1: Today Mr. A states that he feels well. #2: The colonoscopy revealed a low rectal mass that was noncircumferential. It was fungating, infiltrative, ulcerated, and about 4 cm in diameter. It involved …
Tables
  • Table 1: Model performance of the CONTAINS relation on the colon cancer test set. T: using non-XML tags; S: adding high-confidence positive silver instances
  • Table 2: Model performance of the CONTAINS relation on the brain cancer test set
  • Table 3: Within- vs. cross-sentence results on the colon cancer development set
Funding
  • The study was funded by R01LM10090, R01GM114355 and U24CA184407
References
  • Alan Akbik, Duncan Blythe, and Roland Vollgraf. 2018. Contextual string embeddings for sequence labeling. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1638–1649.
  • Steven Bethard, Leon Derczynski, Guergana Savova, James Pustejovsky, and Marc Verhagen. 2015. SemEval-2015 task 6: Clinical TempEval. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 806–814.
  • Steven Bethard, Guergana Savova, Wei-Te Chen, Leon Derczynski, James Pustejovsky, and Marc Verhagen. 2016. SemEval-2016 task 12: Clinical TempEval. In Proceedings of SemEval, pages 1052–1062.
  • Steven Bethard, Guergana Savova, Martha Palmer, James Pustejovsky, and Marc Verhagen. 2017. SemEval-2017 task 12: Clinical TempEval. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 563–570.
  • Amar K Das and Mark A Musen. 1994. A comparison of the temporal expressiveness of three database query methods. In Proceedings of the Annual Symposium on Computer Application in Medical Care, page 331. American Medical Informatics Association.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • Dmitriy Dligach, Timothy Miller, Chen Lin, Steven Bethard, and Guergana Savova. 2017. Neural temporal relation extraction. In EACL 2017, page 746.
  • Diana Galvan, Naoaki Okazaki, Koji Matsuda, and Kentaro Inui. 2018. Investigating the challenges of temporal relation extraction from clinical text. In Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, pages 55–64.
  • Alistair EW Johnson, Tom J Pollard, Lu Shen, Li-wei H Lehman, Mengling Feng, Mohammad Ghassemi, Benjamin Moody, Peter Szolovits, Leo Anthony Celi, and Roger G Mark. 2016. MIMIC-III, a freely accessible critical care database. Scientific Data, 3.
  • Michael G Kahn, Larry M Fagan, and Samson Tu. 1990. Extensions to the time-oriented database model to support temporal reasoning in medical expert systems. Methods of Information in Medicine, 30(1):4–14.
  • Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2019. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. arXiv preprint arXiv:1901.08746.
  • Tuur Leeuwenberg and Marie-Francine Moens. 2017. Structured learning for temporal relation extraction from clinical records. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.
  • Chen Lin, Elizabeth W Karlson, Dmitriy Dligach, Monica P Ramirez, Timothy A Miller, Huan Mo, Natalie S Braggs, Andrew Cagan, Vivian Gainer, Joshua C Denny, and Guergana K Savova. 2014. Automatic identification of methotrexate-induced liver toxicity in patients with rheumatoid arthritis from the electronic medical record. Journal of the American Medical Informatics Association.
  • Chen Lin, Timothy Miller, Dmitriy Dligach, Steven Bethard, and Guergana Savova. 2017. Representations of time expressions for temporal relation extraction with convolutional neural networks. In BioNLP 2017, pages 322–327.
  • Chen Lin, Timothy Miller, Dmitriy Dligach, Hadi Amiri, Steven Bethard, and Guergana Savova. 2018. Self-training improves recurrent neural networks performance for temporal relation extraction. In Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, pages 165–176.
  • Bryan McCann, James Bradbury, Caiming Xiong, and Richard Socher. 2017. Learned in translation: Contextualized word vectors. In Advances in Neural Information Processing Systems, pages 6294–6305.
  • Matthew E Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
  • James Pustejovsky, Jose M Castano, Robert Ingria, Roser Sauri, Robert J Gaizauskas, Andrea Setzer, Graham Katz, and Dragomir R Radev. 2003. TimeML: Robust specification of event and temporal expressions in text. New Directions in Question Answering, 3:28–34.
  • James Pustejovsky and Amber Stubbs. 2011. Increasing informativeness in temporal annotation. In Proceedings of the 5th Linguistic Annotation Workshop, pages 152–160. Association for Computational Linguistics.
  • Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
  • Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
  • Reinhold Schmidt, Stefan Ropele, Christian Enzinger, Katja Petrovic, Stephen Smith, Helena Schmidt, Paul M Matthews, and Franz Fazekas. 2005. White matter lesion progression, brain atrophy, and cognitive decline: the Austrian Stroke Prevention Study. Annals of Neurology, 58(4):610–616.
  • William F Styler IV, Steven Bethard, Sean Finan, Martha Palmer, Sameer Pradhan, Piet C de Groen, Brad Erickson, Timothy Miller, Chen Lin, Guergana Savova, et al. 2014. Temporal annotation in the clinical domain. Transactions of the Association for Computational Linguistics, 2:143–154.
  • Weiyi Sun, Anna Rumshisky, and Ozlem Uzuner. 2013. Evaluating temporal relations in clinical text: 2012 i2b2 challenge. Journal of the American Medical Informatics Association, 20(5):806–813.
  • Julien Tourille, Olivier Ferret, Aurelie Neveol, and Xavier Tannier. 2017. Neural architecture for temporal relation extraction: A Bi-LSTM approach for detecting narrative containers. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 224–230.
  • Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, and Saloni Potdar. 2019. Extracting multiple-relations in one-pass with pre-trained transformers. arXiv preprint arXiv:1902.01030.
  • Li Zhou and George Hripcsak. 2007. Temporal reasoning with medical data: a review with emphasis on medical natural language processing. Journal of Biomedical Informatics, 40(2):183–202.
Authors
Chen Lin
Timothy Miller
Dmitriy Dligach
Guergana Savova