
Distributed representation and one-hot representation fusion with gated network for clinical semantic textual similarity.

BMC Medical Informatics and Decision Making 2020, 20(Suppl 1):72.


Abstract

Semantic textual similarity (STS) is a fundamental natural language processing (NLP) task that can be widely used in many NLP applications such as Question Answering (QA), Information Retrieval (IR), etc. It is a typical regression problem, and almost all STS systems either use distributed representation or one-hot representation to model s…

Introduction
  • Semantic textual similarity (STS) is a fundamental natural language processing (NLP) task that can be widely used in many NLP applications such as Question Answering (QA) and Information Retrieval (IR).
  • To aggregate data from diverse sources and minimize data redundancy, BioCreative/OHNLP organized its first shared task on evaluating the semantic similarity between text snippets of clinical texts in 2018.
Highlights
  • Distributed representation In this study, we investigated three types of distributed representations: Siamese Convolutional Neural Network (CNN) [32], Siamese RNN [21] and Bidirectional Encoder Representations from Transformers (BERT) [33], where Siamese CNN and Siamese RNN are two popular neural networks used to represent sentence pairs, while BERT is a recently proposed language representation method
  • Fusion gate Inspired by the gated network mechanism in RNN variants such as LSTM and GRU (Gated Recurrent Unit), we introduced a gate to leverage distributed representation and one-hot representation (a minimal sketch follows this list)
  • In this study, we investigated three state-of-the-art distributed representation methods, that is, CNN, Bi-LSTM, and BERT, and proposed a novel framework based on a gated network to fuse distributed representation and one-hot representation of sentence pairs (see Table 2 for system performance)
  • In this paper, we proposed a novel framework to fuse distributed representation and one-hot representation using a gated network for clinical STS
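
The fusion gate described in the highlights can be sketched as follows in NumPy. This is a minimal illustration only, not the authors' released implementation: the vector sizes, the tanh projection of the sparse features, and the random placeholder weights are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(d, o, W_o, b_o, W_g, b_g):
    """Fuse a distributed representation d with a sparse feature vector o
    through an LSTM/GRU-style gate.

    d: (k,)  distributed representation of the sentence pair
    o: (f,)  one-hot / hand-crafted feature vector of the sentence pair
    """
    o_proj = np.tanh(W_o @ o + b_o)                        # project sparse features to size k
    g = sigmoid(W_g @ np.concatenate([d, o_proj]) + b_g)   # element-wise gate in (0, 1)
    return g * d + (1.0 - g) * o_proj                      # convex combination of the two views

# Toy dimensions with random placeholder weights (learned end to end in a real system).
rng = np.random.default_rng(0)
k, f = 8, 20
d = rng.normal(size=k)                         # e.g. a BERT vector for the sentence pair
o = rng.integers(0, 2, size=f).astype(float)   # e.g. binary n-gram overlap features
W_o, b_o = rng.normal(size=(k, f)), np.zeros(k)
W_g, b_g = rng.normal(size=(k, 2 * k)), np.zeros(k)
print(gated_fusion(d, o, W_o, b_o, W_g, b_g))  # fused (k,) representation
```

In a full system, a regression layer on top of the fused vector would predict the similarity score on the 0-5 scale, consistent with the task being framed as regression.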
Methods
  • The authors proposed a novel framework based on a gated network to fuse distributed representation and one-hot representation of sentence pairs.
  • Task definition Formally, the clinical STS task is to determine the similarity of a pair of given sentences, denoted by sim(s1, s2), where s1 is a sentence of length m and s2 is a sentence of length n.
  • The sentence “Indication, Site, and Additional Prescription Instructions: Apply 1 patch every 24 hours; leave on for up to 12 hours within a 24 hour period” became “indication site additional prescription instruction apply one patch every twenty four hour leave twelve hour within twenty four hour period” after preprocessing.
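
A minimal sketch that reproduces this preprocessing example (lowercasing, punctuation removal, digit-to-word conversion, stopword removal, and crude plural stripping in place of true lemmatization). The stopword list, number map, and lemmatizer below are toy stand-ins, since the paper's exact pipeline is not reproduced here.

```python
import re

# Toy stand-ins for the resources a real pipeline would use
# (full stopword list, lemmatizer, number-to-word converter).
STOPWORDS = {"and", "for", "up", "to", "on", "a", "the", "of"}
NUM_WORDS = {"1": "one", "12": "twelve", "24": "twenty four"}

def lemmatize(token):
    # Crude plural stripping in place of a real lemmatizer (e.g. WordNet's).
    return token[:-1] if token.endswith("s") and len(token) > 3 else token

def preprocess(sentence):
    s = re.sub(r"[^\w\s]", " ", sentence.lower())  # lowercase, drop punctuation
    tokens = []
    for tok in s.split():
        tok = NUM_WORDS.get(tok, tok)              # digits -> words
        if tok in STOPWORDS:
            continue                               # drop stopwords
        tokens.append(lemmatize(tok))
    return " ".join(tokens)

print(preprocess("Indication, Site, and Additional Prescription Instructions: "
                 "Apply 1 patch every 24 hours; leave on for up to 12 hours "
                 "within a 24 hour period"))
# -> indication site additional prescription instruction apply one patch every
#    twenty four hour leave twelve hour within twenty four hour period
```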
Results
  • Compared with systems using only distributed representation or only one-hot representation, the method achieved a much higher Pearson correlation.
  • The system achieved its highest Pearson correlation, 0.8541, when using BERT for fusion, exceeding the best official result of the BioCreative/OHNLP clinical STS shared task in 2018 (0.8328) by 0.0213.
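
Pearson correlation between predicted and gold similarity scores is the metric these results report. The snippet below is a minimal example of computing it; the score values are made up for illustration.

```python
from scipy.stats import pearsonr

gold = [0.0, 1.5, 3.0, 4.5, 5.0]   # annotated similarity scores on the 0-5 scale
pred = [0.3, 1.2, 3.4, 4.1, 4.8]   # hypothetical system predictions
r, p_value = pearsonr(gold, pred)  # returns correlation and two-sided p-value
print(f"Pearson r = {r:.4f}")      # systems in Table 2 are compared by this value
```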
Conclusion
  • The authors investigated three state-of-the-art distributed representation methods, that is, CNN, Bi-LSTM, and BERT. Distributed representation and one-hot representation are complementary to each other and can be fused by a gated network.
Tables
  • Table1: Annotated examples
  • Table2: Performance of systems on the clinical STS corpus of the BioCreative/OHNLP shared task in 2018
Related work
  • There are two main types of sentence representation: (1) sparse one-hot representation based on manually extracted features, and (2) dense distributed representation learned from large labeled data. Over a long period, a large number of feature extraction methods were proposed to represent sentences as one-hot vectors. Gomaa et al [9] summarized several types of features and various similarity computation methods: string-based similarity computation methods such as N-gram [10,11,12], corpus-based similarity methods [13,14,15,16] and knowledge-based similarity computation methods [17,18,19]. In recent years, neural networks have become the mainstream methods for sentence representation and STS. Bromley et al [20] first presented a Siamese architecture to encode sentence pairs. Building on this work, Mueller et al [21] used a Siamese recurrent architecture to learn sentence representations. Tang et al [22] used a deep belief network to learn sentence representations. He et al [23] proposed a novel pairwise word interaction method to measure sentence semantic similarity. Gong et al [24] further hierarchically extracted semantic features from the interaction space. Tai et al [25] used tree-structured LSTM to improve sentence representation. Subramanian et al [26] used transfer learning to learn sentence representations. Neural language models such as ELMo [27] and GPT [28] have also been utilized for sentence representation. Some researchers extracted features at different granularities and combined them with distributed representations, such as He et al [29] and Wang et al [30]. Ji et al [31] combined such features with distributed representation; our work is similar to Ji's, but we used a novel gate to choose how to combine one-hot representation and distributed representation.
Funding
  • This work is supported in part by the following grants: National Natural Science Foundation of China (NSFC) grants U1813215, 61876052 and 61573118; Special Foundation for Technology Research Program of Guangdong Province (2015B010131010); Strategic Emerging Industry Development Special Funds of Shenzhen (JCYJ20170307150528934 and JCYJ20180306172232154); and the Innovation Fund of Harbin Institute of Technology (HIT.NSRIF.2017052).
Study subjects and analysis
sentence pairs with semantic similarity ranging from 0 to 5: 750
The BioCreative/OHNLP organizer manually annotated 750 sentence pairs with semantic similarity ranging from 0 to 5 for system development and 318 sentence pairs for system test. We further divided the 750 sentence pairs into a training set and a development set using stratified sampling, to guarantee that the development set is representative of the overall dataset. Figure 1 shows the fractional similarity interval distribution in the training, development and test sets, and Table 1 lists some annotated examples.
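
A stratified split of this kind can be sketched with scikit-learn by binning the continuous similarity scores and stratifying on the bins. The scores below are random stand-ins, and the 80/20 ratio is an assumption, since the exact split size is not stated here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 750 sentence pairs with gold similarity in [0, 5] (random stand-ins here).
rng = np.random.default_rng(0)
pairs = [f"pair_{i}" for i in range(750)]
scores = rng.uniform(0.0, 5.0, size=750)

# Bin the continuous scores so each similarity interval is represented
# proportionally in both the training and development sets.
bins = np.floor(scores).astype(int)  # intervals [0,1), [1,2), ..., [4,5]
bins[bins == 5] = 4                   # guard: fold an exact 5.0 into the top bin

train_pairs, dev_pairs, train_y, dev_y = train_test_split(
    pairs, scores, test_size=0.2, stratify=bins, random_state=0)
print(len(train_pairs), len(dev_pairs))  # 600 / 150 under the assumed 80/20 split
```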

clinicians: 460
However, the quality of EHRs has faced challenges such as the frequent use of copy-and-paste, templates, and smart phrases, which lead to bloated or erroneous clinical notes [1]. A study of 23,630 clinical notes written by 460 clinicians showed that 46% of the text in the clinical records was copied from other clinical records, 36% was imported from templates, and only 18% was manually entered [2]. To aggregate data from diverse sources and minimize data redundancy, BioCreative/OHNLP organized a shared task to evaluate the semantic similarity between text snippets (also called sentences in this paper) of clinical texts in 2018

References
  • Zhang R, Pakhomov S, McInnes BT, Melton GB. Evaluating measures of redundancy in clinical texts. In: Proceedings of the American Medical Informatics Association Annual Symposium. AMIA; 2011. p. 1612.
  • Wang MD, Khanna R, Najafi N. Characterizing the source of text in electronic health record progress notes. JAMA Intern Med. 2017;177:1212–3.
  • Agirre E, Banea C, Cardie C, Cer D, Diab M, Gonzalez-Agirre A, et al. SemEval-2014 task 10: multilingual semantic textual similarity. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014); 2014. p. 81–91.
  • Agirre E, Cer D, Diab M, Gonzalez-Agirre A, Guo W. *SEM 2013 shared task: semantic textual similarity. In: Second joint conference on lexical and computational semantics (*SEM), volume 1: proceedings of the main conference and the shared task: semantic textual similarity; 2013. p. 32–43.
  • Agirre E, Diab M, Cer D, et al. SemEval-2012 task 6: a pilot on semantic textual similarity. In: Proceedings of the 6th international workshop on semantic evaluation (SemEval 2012); 2012. p. 385–93.
  • Agirre E, Banea C, Cardie C, Cer D, Diab M, Gonzalez-Agirre A, et al. SemEval-2015 task 2: semantic textual similarity, English, Spanish and pilot on interpretability. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015); 2015. p. 252–63.
  • Agirre E, Banea C, Cer D, Diab M, Gonzalez-Agirre A, Mihalcea R, et al. SemEval-2016 task 1: semantic textual similarity, monolingual and cross-lingual evaluation. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval-2016); 2016. p. 497–511.
  • Cer D, Diab M, Agirre E, Lopez-Gazpio I, Specia L. SemEval-2017 task 1: semantic textual similarity multilingual and crosslingual focused evaluation; 2017.
  • Gomaa WH, Fahmy AA. A survey of text similarity approaches. Int J Comput Appl. 2013;68:13–8.
  • Barrón-Cedeño A, Rosso P, Agirre E, Labaka G. Plagiarism detection across distant language pairs. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010). Beijing: Coling 2010 Organizing Committee; 2010. p. 37–45.
  • Wan S, Dras M, Dale R, Paris C. Using dependency-based features to take the ‘para-farce’ out of paraphrase. In: Proceedings of the Australasian language technology workshop 2006; 2006. p. 131–8.
  • Madnani N, Tetreault J, Chodorow M. Re-examining machine translation metrics for paraphrase identification. In: Proceedings of the 2012 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. USA: Association for Computational Linguistics; 2012. p. 182–90.
  • Landauer TK, Dumais ST. A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol Rev. 1997;104:211.
  • Lund K, Burgess C. Producing high-dimensional semantic spaces from lexical co-occurrence. Behav Res Methods Instrum Comput. 1996;28:203–8.
  • Gabrilovich E, Markovitch S. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: Proceedings of the 20th international joint conference on artificial intelligence. San Francisco: Morgan Kaufmann Publishers Inc.; 2007. p. 1606–11.
  • Jiang JJ, Conrath DW. Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of the 10th research on computational linguistics international conference; 1997. p. 19–33.
  • Lin D. Book review: WordNet: an electronic lexical database. Computational Linguistics. 1999;25. https://www.aclweb.org/anthology/J99-2008.
  • Fernando S, Stevenson M. A semantic similarity approach to paraphrase detection. In: Proceedings of the 11th annual research colloquium of the UK Special Interest Group for Computational Linguistics; 2008. p. 45–52.
  • Mihalcea R, Corley C, Strapparava C. Corpus-based and knowledge-based measures of text semantic similarity. In: Proceedings of the 21st national conference on artificial intelligence, volume 1. AAAI Press; 2006. p. 775–80.
  • Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R. Signature verification using a “Siamese” time delay neural network. In: Advances in neural information processing systems; 1994. p. 737–44.
  • Mueller J, Thyagarajan A. Siamese recurrent architectures for learning sentence similarity. In: Proceedings of the thirtieth AAAI conference on artificial intelligence. AAAI Press; 2016. p. 2786–92.
  • Tang D, Qin B, Liu T, Li Z. Learning sentence representation for emotion classification on microblogs. In: Proceedings of Natural Language Processing and Chinese Computing. Springer; 2013. p. 212–23.
  • He H, Lin J. Pairwise word interaction modeling with deep neural networks for semantic similarity measurement. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies; 2016. p. 937–48.
  • Gong Y, Luo H, Zhang J. Natural language inference over interaction space. ArXiv preprint arXiv:1709.04348; 2017.
  • Tai KS, Socher R, Manning CD. Improved semantic representations from tree-structured long short-term memory networks. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers); 2015. p. 1556–66.
  • Subramanian S, Trischler A, Bengio Y, Pal CJ. Learning general purpose distributed sentence representations via large scale multi-task learning. ArXiv preprint arXiv:1804.00079; 2018.
  • Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, et al. Deep contextualized word representations. In: Proceedings of the 2018 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, volume 1 (long papers); 2018. p. 2227–37.
  • Radford A, Narasimhan K, Salimans T, et al. Improving language understanding by generative pre-training. 2018. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf.
  • He H, Gimpel K, Lin J. Multi-perspective sentence similarity modeling with convolutional neural networks. In: Proceedings of the 2015 conference on empirical methods in natural language processing; 2015. p. 1576–86.
  • Wang Z, Hamza W, Florian R. Bilateral multi-perspective matching for natural language sentences. In: Proceedings of the 26th international joint conference on artificial intelligence. AAAI Press; 2017. p. 4144–50.
  • Ji Y, Eisenstein J. Discriminative improvements to distributional sentence similarity. In: Proceedings of the 2013 conference on empirical methods in natural language processing; 2013. p. 891–6.
  • Yin W, Schütze H, Xiang B, Zhou B. ABCNN: attention-based convolutional neural network for modeling sentence pairs. Trans Assoc Comput Linguist. 2016;4:259–72.
  • Devlin J, Chang M-W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. ArXiv preprint arXiv:1810.04805; 2018. http://arxiv.org/abs/1810.04805.
  • Tian J, Zhou Z, Lan M, Wu Y. ECNU at SemEval-2017 task 1: leverage kernel-based traditional NLP features and neural networks to build a universal model for multilingual and cross-lingual semantic textual similarity. In: Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017); 2017. p. 191–7.
  • Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Inf Process Manag. 1988;24:513–23.
Author
Shuai Chen
Haoming Qin
He Cao
Yedan Shen
Xiaolong Wang
Qingcai Chen
Jun Yan