Distributed representation and one-hot representation fusion with gated network for clinical semantic textual similarity.
BMC Medical Informatics and Decision Making 2020, 20(Suppl 1):72
- Semantic textual similarity (STS) is a fundamental natural language processing (NLP) task that is widely used in NLP applications such as question answering (QA) and information retrieval (IR).
- To aggregate data from diverse sources and minimize data redundancy, BioCreative/OHNLP organized a shared task in 2018 to evaluate the semantic similarity between text snippets of clinical texts.
- Distributed representation: In this study, we investigated three types of distributed representations: Siamese Convolutional Neural Network (CNN), Siamese RNN, and Bidirectional Encoder Representations from Transformers (BERT). Siamese CNN and Siamese RNN are two popular neural networks used to represent sentence pairs, while BERT is a recently proposed language representation method.
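The Siamese idea above can be sketched in a few lines: the same encoder (shared weights) maps each sentence of a pair to a fixed-size vector, and the vectors are compared. The sizes, random weights, and token ids below are toy placeholders, not the authors' actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes, not the authors' actual configuration.
VOCAB, EMB, FILTERS, WIDTH = 50, 8, 4, 2
embeddings = rng.normal(size=(VOCAB, EMB))              # word embedding table
W = rng.normal(scale=0.1, size=(FILTERS, WIDTH * EMB))  # shared conv filters
b = np.zeros(FILTERS)

def encode(token_ids):
    """Shared encoder: embed -> 1-D convolution -> ReLU -> max-pooling."""
    x = embeddings[token_ids]                                 # (len, EMB)
    windows = [x[i:i + WIDTH].ravel() for i in range(len(x) - WIDTH + 1)]
    h = np.maximum(0.0, np.stack(windows) @ W.T + b)          # conv + ReLU
    return h.max(axis=0)                                      # pool over positions

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

# Both sentences pass through the SAME encoder (weight sharing = "Siamese").
s1 = encode(np.array([3, 7, 12, 9]))
s2 = encode(np.array([3, 7, 15, 9]))
print(round(cosine(s1, s2), 3))
```

A Siamese RNN follows the same pattern with a recurrent encoder in place of the convolution; BERT instead encodes the concatenated sentence pair jointly.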
- Fusion gate: Inspired by the gating mechanism in RNN variants such as LSTM and GRU (Gated Recurrent Unit), we introduced a gate to leverage both the distributed representation and the one-hot representation.
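A minimal sketch of such a gate, assuming both representations have already been projected to the same dimensionality (the gate parameters are learned in the paper; here they are random placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(d, o, Wg, bg):
    """Fuse a distributed representation d and a one-hot feature vector o.

    An LSTM/GRU-style gate g in (0, 1) decides, per dimension, how much of
    each representation to keep: fused = g * d + (1 - g) * o.
    """
    g = sigmoid(Wg @ np.concatenate([d, o]) + bg)
    return g * d + (1.0 - g) * o

rng = np.random.default_rng(1)
dim = 4
d = rng.normal(size=dim)     # distributed representation
o = rng.normal(size=dim)     # projected one-hot feature vector
Wg = rng.normal(scale=0.1, size=(dim, 2 * dim))   # placeholder gate weights
bg = np.zeros(dim)

fused = gated_fusion(d, o, Wg, bg)
# Each fused component is a convex combination, so it lies between
# the corresponding components of d and o.
assert np.all(fused >= np.minimum(d, o) - 1e-9)
assert np.all(fused <= np.maximum(d, o) + 1e-9)
```

Because the gate is input-dependent, the network can learn to trust the distributed representation for some dimensions and the one-hot features for others.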
- In this study, we investigated three state-of-the-art distributed representation methods, namely CNN, Bi-LSTM, and BERT, and proposed a novel framework based on a gated network to fuse the distributed representation and one-hot representation of sentence pairs.
- In this paper, we proposed a novel framework to fuse distributed representation and one-hot representation using a gated network for clinical STS
- Task definition: Formally, the clinical STS task is to determine the similarity of a pair of given sentences, denoted by sim(s1, s2), where s1 is a sentence of length m and s2 is a sentence of length n.
- The sentence “Indication, Site, and Additional Prescription Instructions: Apply 1 patch every 24 hours; leave on for up to 12 hours within a 24 hour period” became “indication site additional prescription instruction apply one patch every twenty four hour leave twelve hour within twenty four hour period” after preprocessing
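The preprocessing illustrated above can be reproduced with a rough sketch: lowercase, strip punctuation, spell out numbers, drop stopwords, and crudely singularize plurals. The stopword list and the trailing-"s" heuristic below are my assumptions reverse-engineered from this single example; the paper does not specify its exact tool chain:

```python
import re

# Hypothetical components inferred from the example, not the authors' exact setup.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
STOPWORDS = {"a", "an", "the", "for", "up", "to", "on", "and", "of"}

def number_to_words(n):
    """Spell out integers below 100 (enough for this example)."""
    if n < 20:
        return ONES[n]
    return TENS[n // 10] + (" " + ONES[n % 10] if n % 10 else "")

def preprocess(text):
    tokens = re.findall(r"[a-z]+|\d+", text.lower())   # drop punctuation
    out = []
    for t in tokens:
        if t.isdigit():
            out.append(number_to_words(int(t)))
        elif t in STOPWORDS:
            continue
        else:
            # Crude singularization; a real pipeline would use a lemmatizer.
            out.append(t[:-1] if t.endswith("s") and len(t) > 3 else t)
    return " ".join(out)

text = ("Indication, Site, and Additional Prescription Instructions: Apply 1 "
        "patch every 24 hours; leave on for up to 12 hours within a 24 hour period")
print(preprocess(text))
# -> indication site additional prescription instruction apply one patch every
#    twenty four hour leave twelve hour within twenty four hour period
```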
- Compared with systems using only the distributed representation or only the one-hot representation, the method achieved a much higher Pearson correlation.
- The system achieved its highest Pearson correlation, 0.8541, when using BERT for fusion, exceeding the best official result of the 2018 BioCreative/OHNLP clinical STS shared task (0.8328) by 0.0213.
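The Pearson correlation used for evaluation measures the linear agreement between predicted and gold similarity scores. A self-contained implementation (the score lists below are illustrative values, not data from the paper):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy gold vs. predicted similarity scores on a 0-5 scale (illustrative only).
gold = [0.0, 1.5, 2.0, 3.5, 4.0, 5.0]
pred = [0.2, 1.1, 2.4, 3.0, 4.2, 4.8]
print(round(pearson(gold, pred), 4))
```

A value of 1.0 means perfect linear agreement, so the reported 0.8541 indicates strong but imperfect correlation with the human-annotated scores.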
- The authors investigated three state-of-the-art distributed representation methods, namely CNN, Bi-LSTM, and BERT. Distributed representation and one-hot representation are complementary to each other and can be fused by a gated network.
Xiong et al. BMC Medical Informatics and Decision Making 2020, 20(Suppl 1):72.
- Table 1: Annotated examples
- Table 2: Performance of systems on the clinical STS corpus of the BioCreative/OHNLP shared task in 2018
- There are two main types of sentence representation: (1) sparse one-hot representation based on manually extracted features, and (2) dense distributed representation learnt from large labeled data. Over a long period, a large number of feature extraction methods were proposed to represent a sentence by a one-hot vector. Gomaa et al. summarized several types of features and various similarity computation methods: string-based methods such as N-gram [10,11,12], corpus-based methods [13,14,15,16], and knowledge-based methods [17,18,19]. In recent years, neural networks have become the mainstream methods for sentence representation and STS. Bromley et al. first presented a Siamese architecture to encode sentence pairs. Building on that work, Mueller et al. used a Siamese recurrent architecture to learn sentence representations. Tang et al. used a deep belief network to learn sentence representations. He et al. proposed a novel pairwise word interaction method to measure sentence semantic similarity. Gong et al. further hierarchically extracted semantic features from the interaction space. Tai et al. used tree-structured LSTM to improve sentence representation. Subramanian et al. used transfer learning to learn sentence representations. Neural language models such as ELMo and GPT have also been utilized for sentence representation. Some researchers, such as He et al. and Wang et al., extracted features at different granularities and combined them with distributed representations. Ji et al. combined such features with distributed representation; our work is similar to Ji's, but we used a novel gate to choose how to combine the one-hot representation and the distributed representation.
- This work is supported in part by grants: NSFCs (National Natural Science Foundations of China) (U1813215, 61876052 and 61573118), Special Foundation for Technology Research Program of Guangdong Province (2015B010131010), Strategic Emerging Industry Development Special Funds of Shenzhen (JCYJ20170307150528934 and JCYJ20180306172232154), Innovation Fund of Harbin Institute of Technology (HIT.NSRIF.2017052).
- Zhang R, Pakhomov S, McInnes BT, Melton GB. Evaluating measures of redundancy in clinical texts. In: Proceedings of American Medical Informatics Association Annual Symposium. AMIA; 2011. p. 1612.
- Wang MD, Khanna R, Najafi N. Characterizing the source of text in electronic health record progress notes. JAMA Intern Med. 2017;177:1212–3.
- Agirre E, Banea C, Cardie C, Cer D, Diab M, Gonzalez-Agirre A, et al. Semeval-2014 task 10: multilingual semantic textual similarity. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014); 2014. p. 81–91.
- Agirre E, Cer D, Diab M, Gonzalez-Agirre A, Guo W. * SEM 2013 shared task: semantic textual similarity. In: Second joint conference on lexical and computational semantics (* SEM), volume 1: proceedings of the Main conference and the shared task: semantic textual similarity; 2013. p. 32–43.
- Agirre E, Diab M, Cer D, et al. Semeval-2012 task 6: A pilot on semantic textual similarity. In: Proceedings of the 6th International Workshop on Semantic Evaluation. (SemEval 2012); 2012. p. 385–93.
- Agirre E, Banea C, Cardie C, Cer D, Diab M, Gonzalez-Agirre A, et al. Semeval-2015 task 2: semantic textual similarity, english, spanish and pilot on interpretability. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015); 2015. p. 252–63.
- Agirre E, Banea C, Cer D, Diab M, Gonzalez-Agirre A, Mihalcea R, et al. Semeval-2016 task 1: semantic textual similarity, monolingual and crosslingual evaluation. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval-2016); 2016. p. 497–511.
- Cer D, Diab M, Agirre E, Lopez-Gazpio I, Specia L. SemEval-2017 task 1: semantic textual similarity multilingual and crosslingual focused evaluation. In: Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017); 2017.
- Gomaa WH, Fahmy AA. A survey of text similarity approaches. Int J Comput Appl. 2013;68:13–8.
- Barrón-Cedeno A, Rosso P, Agirre E, Labaka G. Plagiarism detection across distant language pairs. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010). Beijing: Coling 2010 Organizing Committee; 2010. p. 37–45.
- Wan S, Dras M, Dale R, Paris C. Using dependency-based features to take the 'para-farce' out of paraphrase. In: Proceedings of the Australasian language technology workshop 2006; 2006. p. 131–8.
- Madnani N, Tetreault J, Chodorow M. Re-examining machine translation metrics for paraphrase identification. In: Proceedings of the 2012 conference of the north American chapter of the Association for Computational Linguistics: human language technologies. USA: Association for Computational Linguistics; 2012. p. 182–90.
- Landauer TK, Dumais ST. A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol Rev. 1997;104:211.
- Lund K, Burgess C. Producing high-dimensional semantic spaces from lexical co-occurrence. Behav Res Methods Instrum Comput. 1996;28:203–8.
- Gabrilovich E, Markovitch S. Computing Semantic Relatedness Using Wikipedia-Based Explicit Semantic Analysis. In: Proceedings of the 20th International Joint Conference on Artifical Intelligence. San Francisco: Morgan Kaufmann Publishers Inc.; 2007. p. 1606–11.
- Jiang JJ, Conrath DW. Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of the 10th research on computational linguistics international conference; 1997. p. 19–33.
- Lin D. Book Reviews: WordNet: An Electronic Lexical Database. Computational Linguistics. 1999;25. https://www.aclweb.org/anthology/J99-2008.
- Fernando S, Stevenson M. A semantic similarity approach to paraphrase detection. In: Proceedings of the 11th annual research colloquium of the UK special interest Group for Computational Linguistics; 2008. p. 45–52.
- Mihalcea R, Corley C, Strapparava C. Corpus-Based and Knowledge-Based Measures of Text Semantic Similarity. In: Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1. AAAI Press; 2006. p. 775–80.
- Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R. Signature verification using a “siamese” time delay neural network. In: Advances in neural information processing systems; 1994. p. 737–44.
- Mueller J, Thyagarajan A. Siamese Recurrent Architectures for Learning Sentence Similarity. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI Press; 2016. p. 2786–92.
- Tang D, Qin B, Liu T, Li Z. Learning sentence representation for emotion classification on microblogs. In: Proceedings of Natural Language Processing and Chinese Computing. Springer; 2013. p. 212–23.
- He H, Lin J. Pairwise word interaction modeling with deep neural networks for semantic similarity measurement. In: Proceedings of the 2016 conference of the north American chapter of the Association for Computational Linguistics: human language technologies; 2016. p. 937–48.
- Gong Y, Luo H, Zhang J. Natural language inference over interaction space. ArXiv preprint arXiv:1709.04348; 2017.
- Tai KS, Socher R, Manning CD. Improved semantic representations from tree-structured long short-term memory networks. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers); 2015. p. 1556–66.
- Subramanian S, Trischler A, Bengio Y, Pal CJ. Learning general purpose distributed sentence representations via large scale multi-task learning. ArXiv preprint arXiv:1804.00079; 2018.
- Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, et al. Deep contextualized word representations. In: Proceedings of the 2018 conference of the north American chapter of the Association for Computational Linguistics: human language technologies, volume 1 (long papers); 2018. p. 2227–37.
- Radford A, Narasimhan K, Salimans T, et al. Improving language understanding by generative pre-training. 2018. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf.
- He H, Gimpel K, Lin J. Multi-perspective sentence similarity modeling with convolutional neural networks. In: Proceedings of the 2015 conference on empirical methods in natural language processing; 2015. p. 1576–86.
- Wang Z, Hamza W, Florian R. Bilateral Multi-Perspective Matching for Natural Language Sentences. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence. AAAI Press; 2017. p. 4144–50.
- Ji Y, Eisenstein J. Discriminative improvements to distributional sentence similarity. In: Proceedings of the 2013 conference on empirical methods in natural language processing; 2013. p. 891–6.
- Yin W, Schütze H, Xiang B, Zhou B. ABCNN: attention-based convolutional neural network for modeling sentence pairs. Trans Assoc Comput Linguist. 2016;4:259–72.
- Devlin J, Chang M-W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR; 2018. abs/1810.04805. http://arxiv.org/abs/1810.04805.
- Tian J, Zhou Z, Lan M, Wu Y. ECNU at SemEval-2017 task 1: leverage kernel-based traditional NLP features and neural networks to build a universal model for multilingual and cross-lingual semantic textual similarity. In: Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017); 2017. p. 191–7.
- Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Inf Process Manag. 1988;24:513–23.