Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation

We propose an end-to-end low-resource medical dialogue generation model that meta-learns a model initialization from source diseases with the ability of fast adaptation to new diseases.

Abstract:

Human doctors with well-structured medical knowledge can diagnose a disease merely via a few conversations with patients about symptoms. In contrast, existing knowledge-grounded dialogue systems often require a large number of dialogue instances to learn, as they fail to capture the correlations between different diseases and neglect the …

Introduction
  • Medical dialogue system (MDS) aims to converse with patients to inquire about additional symptoms beyond their self-reports and to make a diagnosis automatically, and it has gained increasing attention (Lin et al. 2019; Wei et al. 2018; Xu et al. 2019).
  • Preliminary diagnosis reports generated by an MDS may help doctors make a diagnosis more efficiently.
  • Because of these considerable benefits, many researchers have devoted substantial effort to critical sub-problems in MDS, such as natural language understanding (Shi et al. 2020; Lin et al. 2019).
  • Fine-tuning large pre-trained language models in the medical domain requires sufficient task-specific data (Bansal, Jha, and McCallum 2019; Dou, Yu, and Anastasopoulos 2019) to learn the correlations between diseases and symptoms.
  • In practice, the disease-symptom relations of each disease may vary or evolve as more cases accumulate, which prior works do not consider.
Highlights
  • Medical dialogue system (MDS) aims to converse with patients to inquire about additional symptoms beyond their self-reports and to make a diagnosis automatically, and it has gained increasing attention (Lin et al. 2019; Wei et al. 2018; Xu et al. 2019).
  • The second contribution is that we further develop a novel Graph-Evolving Meta-Learning (GEML) framework to transfer diagnostic experience in the low-resource scenario.
  • We propose to evolve the commonsense graph based on the dialogue instances and to learn the induced meta-knowledge graph during the meta-training and adaptation phases (see the sketch after this list).
  • For the CMDD dataset, compared with the two other base models (NKD and POKS), our meta-knowledge graph reasoning (MGR) always achieves the best performance in terms of both automatic and human evaluation, indicating the superiority of our end-to-end medical dialogue model.
  • We propose an end-to-end low-resource medical dialogue generation model that meta-learns a model initialization from source diseases with the ability of fast adaptation to new diseases.
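The graph-evolving idea referenced above can be illustrated with a minimal, hypothetical sketch: new disease-symptom edges are induced from entity co-occurrence in the dialogue instances of a target disease. The function name, the co-occurrence threshold, and the plain edge-set representation are illustrative assumptions, not the paper's actual algorithm (which further reasons over the evolved meta-knowledge graph with the dialogue model).

```python
# A minimal sketch (not the authors' exact algorithm): evolve a seed
# commonsense graph by adding entity pairs that co-occur frequently in the
# dialogue instances of a new disease. `min_cooccur` is an illustrative
# threshold, not a value from the paper.
from collections import Counter
from itertools import combinations


def evolve_graph(seed_edges, dialogues, min_cooccur=2):
    """seed_edges: set of (entity_a, entity_b) tuples from the commonsense graph.
    dialogues: list of dialogues, each given as a list of extracted entity names.
    Returns the edge set augmented with frequently co-occurring entity pairs."""
    cooccur = Counter()
    for entities in dialogues:
        for a, b in combinations(sorted(set(entities)), 2):
            cooccur[(a, b)] += 1

    evolved = set(seed_edges)
    for pair, count in cooccur.items():
        if count >= min_cooccur:
            evolved.add(pair)  # edge induced by the dialogue corpus
    return evolved


# Usage: a seed graph plus two target-disease dialogues yields a new edge
# such as ("chest pain", "pneumonia").
seed = {("pneumonia", "cough"), ("pneumonia", "fever")}
dialogs = [["pneumonia", "chest pain", "fever"], ["chest pain", "pneumonia"]]
print(sorted(evolve_graph(seed, dialogs)))
```

In the full framework, such evolved edges would feed a graph reasoning module during meta-training and adaptation rather than being used directly.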
Methods
  • Automatic metrics (BLEU and Entity-F1) are reported separately for each target disease, and human evaluation scores knowledge rationality and generation quality for each instance.
  • Instances with very few turns or entities, or containing private information, were all discarded (a filtering sketch follows this list).
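As a rough illustration of the filtering rule above, the sketch below drops instances with too few turns or entities, or with detectable private information. The thresholds, the record layout, and the privacy pattern are assumptions for illustration; the authors do not report their exact criteria here.

```python
# Illustrative preprocessing filter for the rule described above. The
# thresholds, record fields, and privacy pattern are assumptions, not the
# authors' reported criteria.
import re

PRIVATE_PATTERN = re.compile(r"(phone|id card|address)", re.IGNORECASE)


def keep_instance(dialogue, min_turns=3, min_entities=2):
    """dialogue: dict with 'turns' (list of utterances) and 'entities' (list of names)."""
    if len(dialogue["turns"]) < min_turns:
        return False                      # too few turns
    if len(dialogue["entities"]) < min_entities:
        return False                      # too few annotated entities
    if any(PRIVATE_PATTERN.search(turn) for turn in dialogue["turns"]):
        return False                      # drop dialogues leaking private information
    return True


corpus = [
    {"turns": ["Hello, doctor.", "I have a cough and fever.", "How long?"],
     "entities": ["cough", "fever"]},
    {"turns": ["My phone is ..."], "entities": ["cough"]},
]
filtered = [d for d in corpus if keep_instance(d)]  # keeps only the first instance
```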
Results
  • Automatic Evaluation.
  • The authors adopt two automatic metrics for performance comparison, as shown in Table 2.
  • To evaluate generation quality, the authors use the average of sentence-level BLEU-1, 2, 3 and 4 (Chen and Cherry 2014), denoted as BLEU.
  • To evaluate the success rate in the entity prediction task, the authors adopt Entity-F1, namely the F1 score between the predicted and reference entity sets (see the metric sketch after this list).
  • Case-study example, patient turn 1: “Hello, doctor. I have been feeling nauseous recently, and my stomach is suffering from indigestion.”
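The two metrics above can be computed roughly as sketched below, assuming NLTK's `sentence_bleu` with the Chen and Cherry (2014) smoothing it implements; which smoothing method the authors used is not stated here, so `method1` is an assumption.

```python
# Sketch of the two automatic metrics: average sentence-level BLEU-1..4 with
# Chen & Cherry (2014) smoothing (via NLTK), and Entity-F1 between predicted
# and reference entity sets. The choice of smoothing method1 is an assumption.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction


def avg_sentence_bleu(reference_tokens, hypothesis_tokens):
    """Average of sentence-level BLEU-1/2/3/4 for one generated reply."""
    smooth = SmoothingFunction().method1
    scores = []
    for n in range(1, 5):
        weights = tuple(1.0 / n if i < n else 0.0 for i in range(4))
        scores.append(sentence_bleu([reference_tokens], hypothesis_tokens,
                                    weights=weights, smoothing_function=smooth))
    return sum(scores) / 4.0


def entity_f1(predicted_entities, reference_entities):
    """F1 score between the predicted and reference entity sets."""
    predicted, reference = set(predicted_entities), set(reference_entities)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Usage on one toy reply.
print(avg_sentence_bleu("do you have a fever".split(), "do you have fever".split()))
print(entity_f1(["fever", "cough"], ["fever"]))
```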
Conclusion
  • The authors propose an end-to-end low-resource medical dialogue generation model that meta-learns a model initialization from source diseases with the ability of fast adaptation to new diseases.
  • The authors develop a Graph-Evolving Meta-Learning (GEML) framework that learns to quickly evolve a meta-knowledge graph for adapting to new diseases and reasoning about disease-symptom correlations.
  • The dialogue generation model enjoys this fast-learning ability and can handle low-resource medical dialogue tasks well.
  • Experimental results verify the advantages of the approach.
Tables
  • Table 1: Statistics of the CMDD dataset (Lin et al. 2019) and our Chunyu dataset.
  • Table 2: Results on the two datasets in terms of automatic metrics (×10²) and human evaluation (on a 5-point scale). Top: for the CMDD dataset, target diseases 1–4 refer to “bronchitis”, “functional dyspepsia”, “infantile diarrhea” and “upper respiratory infection”, respectively. Bottom: for the Chunyu dataset, target diseases 1–4 refer to “liver cirrhosis”, “ileus”, “pneumonia”, and “pancreatitis”.
  • Table 3: Results of ablation studies on the two datasets (×10²).
Related work
  • Medical Dialogue System (MDS). Recent research on MDS mostly focuses on natural language understanding (NLU) or dialogue management (DM) within pipeline-based dialogue systems. Various NLU problems have been studied to improve MDS performance, e.g., entity inference (Du et al. 2019b; Lin et al. 2019; Liu et al. 2020), symptom extraction (Du et al. 2019a) and slot-filling (Shi et al. 2020). For medical dialogue management, most works (Dhingra et al. 2017; Li et al. 2017) focus on reinforcement learning (RL) based task-oriented dialogue systems. Wei et al. (2018) proposed to learn a dialogue policy with RL to facilitate automatic diagnosis. Xu et al. (2019) incorporated knowledge inference into dialogue management via RL. However, little attention has been paid to medical dialogue generation, which is a critical recipe in MDS. Differing from existing approaches, we investigate building an end-to-end graph-guided medical dialogue generation model directly.
  • Knowledge-Grounded Dialogue Generation. Recently, dialogue generation grounded on extra knowledge has emerged as an important step towards human-like conversational AI, where the knowledge may be derived from commonsense or open-domain knowledge graphs (Zhou et al. 2018; Zhang et al. 2020; Moon et al. 2019) or retrieved from unstructured documents (Lian et al. 2019; Zhao et al. 2019; Kim, Ahn, and Kim 2020). Different from these works, our MDG model is built on a dedicated medical-domain knowledge graph and further requires evolving it to satisfy the needs of real-world diagnosis.
  • Meta-Learning. By meta-training a model initialization on training tasks with the ability of fast adaptation to new tasks, meta-learning (Finn, Abbeel, and Levine 2017; Zhou et al. 2019, 2020) has achieved promising results in many NLP areas, such as machine translation (Gu et al. 2018), task-oriented dialogues (Qian and Yu 2019; Mi et al. 2019), and text classification (Wu, Xiong, and Wang 2019; Obamuyide and Vlachos 2019). However, little effort has been devoted to applying meta-learning to MDS, which requires grounding on external medical knowledge and reasoning over disease-symptom correlations. In this work, we employ Reptile (Nichol, Achiam, and Schulman 2018), a first-order model-agnostic meta-learning approach, because of its efficiency and effectiveness, and enhance it with meta-knowledge graph reasoning and evolving. A Reptile-style meta-training sketch follows.
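As a rough sketch of the Reptile-style meta-training mentioned above, the PyTorch loop below adapts the current initialization on one source-disease task and then moves the initialization toward the adapted weights. The task sampler, the `model.loss` interface, and all hyperparameters are placeholders; this is not the authors' training code, which additionally evolves and reasons over the meta-knowledge graph inside each task.

```python
# Reptile-style outer loop over source-disease tasks (Nichol, Achiam, and
# Schulman 2018), sketched in PyTorch. The task sampler, the `model.loss`
# interface, and all hyperparameters are illustrative assumptions.
import copy
import torch


def reptile_meta_train(model, sample_task_batches, meta_steps=1000,
                       inner_steps=5, inner_lr=1e-3, meta_lr=0.1):
    """sample_task_batches() returns an iterable of (inputs, targets) mini-batches
    drawn from one randomly chosen source disease."""
    for _ in range(meta_steps):
        start = copy.deepcopy(model.state_dict())
        inner_opt = torch.optim.SGD(model.parameters(), lr=inner_lr)

        # Inner loop: adapt the model to the sampled disease task.
        for _, (inputs, targets) in zip(range(inner_steps), sample_task_batches()):
            loss = model.loss(inputs, targets)  # assumed model API
            inner_opt.zero_grad()
            loss.backward()
            inner_opt.step()

        # Reptile update: move the initialization toward the adapted weights.
        adapted = model.state_dict()
        merged = {k: start[k] + meta_lr * (adapted[k] - start[k])
                  if torch.is_floating_point(adapted[k]) else adapted[k]
                  for k in start}
        model.load_state_dict(merged)
    return model
```

At adaptation time, the same inner loop would be run on the few dialogues of a target disease, starting from the meta-learned initialization.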
Funding
  • This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grants No. U19A2073 and No. 61976233, the Guangdong Province Basic and Applied Basic Research (Regional Joint Fund-Key) under Grant No. 2019B1515120039, the Natural Science Foundation of Shenzhen under Grant No. 2019191361, Zhijiang Lab's Open Fund (No. 2020AA3AB14), and the CSIG Young Fellow Support Fund.
Study subjects and analysis
Human Evaluation. We invited five well-educated graduate students majoring in medicine to score 100 generated replies for each method. For each dataset, the evaluators were asked to grade each case independently in terms of “knowledge rationality” and “generation quality”, on a scale from 1 (strongly bad) to 5 (strongly good).

Reference
  • Bahdanau, D.; Cho, K.; and Bengio, Y. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
  • Bansal, T.; Jha, R.; and McCallum, A. 2019. Learning to FewShot Learn Across Diverse Natural Language Classification Tasks. arXiv preprint arXiv:1911.03863.
  • Chen, B.; and Cherry, C. 2014. A systematic comparison of smoothing techniques for sentence-level bleu. In Proceedings of the Ninth Workshop on Statistical Machine Translation, 362–367.
  • Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • Dhingra, B.; Li, L.; Li, X.; Gao, J.; Chen, Y.-N.; Ahmed, F.; and Deng, L. 2017. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access. In ACL, 484–495.
  • Dou, Z.-Y.; Yu, K.; and Anastasopoulos, A. 2019. Investigating Meta-Learning Algorithms for Low-Resource Natural Language Understanding Tasks. In EMNLP-IJCNLP, 1192– 1197.
  • Du, N.; Chen, K.; Kannan, A.; Tran, L.; Chen, Y.; and Shafran, I. 2019a. Extracting Symptoms and their Status from Clinical Conversations. In ACL, 915–925.
  • Du, N.; Wang, M.; Tran, L.; Lee, G.; and Shafran, I. 2019b. Learning to Infer Entities, Properties and their Relations from Clinical Conversations. In EMNLP-IJCNLP, 4978–4989.
  • Finn, C.; Abbeel, P.; and Levine, S. 2017. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. In ICML, 1126–1135.
  • Gardner, M.; Grus, J.; Neumann, M.; Tafjord, O.; Dasigi, P.; Liu, N.; Peters, M.; Schmitz, M.; and Zettlemoyer, L. 2018. Allennlp: A deep semantic natural language processing platform. arXiv preprint arXiv:1803.07640.
  • Gu, J.; Wang, Y.; Chen, Y.; Li, V. O. K.; and Cho, K. 2018. Meta-Learning for Low-Resource Neural Machine Translation. In EMNLP, 3622–3631.
  • Hochreiter, S.; and Schmidhuber, J. 1997. Long short-term memory. Neural computation 9(8): 1735–1780.
  • Kao, H.-C.; Tang, K.-F.; and Chang, E. Y. 2018. ContextAware Symptom Checking for Disease Diagnosis Using Hierarchical Reinforcement Learning. In AAAI, 2305–2313.
  • Kim, B.; Ahn, J.; and Kim, G. 2020. Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue. arXiv preprint arXiv:2002.07510.
  • Li, X.; Chen, Y.-N.; Li, L.; Gao, J.; and Celikyilmaz, A. 2017. End-to-End Task-Completion Neural Dialogue Systems. In IJCNLP, 733–743.
  • Lian, R.; Xie, M.; Wang, F.; Peng, J.; and Wu, H. 2019. Learning to select knowledge for response generation in dialog systems. IJCAI.
  • Lin, X.; He, X.; Chen, Q.; Tou, H.; Wei, Z.; and Chen, T. 2019. Enhancing Dialogue Symptom Diagnosis with Global Attention and Symptom Graph. In EMNLP-IJCNLP, 5032– 5041.
  • Liu, S.; Chen, H.; Ren, Z.; Feng, Y.; Liu, Q.; and Yin, D. 2018. Knowledge Diffusion for Neural Dialogue Generation. In ACL, 1489–1498.
  • Liu, W.; Tang, J.; Qin, J.; Xu, L.; Li, Z.; and Liang, X. 2020. MedDG: A Large-scale Medical Consultation Dataset for Building Medical Dialogue System. arXiv preprint arXiv:2010.07497.
  • Luo, R.; Xu, J.; Zhang, Y.; Ren, X.; and Sun, X. 2019. PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation. CoRR abs/1906.11455. URL https://arxiv.org/abs/1906.11455.
  • Mi, F.; Huang, M.; Zhang, J.; and Faltings, B. 2019. MetaLearning for Low-resource Natural Language Generation in Task-oriented Dialogue Systems. In IJCAI, 3151–3157.
  • Moon, S.; Shah, P.; Kumar, A.; and Subba, R. 2019. Opendialkg: Explainable conversational reasoning with attentionbased walks over knowledge graphs. In ACL, 845–854.
  • Nichol, A.; Achiam, J.; and Schulman, J. 2018. On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999.
  • Obamuyide, A.; and Vlachos, A. 2019. Model-Agnostic Meta-Learning for Relation Classification with Limited Supervision. In ACL, 5873–5879.
  • Qian, K.; and Yu, Z. 2019. Domain Adaptive Dialog Generation via Meta Learning. In ACL, 2639–2649.
  • Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; and Sutskever, I. 2019. Language models are unsupervised multitask learners. OpenAI blog 1(8): 9.
  • See, A.; Liu, P. J.; and Manning, C. D. 2017. Get To The Point: Summarization with Pointer-Generator Networks. In ACL, 1073–1083.
  • Serban, I. V.; Sordoni, A.; Bengio, Y.; Courville, A.; and Pineau, J. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. In AAAI.
  • Shi, X.; Hu, H.; Che, W.; Sun, Z.; Liu, T.; and Huang, J. 2020. Understanding Medical Conversations with Scattered Keyword Attention and Weak Supervision from Responses. In AAAI, 8838–8845.
  • Song, K.; Tan, X.; Qin, T.; Lu, J.; and Liu, T.-Y. 2019. MASS: Masked Sequence to Sequence Pre-training for Language Generation. In ICML, 5926–5936.
  • Sutskever, I.; Vinyals, O.; and Le, Q. V. 2014. Sequence to sequence learning with neural networks. In Advances in neural information processing systems, 3104–3112.
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. Attention is all you need. In Advances in neural information processing systems, 5998–6008.
  • Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; and Bengio, Y. 2018. Graph Attention Networks. ICLR.
  • Wei, Z.; Liu, Q.; Peng, B.; Tou, H.; Chen, T.; Huang, X.J.; Wong, K.-F.; and Dai, X. 2018. Task-oriented dialogue system for automatic diagnosis. In ACL, 201–207.
  • Wu, J.; Xiong, W.; and Wang, W. Y. 2019. Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification. In EMNLP-IJCNLP, 4354–4364.
  • Xu, L.; Zhou, Q.; Gong, K.; Liang, X.; Tang, J.; and Lin, L. 2019. End-to-end knowledge-routed relational dialogue system for automatic diagnosis. In AAAI, 7346–7353.
  • Zhang, H.; Liu, Z.; Xiong, C.; and Liu, Z. 2020. Grounded conversation generation as guided traverses in commonsense knowledge graphs. In ACL, 2031–2043.
  • Zhao, X.; Wu, W.; Tao, C.; Xu, C.; Zhao, D.; and Yan, R. 2019. Low-Resource Knowledge-Grounded Dialogue Generation. In ICLR.
  • Zhou, H.; Young, T.; Huang, M.; Zhao, H.; Xu, J.; and Zhu, X. 2018. Commonsense knowledge aware conversation generation with graph attention. In IJCAI, 4623–4629.
  • Zhou, P.; Yuan, X.; Xu, H.; and Yan, S. 2019. Efficient Meta Learning via Minibatch Proximal Update. In NeurIPS.
  • Zhou, P.; Zou, Y.; Yuan, X.; Feng, J.; Xiong, C.; and Hoi, S. C. 2020. Task Similarity Aware Meta Learning: Theory-inspired Improvement on MAML. In 4th Workshop on Meta-Learning at NeurIPS 2020.