API Entity and Relation Joint Extraction from Text via Dynamic Prompt-tuned Language Model

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY(2024)

引用 7|浏览116
暂无评分
摘要
Extraction of Application Programming Interfaces (APIs) and their semantic relations from unstructured text (e.g., Stack Overflow) is a fundamental work for software engineering tasks (e.g., API recommendation). However, existing approaches are rule based and sequence labeling based. They must manually enumerate the rules or label data for a wide range of sentence patterns, which involves a significant amount of labor overhead and is exacerbated by morphological and common-word ambiguity. In contrast to matching or labeling API entities and relations, this article formulates heterogeneous API extraction and API relation extraction task as a sequence-to-sequence generation task and proposes the API Entity-Relation Joint Extraction framework (AERJE), an API entity-relation joint extraction model based on the large pre-trained language model. After training on a small number of ambiguous but correctly labeled data, AERJE builds a multi-task architecture that extracts API entities and relations from unstructured text using dynamic prompts. We systematically evaluate AERJE on a set of long and ambiguous sentences from Stack Overflow. The experimental results show that AERJE achieves high accuracy and discrimination ability in API entity-relation joint extraction, even with zero or few-shot fine-tuning.
更多
查看译文
关键词
API entity,API relation,joint extraction,dynamic prompt
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要