Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling
CoRR (2024)
Abstract
Topic modelling, as a well-established unsupervised technique, has found
extensive use in automatically detecting significant topics within a corpus of
documents. However, classic topic modelling approaches (e.g., LDA) have certain
drawbacks, such as the lack of semantic understanding and the presence of
overlapping topics. In this work, we investigate the untapped potential of
large language models (LLMs) as an alternative for uncovering the underlying
topics within extensive text corpora. To this end, we introduce a framework
that prompts LLMs to generate topics from a given set of documents and
establish evaluation protocols to assess the clustering efficacy of LLMs. Our
findings indicate that LLMs with appropriate prompts can stand out as a viable
alternative, capable of generating relevant topic titles and adhering to human
guidelines to refine and merge topics. Through in-depth experiments and
evaluation, we summarise the advantages and constraints of employing LLMs in
topic extraction.
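The framework described above prompts an LLM to assign topic titles to documents. As a minimal sketch (not the authors' actual prompts or protocol), the core loop can be illustrated as follows; `call_llm` is a placeholder for any chat-completion API, and the prompt wording and output format are assumptions for illustration:

```python
def build_topic_prompt(documents):
    """Assemble a prompt asking the model to name the topic of each document."""
    numbered = "\n".join(f"{i + 1}. {doc}" for i, doc in enumerate(documents))
    return (
        "You are given a list of documents. Propose a short topic title "
        "for each one, reusing an existing title when documents share a topic.\n\n"
        f"Documents:\n{numbered}\n\n"
        "Answer with one 'index: topic' line per document."
    )

def parse_topic_lines(response):
    """Parse 'index: topic' lines into a {doc_index: topic_title} mapping."""
    topics = {}
    for line in response.strip().splitlines():
        idx, _, title = line.partition(":")
        if title:
            topics[int(idx.strip())] = title.strip()
    return topics

def extract_topics(documents, call_llm):
    """One prompting round: call_llm maps a prompt string to a response string."""
    return parse_topic_lines(call_llm(build_topic_prompt(documents)))
```

Because the topic assignment is just a document-to-title mapping, it can be scored with standard clustering metrics against gold labels, which is the kind of evaluation protocol the abstract refers to.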