ChatPapers: An AI Chatbot for Interacting with Academic Research

Max Dean, Raymond R. Bond, Michael F. McTear, Maurice D. Mulvenna

2023 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS (2023)


Abstract

A large and growing number of computer science papers are being published, which makes it challenging to keep up with the latest research. This paper describes the development of a large language model (LLM) augmentation chatbot and user interface that responds to research queries in the domain of computer science. Around 200,000 computer science research papers from arXiv were embedded, resulting in approximately 11 million vectors (based on 'chunks' from the papers). Each vector comprises 384 dimensions. Technologies used include Langchain, a vector database, and semantic search with document and query embeddings. The chatbot was tested using 30 sample questions that computer science students might ask, spanning several topics and different education levels (i.e., BSc, MSc and PhD level). The responses from this chatbot were compared with those from GPT-4, and responses with and without prompting were also compared. Readability metrics (Flesch-Kincaid and Coleman-Liau) were used to compare the responses from this LLM with GPT-4. Retrieval Augmented Generation Assessment (RAGAS), a novel LLM self-evaluation method, was used to evaluate the system. We observed that the developed system provides more suitable responses to the user based on the readability level at which the questions were asked.
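
The two readability metrics named in the abstract follow standard published formulas, and a rough sketch may help show what they measure. The Python below is an illustration only, not the authors' evaluation code: the function names, the tokenisation rules, and especially the vowel-group syllable counter are assumptions made for this example (established readability tools use more careful syllable counting, so their scores will differ somewhat).

```python
import re


def count_syllables(word: str) -> int:
    """Crude vowel-group heuristic (an assumption, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("make"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int, int]:
    """Return (letters, words, sentences, syllables) using simple regex rules."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    letters = sum(len(w) for w in words)
    syllables = sum(count_syllables(w) for w in words)
    return letters, len(words), len(sentences), syllables


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    _, words, sentences, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def coleman_liau_index(text: str) -> float:
    """Coleman-Liau index: 0.0588*L - 0.296*S - 15.8, with L and S per 100 words."""
    letters, words, sentences, _ = text_stats(text)
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


if __name__ == "__main__":
    # Hypothetical sample responses, invented for this sketch.
    simple = "The cat sat. The dog ran."
    dense = ("Retrieval augmented generation conditions language models "
             "on externally retrieved documents.")
    for label, sample in (("simple", simple), ("dense", dense)):
        print(label, round(flesch_kincaid_grade(sample), 2),
              round(coleman_liau_index(sample), 2))
```

Both scores approximate a school grade level, so a higher value indicates text that is harder to read. Coleman-Liau depends only on letter and sentence counts, which makes it insensitive to the syllable heuristic, whereas the Flesch-Kincaid result shifts with however syllables are counted.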


Keywords

large language model, chatbot, retrieval augmented generation, Langchain, vector database, semantic search, GPT-4, Retrieval Augmented Generation Assessment, readability metrics, Flesch-Kincaid, Coleman-Liau
