AstroLLaMA: Towards Specialized Foundation Models in Astronomy

Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz Różański, Pranav Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodríguez Méndez, Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney, Kevin Schawinski, UniverseTBD

Proceedings of the Second Workshop on Information Extraction from Scientific Publications (2023)

Abstract

Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
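
For readers unfamiliar with the metric, perplexity under causal language modeling is conventionally the exponential of the mean negative log-likelihood per token, and a "30% lower perplexity" is a relative reduction against the baseline. The sketch below illustrates that arithmetic only. The function names and the per-token log-probabilities are hypothetical and are not taken from the paper, whose exact evaluation procedure is not described in this abstract.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline_ppl, new_ppl):
    """Fractional drop in perplexity relative to the baseline model."""
    return (baseline_ppl - new_ppl) / baseline_ppl


# Hypothetical per-token log-probabilities for the same text under two models.
# These values are made up for illustration; they are not results from the paper.
baseline_log_probs = [-2.1, -1.8, -2.4, -2.0]
tuned_log_probs = [-1.7, -1.5, -2.0, -1.6]

baseline_ppl = perplexity(baseline_log_probs)
tuned_ppl = perplexity(tuned_log_probs)
print(f"baseline perplexity: {baseline_ppl:.2f}")
print(f"fine-tuned perplexity: {tuned_ppl:.2f}")
print(f"relative reduction: {relative_reduction(baseline_ppl, tuned_ppl):.1%}")
```

In practice the log-probabilities would come from scoring held-out text with each model, and both models must be scored on the same tokens for the comparison to be meaningful.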
