Pretraining and Updating Language- and Domain-specific Large Language Model: A Case Study in Japanese Business Domain
arXiv (2024)
Abstract
Several previous studies have considered language- and domain-specific large
language models (LLMs) as separate topics. This study explores the combination
of a non-English language and a high-demand industry domain, focusing on a
Japanese business-specific LLM. This type of model requires expertise in the
business domain, strong language skills, and regular updates of its knowledge.
We trained a 13-billion-parameter LLM from scratch using a new dataset of
business texts and patents, and continually pretrained it with the latest
business documents. Furthermore, we propose a new benchmark for Japanese business
domain question answering (QA) and evaluate our models on it. The results show
that our pretrained model improves QA accuracy without losing general
knowledge, and that continual pretraining enhances adaptation to new
information. Our pretrained model and business domain benchmark are publicly
available.
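The abstract describes a two-stage recipe: pretraining from scratch on business texts and patents, then continual pretraining on the latest business documents. Below is a minimal sketch of the continual-pretraining step, assuming a HuggingFace Transformers stack; the checkpoint id, data path, and hyperparameters are hypothetical illustrations, not details taken from the paper.

```python
# Minimal continual-pretraining sketch. Assumptions (not from the paper):
# a HuggingFace-style causal LM checkpoint, plain-text domain data, and
# illustrative hyperparameters.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "your-org/business-llm-13b"  # hypothetical checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Latest business documents as plain text, one document per line (path is illustrative).
dataset = load_dataset("text", data_files={"train": "latest_business_docs.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

train_set = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# mlm=False selects the causal (next-token) LM objective used for pretraining.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="continual-pretrain",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=1e-5,  # small LR helps retain general knowledge while adapting
    num_train_epochs=1,
    bf16=True,
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_set,
    data_collator=collator,
).train()
```

A reduced learning rate and a single pass over the new documents are common choices in continual pretraining, consistent with the abstract's claim that the updated model adapts to new information without losing general knowledge.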