Optimizing Large Language Models on Multi-Core CPUs: A Case Study of the BERT Model

APPLIED SCIENCES-BASEL(2024)

引用 0|浏览4
暂无评分
摘要
The BERT model is regarded as the cornerstone of various pre-trained large language models that have achieved promising results in recent years. This article investigates how to optimize the BERT model in terms of fine-tuning speed and prediction accuracy, aiming to accelerate the execution of the BERT model on a multi-core processor and improve its prediction accuracy in typical downstream natural language processing tasks. Our contributions are two-fold. First, we port and parallelize the fine-tuning training of the BERT model on a multi-core shared-memory processor. We port the BERT model onto a multi-core processor platform to accelerate the fine-tuning training process of the model for downstream tasks. Second, we improve the prediction performance of typical downstream natural language processing tasks through fine-tuning the model parameters. We select five typical downstream natural language processing tasks (CoLA, SST-2, MRPC, RTE, and WNLI) and perform optimization on the multi-core platform, taking the hyperparameters of batch size, learning rate, and training epochs into account. Our experimental results show that, by increasing the number of CPUs and the number of threads, the model training time can be significantly reduced. We observe that the reduced time is primarily concentrated in the self-attention mechanism. Our further experimental results show that setting reasonable hyperparameters can improve the accuracy of the BERT model when applied to downstream tasks and that appropriately increasing the batch size under conditions of sufficient computing resources can significantly reduce training time.
更多
查看译文
关键词
BERT model,model fine-tuning,performance optimization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要