GPT-4 Surpassing Human Performance in Linguistic Pragmatics
CoRR(2023)
摘要
As Large Language Models (LLMs) become increasingly integrated into everyday
life, their capabilities to understand and emulate human cognition are under
steady examination. This study investigates the ability of LLMs to comprehend
and interpret linguistic pragmatics, an aspect of communication that considers
context and implied meanings. Using Grice's communication principles, LLMs and
human subjects (N=76) were evaluated based on their responses to various
dialogue-based tasks. The findings revealed the superior performance and speed
of LLMs, particularly GPT4, over human subjects in interpreting pragmatics.
GPT4 also demonstrated accuracy in the pre-testing of human-written samples,
indicating its potential in text analysis. In a comparative analysis of LLMs
using human individual and average scores, the models exhibited significant
chronological improvement. The models were ranked from lowest to highest score,
with GPT2 positioned at 78th place, GPT3 ranking at 23rd, Bard at 10th, GPT3.5
placing 5th, Best Human scoring 2nd, and GPT4 achieving the top spot. The
findings highlight the remarkable progress made in the development and
performance of these LLMs. Future studies should consider diverse subjects,
multiple languages, and other cognitive aspects to fully comprehend the
capabilities of LLMs. This research holds significant implications for the
development and application of AI-based models in communication-centered
sectors.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要