Do Large Code Models Understand Programming Concepts? A Black-box Approach
CoRR (2024)
Abstract
Large Language Models' success at text generation has also made them better
at code generation and coding tasks. While much work has demonstrated their
remarkable performance on tasks such as code completion and editing, it is
still unclear why. We help bridge this gap by exploring to what degree
auto-regressive models understand the logical constructs of the underlying
programs. We propose Counterfactual Analysis for Programming Concept Predicates
(CACP) as a counterfactual testing framework to evaluate whether Large Code
Models understand programming concepts. With only black-box access to the
model, we use CACP to evaluate ten popular Large Code Models on four different
programming concepts. Our findings suggest that current models lack
understanding of concepts such as data flow and control flow.
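The counterfactual idea can be illustrated with a minimal sketch (a hypothetical example in the spirit of CACP, not the authors' implementation): pair a program with a variant that alters a concept such as data flow, and check whether the model's output prediction tracks the resulting change in ground-truth behavior.

```python
# Hypothetical counterfactual probe for data-flow understanding.
# We build a program and a counterfactual variant whose data flow differs;
# a model that understands data flow should change its prediction accordingly.

original = """
def f(x):
    a = x + 1
    b = a * 2   # b depends on a
    return b
"""

counterfactual = """
def f(x):
    a = x + 1
    b = x * 2   # data-flow edit: b no longer depends on a
    return b
"""

def run(src, arg):
    # Execute the snippet and return f(arg); stands in for ground truth.
    env = {}
    exec(src, env)
    return env["f"](arg)

# Ground truth: the counterfactual edit changes the program's behavior,
# so the model's predicted output for f(3) should change as well.
print(run(original, 3), run(counterfactual, 3))  # → 8 6
```

With only black-box access, such probes compare the model's predicted outputs on the two variants against the ground-truth behaviors; a model insensitive to the edit is treated as not tracking the underlying concept.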