Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

ICLR 2023(2022)

引用 373|浏览221
Although chain-of-thought prompting has shown impressive results on many natural language reasoning tasks, it often performs poorly on tasks which need to solve problems harder than the demonstration examples. To tackle such easy-to-hard generalization issues, we propose a novel prompting strategy, least-to-most prompting. It reduces a complex problem into a list of subproblems, and then sequentially solve these subproblems, whereby solving a given subproblem is facilitated by the answers to previously solved subproblems. Experiments on symbolic manipulation, compositional generalization and math reasoning show that least-to-most prompting can generalize to the examples that are harder than those seen in the prompt, and outperform chain-of-thought prompting by a large margin. A notable result is that the GPT-3 code-davinci-002 model with least-to-most-prompting solves the SCAN benchmark regardless of splits (such as length split) with an accuracy of 99.7% using 14 examples versus an accuracy of 16.2% by chain-of-thought prompting, and neural-symbolic models in the literature specialized for solving SCAN are trained with the full training set of more than 15,000 examples.
large language models,natural language processing,prompting,reasoning,compositional generalization
AI 理解论文