Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu · August 29, 2024

Summary

Language models, though proficient at reasoning, still make errors. This study pretrains models on error-correction data: solution traces in which an incorrect step is immediately followed by its correction. Models pretrained this way achieve higher reasoning accuracy on synthetic math benchmarks than models pretrained on error-free data alone. The paper examines how this setup differs from beam search, how the error-correction data is prepared, and whether the erroneous steps must be masked during training. Two key challenges motivate the investigation: the model might imitate the mistakes rather than the corrections, and the benefit over error-free training is not obvious a priori. Using the iGSM dataset, where errors and corrections can be generated reliably, the authors test whether such pretraining teaches models to correct their own mistakes and thereby reason more accurately.
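To make the data format concrete, here is a minimal sketch of how one might construct such error-correction training sequences. This is an illustrative assumption, not the paper's exact pipeline: the `[BACK]` retraction marker, the string-based step format, and the injection probability `p` are all hypothetical choices for this sketch.

```python
import random

def inject_retries(correct_steps, wrong_step_pool, p=0.2, marker="[BACK]", rng=None):
    """Build a training sequence in which, with probability p before each
    correct step, an incorrect step is inserted and immediately retracted.

    correct_steps: ordered list of correct solution-step strings.
    wrong_step_pool: candidate incorrect steps to sample from.
    """
    rng = rng or random.Random(0)
    out = []
    for step in correct_steps:
        if wrong_step_pool and rng.random() < p:
            out.append(rng.choice(wrong_step_pool))  # the mistake
            out.append(marker)                       # immediate retraction
        out.append(step)                             # the correct step
    return out

# With p=1.0 every correct step is preceded by a mistake and a retraction.
seq = inject_retries(["x = 3 + 4", "y = x * 2"], ["y = 3 * 2"], p=1.0)
```

During pretraining, the model sees the mistake, the retraction marker, and the correction as ordinary next-token targets; whether the erroneous tokens should instead be loss-masked is one of the questions the paper studies.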
