Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector
Gangwei Jiang, Zhaoyi Li, Caigao Jiang, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, Ying Wei · June 18, 2024
Summary
This paper investigates catastrophic forgetting in large language models (LLMs) during fine-tuning, focusing on instruction following and knowledge understanding. Key findings include:
1. The study decomposes model behavior into knowledge and instruction-following probabilities, revealing that forgetting after fine-tuning mainly manifests as a failure to follow instructions rather than a loss of the underlying general knowledge (see the first sketch after this list).
2. The Instruction Vector (IV) framework is introduced to analyze the model's internal changes, showing that fine-tuning primarily adds specialized reasoning patterns on top of existing computations rather than erasing previous skills; suppression of the original instruction-related computation can nonetheless surface as forgetting (see the second sketch after this list).
3. IV-guided training is developed to preserve the model's original computation graph during fine-tuning, mitigating forgetting by aligning new learning with the model's pre-existing instruction-related capabilities (see the third sketch after this list).
4. Experiments on TRACE, FUNC, and other benchmarks demonstrate the effectiveness of IV-guided methods in reducing forgetting and maintaining performance on various tasks.
5. The role of IVs in tasks such as CommonsenseQA, Last-Spanish, and zero-shot settings is explored, highlighting their impact on model performance and the need to balance existing and new task-specific computations.
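
To make finding 1 concrete, here is a minimal sketch of how one might probe instruction-following and knowledge probabilities separately with a Hugging Face causal LM. The helper `sequence_logprob`, the example prompts, and the placeholder model name are illustrative assumptions, not the paper's exact measurement protocol.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sequence_logprob(model, tok, prompt, target):
    """Log-probability of `target` as a continuation of `prompt`.
    Uses the prompt's token count as the offset, a common approximation."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + target, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position t predicts token t+1, so shift logits/targets by one.
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_lp = logprobs.gather(-1, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_lp[:, prompt_len - 1 :].sum().item()

name = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

question = "Q: What is the capital of France?\nOptions: (A) Berlin (B) Paris\n"
# Instruction-following probe: can the model still emit a well-formatted option?
p_instruct = sequence_logprob(model, tok, question + "Answer with the option letter:", " (B)")
# Knowledge probe: does the model still prefer the correct content without the format constraint?
p_know = sequence_logprob(model, tok, "The capital of France is", " Paris")
```

Comparing how `p_instruct` and `p_know` change before and after fine-tuning separates format compliance from factual retention, which is the distinction finding 1 draws.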
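For finding 2, below is a hedged sketch of one plausible way to extract an Instruction Vector: average the hidden-state activation at the final prompt token, at a chosen layer, over examples that share the same instruction. The layer index and the use of `output_hidden_states` are illustrative assumptions; the paper's exact extraction procedure may differ.

```python
import torch

def extract_instruction_vector(model, tok, instructed_prompts, layer_idx=15):
    """Mean hidden state at the final prompt token of one layer,
    averaged over prompts that carry the same instruction."""
    acts = []
    for prompt in instructed_prompts:
        ids = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # hidden_states[layer_idx] has shape [batch, seq, d_model]; take the last token.
        acts.append(out.hidden_states[layer_idx][0, -1])
    return torch.stack(acts).mean(dim=0)  # candidate instruction vector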
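Finding 3 suggests a regularized fine-tuning objective. The sketch below assumes the regularizer penalizes drift of the current activation at the instruction position away from a frozen, pre-fine-tuning instruction vector `iv_ref`; the cosine penalty and the coefficient `lam` are illustrative choices, not necessarily the paper's formulation.

```python
import torch.nn.functional as F

def iv_guided_loss(model, batch, iv_ref, layer_idx=15, lam=0.1):
    """Task loss plus a penalty keeping the last-token activation close
    to the frozen pre-fine-tuning instruction vector `iv_ref`."""
    out = model(**batch, labels=batch["input_ids"], output_hidden_states=True)
    h = out.hidden_states[layer_idx][:, -1]  # [batch, d_model]
    drift = 1.0 - F.cosine_similarity(h, iv_ref.unsqueeze(0), dim=-1).mean()
    return out.loss + lam * drift
```

Keeping this drift term small is one way to "preserve the computation graph" in practice: the new task is learned while the activation pattern that triggers the original instruction-following behavior stays intact.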
In conclusion, the research provides insights into the mechanisms of catastrophic forgetting in large language models and proposes novel approaches to address it, emphasizing the importance of preserving instruction understanding and the computation graph during fine-tuning.