Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector

Gangwei Jiang, Zhaoyi Li, Caigao Jiang, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, Ying Wei·June 18, 2024

Summary

This paper investigates catastrophic forgetting in large language models (LLMs) during fine-tuning, focusing on instruction following and knowledge understanding. Key findings include:

  1. The study distinguishes between knowledge and instruction probabilities, revealing that forgetting mainly occurs in the ability to follow instructions rather than in the retention of general knowledge.

  2. The Instruction Vector (IV) framework is introduced to analyze the model's internal changes, showing that fine-tuning primarily adds specialized reasoning on top of previous skills rather than erasing them, which can nonetheless surface as forgetting.

  3. IV-guided training is developed to preserve the original computation graph, mitigating forgetting by aligning new learning with the model's original capabilities.

  4. Experiments on TRACE, FUNC, and other benchmarks demonstrate the effectiveness of IV-guided methods in reducing forgetting while maintaining performance on a variety of tasks.

  5. The role of IVs in tasks such as CommonsenseQA, Last-Spanish, and zero-shot settings is explored, highlighting their impact on model performance and the need to balance existing and new task-specific computations.

In conclusion, the research provides insight into the mechanisms of catastrophic forgetting in LLMs and proposes novel approaches to address it, emphasizing the importance of preserving instruction understanding and the computation graph during fine-tuning.
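The distinction between knowledge and instruction probabilities can be illustrated with a toy computation. The sketch below is purely illustrative (the logit values and prompts are invented, not taken from the paper): the same answer token can be highly probable under a direct knowledge probe yet improbable when the prompt additionally demands instruction-following, which is the signature of instruction-level rather than knowledge-level forgetting.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical logits for the answer token "Paris" under two prompts.
# The "knowledge" prompt probes the fact directly; the "instruction"
# prompt additionally requires following a specific output format.
logits_knowledge = {"Paris": 5.0, "London": 1.0, "Rome": 0.5}
logits_instruction = {"Paris": 1.2, "London": 1.0, "Rome": 0.9}

p_know = softmax(logits_knowledge)["Paris"]
p_instr = softmax(logits_instruction)["Paris"]

# The model retains the fact (high p_know) while failing to follow
# the instruction (low p_instr) -- apparent forgetting without
# knowledge loss.
print(f"knowledge prob:   {p_know:.3f}")
print(f"instruction prob: {p_instr:.3f}")
```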


Paper digest

What problem does the paper attempt to solve? Is this a new problem?

The paper addresses catastrophic forgetting in Large Language Models (LLMs) during fine-tuning by introducing the concept of the Instruction Vector (IV) to analyze task-processing capabilities and mitigate forgetting. The problem itself is not new: previous research has explored catastrophic forgetting in neural networks more broadly. The paper focuses on understanding how LLMs acquire new capabilities during fine-tuning and on the mechanisms underlying forgetting, and it proposes IV-guided training as a method to reduce forgetting while maintaining model performance.


What scientific hypothesis does this paper seek to validate?

This paper seeks to validate the Instruction Vector hypothesis as an explanation of catastrophic forgetting in large language models (LLMs). The study introduces the Instruction Vector (IV) to analyze LLMs' task-processing capabilities and investigates how forgetting arises from new reasoning patterns being overlaid on pre-existing skills. The research shows that adding the IV back into the computation graph can recover performance, and that forgetting is reduced by maintaining harmony between the model's computation graph and the IV-associated one.


What new ideas, methods, or models does the paper propose? What are the characteristics and advantages compared to previous methods?

The paper introduces the Instruction Vector (IV) as a tool for addressing catastrophic forgetting in Large Language Models (LLMs) during fine-tuning. The IV enables a detailed analysis of LLMs' task-processing capabilities by examining its dynamics before and after training. The study reveals that forgetting occurs when new reasoning patterns are overlaid on pre-existing skills, and that performance can be restored by incorporating the IV into the computation graph. The paper further proposes IV-guided training, a fine-tuning approach that reduces forgetting by maintaining harmony between the model's computation graph and the IV-associated one.
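The intervention of incorporating the IV into the computation graph can be sketched, at its simplest, as adding a scaled vector to a layer's hidden-state activation. The snippet below is a minimal illustration of that mechanism; the vector values, the scaling coefficient, and the choice of layer are all invented for the example and are not the paper's actual parameters.

```python
def add_instruction_vector(hidden_state, iv, alpha=1.0):
    """Add a scaled instruction vector to a hidden-state activation.

    hidden_state: list[float], a residual-stream activation at one layer.
    iv:           list[float], the extracted instruction vector.
    alpha:        scaling coefficient controlling intervention strength.
    """
    assert len(hidden_state) == len(iv)
    return [h + alpha * v for h, v in zip(hidden_state, iv)]

# Toy 4-dimensional activation and instruction vector (illustrative only).
hidden = [0.2, -1.0, 0.5, 0.0]
iv = [1.0, 0.0, -0.5, 2.0]

patched = add_instruction_vector(hidden, iv, alpha=0.5)
print(patched)  # [0.7, -1.0, 0.25, 1.0]
```

In a real model this addition would typically be applied via a forward hook on the chosen transformer layer rather than on a bare list of floats.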

The research emphasizes the importance of understanding the internal mechanisms that cause forgetting in LLMs during fine-tuning. By analyzing the IV dynamics and aligning them with the model's computation graph, the proposed method aims to minimize the impact of newly introduced knowledge on past abilities, ensuring a robust computation graph after fine-tuning.

Furthermore, the paper evaluates the proposed IV-guided fine-tuning approach on benchmarks including TRACE, FUNC, and a LONG-sequence continual learning benchmark. The experiments demonstrate the method's efficacy in combination with existing continual learning methods such as Incremental LoRA (IncLora), Learning without Forgetting (Lwf), Elastic Weight Consolidation (Ewc), and Orthogonal LoRA (OLora). The study measures the shift in general capabilities, forgetting of reasoning abilities, and the degree of catastrophic forgetting on newly learned abilities to assess the proposed algorithms. Compared to previous methods for addressing catastrophic forgetting during fine-tuning, the IV method offers several key characteristics and advantages:

  1. Detailed Analysis of Task Processing Capabilities: Examining the dynamics of the IV before and after training clarifies how forgetting occurs through the overlay of new reasoning patterns on pre-existing skills.

  2. Preservation of the Computation Graph: IV-guided fine-tuning keeps the model's computation graph robust after fine-tuning, minimizing the impact of newly introduced knowledge on past abilities. By aligning the model's computation graph with the IV-associated one, the method effectively reduces forgetting.

  3. Effectiveness in Mitigating Forgetting: IV-guided training reduces the forgetting rate on general capabilities and enhances in-context performance, balancing the preservation of existing capabilities with the learning of new knowledge.

  4. Maintained Adaptability: IV-guided training does not compromise plasticity in learning new tasks, showing only a slight reduction in adaptability metrics compared to methods such as Learning without Forgetting (Lwf).

  5. Robustness Across Task Complexities: IV-guided training handles tasks of varying complexity across different benchmarks, from simpler to more demanding tasks, adapting and learning efficiently across different challenges.

Overall, the IV method stands out for its ability to provide a detailed analysis of task processing capabilities, preserve the computation graph, mitigate forgetting, maintain adaptability, and demonstrate robustness across different task complexities, offering a promising approach to address catastrophic forgetting in LLMs during fine-tuning processes.
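One plausible reading of "maintaining harmony between the model's computation graph and the IV-associated one" is a regularized training objective that penalizes the IV drifting away from its pre-fine-tuning direction. The sketch below shows the general shape such an objective could take; it is a guess under that assumption, not the paper's actual loss function, and the penalty weight `lam` is invented.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def iv_guided_loss(task_loss, iv_before, iv_after, lam=0.1):
    """Task loss plus a penalty (1 - cosine similarity) for the
    instruction vector drifting from its pre-fine-tuning direction."""
    drift = 1.0 - cosine_similarity(iv_before, iv_after)
    return task_loss + lam * drift

# An unchanged IV incurs no penalty; an orthogonal IV is penalized.
print(iv_guided_loss(0.5, [1.0, 0.0], [1.0, 0.0]))  # 0.5
print(iv_guided_loss(0.5, [1.0, 0.0], [0.0, 1.0]))  # 0.6
```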


Does any related research exist? Who are the noteworthy researchers on this topic in this field? What is the key to the solution mentioned in the paper?

Several related research studies exist in the field of catastrophic forgetting in large language models. Noteworthy researchers in this area include Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, and Hung-yi Lee, as well as Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, and others. These researchers have contributed to understanding catastrophic forgetting in language models and to proposing solutions that mitigate it.

The key to the solution is the Instruction Vector (IV) itself, which enables a detailed analysis of LLMs' task-processing capabilities. By analyzing the IV's dynamics before and after training, the study shows that forgetting in LLMs is caused by new reasoning patterns overlaying pre-existing skills, and that performance can be recovered by adding the IV back into the computation graph. IV-guided training then reduces forgetting by maintaining harmony between the model's computation graph and the IV-associated one.


How were the experiments in the paper designed?

The experiments test the proposed method on the TRACE and FUNC benchmarks, along with a LONG-sequence continual learning benchmark comprising 15 tasks. The evaluation set draws on datasets such as HellaSwag, ARC-Challenge, CommonsenseQA, and MMLU-social. The experiments were conducted on the Llama2-7B-chat model, demonstrating effectiveness in combination with existing continual learning methods (IncLora, Lwf, Ewc, and OLora). The model was trained with hyperparameters taken from prior work, with the base LM loaded in torch.bfloat16 to save memory, and the experiments ran on 4 NVIDIA A100 GPUs. The proposed algorithms were evaluated on average zero-shot held-out performance, average in-context held-out performance, and overall training performance, measuring the shift in general capabilities, forgetting of reasoning abilities, and the degree of catastrophic forgetting on newly learned abilities.
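The "degree of catastrophic forgetting" in continual-learning evaluations like this is commonly computed from a task-by-task accuracy matrix as the average drop from each task's best earlier accuracy to its final accuracy. The helper below sketches that standard metric on invented toy numbers, under the assumption that the paper uses a comparable definition.

```python
def forgetting(acc):
    """Average forgetting over previously learned tasks.

    acc[i][j] = accuracy on task j measured after training on task i.
    For each task j before the last, forgetting is (best accuracy on j
    at any earlier stage) minus (final accuracy on j), averaged over
    those tasks.
    """
    last = len(acc) - 1
    drops = []
    for j in range(last):  # exclude the most recently learned task
        best_earlier = max(acc[i][j] for i in range(last))
        drops.append(best_earlier - acc[last][j])
    return sum(drops) / len(drops)

# Toy 3-task run: accuracy on task 0 falls from 0.9 to 0.6,
# and on task 1 from 0.8 to 0.7, giving an average drop of 0.2.
acc = [
    [0.9, 0.0, 0.0],
    [0.7, 0.8, 0.0],
    [0.6, 0.7, 0.85],
]
print(forgetting(acc))
```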


What is the dataset used for quantitative evaluation? Is the code open source?

Quantitative evaluation uses a range of datasets, including ScienceQA, FOMC, MeetingBank, C-STANCE, Py150, NumGLUE-cm, Yelp, SST2, Amazon, IMDB, DBpedia, Yahoo, AG News, WiC, QQP, RTE, MNLI, CB, COPA, BoolQ, and MultiRC. The evaluation code is open source and can be accessed through the OpenCompass framework.


Do the experiments and results in the paper provide good support for the scientific hypotheses that need to be verified? Please analyze.

The experiments and results provide strong support for the hypotheses under test. The study conducted a series of intervention experiments to assess the effectiveness of the Instruction Vector (IV) in fine-tuned large language models, demonstrating that the IV significantly influences output behavior on specific tasks and yields notable improvements in zero-shot performance. The study also evaluated the Instruction Vector Guided (IVG) training method across different benchmarks, showing performance improvements in a variety of scenarios; the detailed analysis of model configurations and of the IV's impact on accuracy provides valuable insight into the approach's effectiveness.

Furthermore, the research probes task-specific knowledge and instruction following through Knowledge Probability assessments, which are crucial for verifying the hypotheses. By conducting causal mediation analysis and assessing the causal effect of specific attention heads on model output, the study verifies the hypothesized impact of task-specific activations on model performance. The detailed procedure, together with results obtained under different model conditions, provides a robust foundation for validating the hypotheses.
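Causal mediation analysis of the kind described here is often implemented as activation patching: run the model on a corrupted input, patch in one component's activation from the clean run, and measure how much of the clean output is restored. The toy "model" below (a bare sum of per-head contributions) is a stand-in to show the bookkeeping only; it is not the paper's procedure, and the numbers are invented.

```python
def toy_model(head_outputs):
    """Stand-in model: the output logit is the sum of per-head outputs."""
    return sum(head_outputs)

def causal_effect(clean_heads, corrupt_heads, head_idx):
    """Indirect effect of one attention head, measured by patching its
    clean-run activation into the corrupted run and comparing outputs."""
    baseline = toy_model(corrupt_heads)
    patched_heads = list(corrupt_heads)
    patched_heads[head_idx] = clean_heads[head_idx]
    return toy_model(patched_heads) - baseline

clean = [2.0, 0.5, 1.0]    # per-head contributions on the clean prompt
corrupt = [0.0, 0.5, 1.0]  # head 0's contribution destroyed by corruption

# Patching head 0 restores its full contribution; head 1 carries none.
print(causal_effect(clean, corrupt, 0))  # 2.0
print(causal_effect(clean, corrupt, 1))  # 0.0
```

In a real transformer the same bookkeeping is done with forward hooks that cache and overwrite per-head activations, and the effect is read off the logit of the correct answer token.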

In conclusion, the thorough evaluation of the Instruction Vector, the performance improvements observed with IVG training, and the analysis of task-specific knowledge and instruction following via Knowledge Probability assessments collectively provide strong validation of the paper's hypotheses.


What are the contributions of this paper?

The contributions of the paper "Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector" include:

  • Introducing the Instruction Vector (IV) for detailed analysis of the task-processing capabilities of Large Language Models (LLMs).
  • Analyzing IV dynamics before and after training to show that forgetting in LLMs is caused by new reasoning patterns overlaying pre-existing skills, and that performance can be recovered by adding the IV back into the computation graph.
  • Proposing IV-guided training, a fine-tuning method that reduces forgetting by maintaining harmony between the model's computation graph and the IV-associated one, providing valuable insight into the internal mechanisms behind forgetting in LLMs.

What work can be continued in depth?

Further in-depth work can analyze the internal mechanisms of fine-tuning in large language models (LLMs) to understand how they acquire new capabilities during learning. Such analysis could focus on the minimal transformation applied on top of the original capability, on subtractable and reusable parameter-shift vectors, and on aligning input queries with internal knowledge acquired during pre-training. Exploring why fine-tuning causes forgetting, and developing methods that prevent it by maintaining harmony between the model's computation graph and the Instruction Vector (IV), are also valuable directions for continued research.


Outline
Introduction
Background
Overview of catastrophic forgetting in LLMs during fine-tuning
Importance of retaining knowledge and instruction understanding
Objective
To understand the distinction between knowledge and instruction probabilities
To introduce the Instruction Vector (IV) framework
To propose IV-guided training to mitigate forgetting
Method
Data Collection
Selection of large language models for study
Fine-tuning datasets: Instruction Following, Knowledge Understanding
Data Preprocessing
Analysis of model performance before and after fine-tuning
Extraction of Instruction Vectors (IVs)
Instruction Vector (IV) Framework
IV Analysis
Identifying changes in the model's internal structure
Differentiating between knowledge retention and instruction forgetting
IV-guided Training
Development of a method to preserve original computation graph
Aligning new learning with existing skills
Experimental Setup
Benchmarks: TRACE, FUNC, CommonsenseQA, Last-Spanish
Evaluation of IV-guided methods on task performance and forgetting reduction
Results and Findings
Knowledge vs Instruction Probabilities
Quantitative analysis of forgetting trends
Evidence of specialized reasoning addition rather than complete skill erasure
IV-guided Training Effectiveness
Improved performance on instruction following tasks
Reduced forgetting across various benchmarks
Impact on Other Tasks
Zero-shot learning and task-specific computations
Balancing existing and new task-specific knowledge
Conclusion
Mechanisms of catastrophic forgetting in LLMs during fine-tuning
The significance of preserving instruction understanding
Future directions and implications for fine-tuning strategies
