Token-based Decision Criteria Are Suboptimal in In-context Learning
Hakaze Cho, Yoshihiro Sakai, Mariko Kato, Kenshiro Tanaka, Akira Ishii, Naoya Inoue · June 24, 2024
Summary
This paper investigates the limitations of token-based classification criteria in In-Context Learning (ICL) with language models, particularly the biases and under-calibration they introduce. The authors propose Hidden Calibration, which replaces token probabilities with nearest centroid classification on hidden states, improving performance by about 20% across 10 datasets. Hidden Calibration exploits the linear separability of the language model's hidden representations and outperforms token-based methods at minimal additional computational cost. The study shows that hidden-state criteria yield sharper decision boundaries with lower inter-category overlap, that the method benefits from in-context demonstrations, which further enhance linear separability, and that its calibration transfers across tasks sharing the same label space. The authors also suggest future work on automatic label-token selection and on combining Hidden Calibration with other calibration techniques.
Introduction
Background
[1] Limitations of token-based classification criteria in ICL (see the baseline sketch after this list)
[2] Biases and under-calibration in token-probability decision rules
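For context, the decision rule under critique can be sketched in a few lines: one hand-picked verbalizer token per category is scored by the model's next-token distribution, and the highest-scoring token wins. The model name, prompt, and label tokens below are illustrative placeholders, not the paper's exact setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Assumed verbalizers; real choices are task- and tokenizer-dependent.
LABEL_TOKENS = {"positive": " positive", "negative": " negative"}

def token_based_predict(prompt: str) -> str:
    """Pick the category whose verbalizer token gets the highest next-token logit."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_logits = model(**inputs).logits[0, -1]  # logits over the vocabulary
    scores = {
        label: next_logits[tokenizer.encode(tok)[0]].item()  # first sub-token only
        for label, tok in LABEL_TOKENS.items()
    }
    return max(scores, key=scores.get)

demo = "Review: great movie. Sentiment: positive\nReview: awful plot. Sentiment:"
print(token_based_predict(demo))
```

This rule depends entirely on the chosen label tokens, which is the source of the biases the paper analyzes.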
Objective
[3] To propose Hidden Calibration as a solution
[4] Demonstrate an approximately 20% performance improvement across 10 datasets
Method
Data Collection
[5] Selection of 10 diverse datasets for evaluation
Data Preprocessing
[6] Analysis of token probabilities and hidden states
[7] Identifying linear separability in hidden representations (probed in the sketch after this list)
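A hedged sketch of one way to probe the linear-separability claim: collect the last-position hidden states of labeled prompts and fit a purely linear classifier on them; high probe accuracy suggests linearly separable representations. The model, prompts, and labels below are toy placeholders, not the paper's evaluation setup.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def last_hidden_state(prompt: str) -> np.ndarray:
    """Last-layer hidden state at the final input position."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[-1][0, -1].numpy()

# Toy labeled prompts; a real probe would use many examples and held-out data.
prompts = [
    "Review: great movie. Sentiment:",
    "Review: awful plot. Sentiment:",
]
labels = [1, 0]

X = np.stack([last_hidden_state(p) for p in prompts])
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print("probe train accuracy:", probe.score(X, labels))
```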
Hidden Calibration Approach
[8] Replacing token probabilities with nearest centroid classification over hidden states (sketched after this list)
[9] Minimal computational cost
[10] Leveraging linear separability
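A minimal sketch of the nearest-centroid rule behind Hidden Calibration, as described in the summary: estimate one centroid per category from the hidden states of a few calibration examples, then classify a query by Euclidean distance to the centroids. The `last_hidden_state` helper from the probe sketch above is assumed; the calibration data names are hypothetical.

```python
import numpy as np

def fit_centroids(hidden_states: np.ndarray, labels: list) -> dict:
    """One centroid per category: the mean hidden state of its calibration examples."""
    labels = np.asarray(labels)
    return {lab: hidden_states[labels == lab].mean(axis=0) for lab in np.unique(labels)}

def nearest_centroid_predict(h: np.ndarray, centroids: dict):
    """Assign the category whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda lab: np.linalg.norm(h - centroids[lab]))

# Usage with the `last_hidden_state` helper sketched above (assumed data):
# H = np.stack([last_hidden_state(p) for p in calibration_prompts])
# centroids = fit_centroids(H, calibration_labels)
# prediction = nearest_centroid_predict(last_hidden_state(test_prompt), centroids)
```

Estimating the centroids is the only cost beyond ordinary ICL forward passes, consistent with the minimal-computational-cost point above.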
Performance Evaluation
[11] Comparison with token-based methods
[12] Sharper classification criteria with lower inter-category overlap (one overlap estimator is sketched after this list)
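One plausible way to quantify inter-category overlap is the overlap coefficient of the score histograms that a decision criterion assigns to examples of two categories; the paper's exact estimator may differ, so treat this as an assumption-laden sketch.

```python
import numpy as np

def overlap_coefficient(scores_a: np.ndarray, scores_b: np.ndarray, bins: int = 50) -> float:
    """Shared area of two normalized score histograms (0 = fully separable, 1 = identical)."""
    lo = min(scores_a.min(), scores_b.min())
    hi = max(scores_a.max(), scores_b.max())
    hist_a, _ = np.histogram(scores_a, bins=bins, range=(lo, hi), density=True)
    hist_b, _ = np.histogram(scores_b, bins=bins, range=(lo, hi), density=True)
    bin_width = (hi - lo) / bins
    return float(np.sum(np.minimum(hist_a, hist_b)) * bin_width)
```

Under this measure, a lower overlap between the score distributions of two categories indicates a sharper decision criterion.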
Transferability and Demonstrations
[13] Demonstrations enhance the linear separability of hidden states
[14] Transferability across tasks sharing the same label space (illustrated in the toy sketch after this list)
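A toy, fully synthetic illustration of the transfer claim: centroids fitted on one task's hidden states are reused unchanged on another task that shares the label space. Everything here is simulated for illustration only; it is not the paper's experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16
means = {0: rng.normal(size=dim), 1: rng.normal(size=dim)}  # shared category structure

def sample(label: int, n: int, shift: float) -> np.ndarray:
    """Simulated hidden states: category mean + task-specific shift + noise."""
    return means[label] + shift + 0.3 * rng.normal(size=(n, dim))

# Fit centroids on the "source" task.
source = {lab: sample(lab, 20, shift=0.0).mean(axis=0) for lab in (0, 1)}

# Evaluate on a "target" task with the same labels but a small distribution shift.
target_x = np.vstack([sample(0, 50, 0.1), sample(1, 50, 0.1)])
target_y = np.array([0] * 50 + [1] * 50)
preds = np.array([
    min(source, key=lambda lab: np.linalg.norm(x - source[lab])) for x in target_x
])
print("transfer accuracy:", (preds == target_y).mean())
```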
Future Research Directions
[15] Automatic label-token selection
[16] Combining Hidden Calibration with other calibration techniques
Results and Discussion
[17] Quantitative performance improvements across the evaluated datasets
[18] Case studies and real-world implications
Conclusion
[19] Summary of findings and contributions
[20] Implications for the field of language model calibration and ICL
References
[21] Cited works and literature review
Basic info
Categories: Computation and Language, Machine Learning, Artificial Intelligence
Insights
What does the paper focus on in the context of In-Context Learning for language models?
What is the main advantage of Hidden Calibration over token-based methods, as mentioned in the study?
How does the use of demonstrations affect the performance of Hidden Calibration, according to the paper?
How much improvement does Hidden Calibration achieve compared to token-based classification?