Mind the Value-Action Gap: Do LLMs Act in Alignment with Their Values?
Hua Shen, Nicholas Clark, Tanushree Mitra · January 26, 2025
Summary
ValueActionLens is an evaluation framework that assesses how well large language models' actions align with their stated values across diverse cultures and social topics. It evaluates models on generating value-informed actions, constructing evaluation tasks, and measuring value-action alignment distances. The results uncover significant gaps between stated values and actions, with substantial variation across values, cultures, and social topics, highlighting the potential harms of misaligned behavior and the risk of relying solely on stated values. The framework further shows that leveraging reasoned explanations improves the prediction of value-action discrepancies.
Introduction
Background
Overview of large language models and their applications
Importance of value alignment in diverse cultural and social contexts
Objective
To introduce ValueActionLens as a comprehensive evaluation framework
Highlight the framework's role in identifying and mitigating misaligned results
Method
Data Collection
Selection of large language models for evaluation
Diverse cultural and social topic datasets
Data Preprocessing
Standardization of input prompts and output responses
Categorization of values and actions for analysis
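The preprocessing step above can be sketched as a single fixed prompt template applied to every scenario, so that model outputs stay comparable across values, cultures, and topics. This is an illustrative sketch only: the `Scenario` fields and template wording are hypothetical, not the paper's actual schema.

```python
# Hypothetical sketch of prompt standardization; field names and
# template wording are illustrative, not the paper's actual format.
from dataclasses import dataclass

@dataclass
class Scenario:
    value: str      # e.g., "benevolence"
    culture: str    # e.g., "Japan"
    topic: str      # e.g., "workplace"
    context: str    # short situation description

def build_prompt(s: Scenario) -> str:
    """Render every scenario with one fixed template so that model
    responses can be categorized and compared consistently."""
    return (
        f"You hold the value of {s.value}.\n"
        f"Cultural context: {s.culture}. Topic: {s.topic}.\n"
        f"Situation: {s.context}\n"
        "Which action do you take? Answer 'A' (value-aligned) or 'B'."
    )

print(build_prompt(Scenario("benevolence", "Japan", "workplace",
                            "A colleague asks for help near a deadline.")))
```

Keeping the template fixed means any variation in responses can be attributed to the value, culture, or topic being varied, not to prompt phrasing.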
Reasoned Explanations
Incorporation of human reasoning in evaluating model outputs
Use of reasoned explanations to predict value-action discrepancies
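One simple way to see how reasoned explanations can signal value-action discrepancies is a keyword heuristic: explanations whose reasoning pivots away from the stated value often contain contrast markers. This is an illustrative heuristic under assumed marker words, not the paper's actual prediction method.

```python
# Illustrative heuristic (not the paper's method): flag explanations
# whose wording signals a conflict between the stated value and the
# chosen action as likely value-action gaps.
CONFLICT_MARKERS = {"however", "but", "although", "instead", "despite"}

def predict_gap(explanation: str) -> bool:
    """Return True if the reasoned explanation suggests the model's
    action may diverge from its stated value."""
    tokens = {t.strip(".,").lower() for t in explanation.split()}
    return len(tokens & CONFLICT_MARKERS) >= 1

print(predict_gap("I value honesty, but revealing this would hurt them."))
```

In practice a learned classifier over the full explanation text would replace this keyword check; the point is that the explanation carries predictive signal the stated value alone does not.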
Performance Metrics
Evaluation of models' ability to generate value-informed actions
Assessment of task generation capabilities
Measurement of alignment distances across values, cultures, and topics
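The alignment-distance metric above can be sketched minimally as the fraction of scenarios where the chosen action contradicts the value-aligned choice. This assumed form is a simplification for illustration, not the paper's exact metric.

```python
# Minimal sketch of an alignment-distance score (assumed form, not
# the paper's exact metric): fraction of scenarios where the action
# taken contradicts the stated value-aligned choice.
def alignment_distance(stated: list[str], acted: list[str]) -> float:
    """0.0 = perfect value-action alignment, 1.0 = total mismatch."""
    assert len(stated) == len(acted) and stated, "need paired, non-empty lists"
    mismatches = sum(s != a for s, a in zip(stated, acted))
    return mismatches / len(stated)

# Model states the value-aligned choice "A" four times but acts "B" once:
print(alignment_distance(["A", "A", "A", "A"], ["A", "B", "A", "A"]))  # 0.25
```

A distance of zero means stated values and actions always agree; larger values quantify the value-action gap for a given model, value, culture, or topic slice.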
Results
Alignment Variations
Analysis of alignment across different values
Examination of cultural and social topic-specific discrepancies
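Analyses like the ones above reduce to grouping per-scenario gap scores by a metadata field (value, culture, or topic) and averaging. A minimal sketch, assuming records shaped as dicts with a `gap` score and metadata keys (an assumed schema, not the paper's data format):

```python
# Illustrative aggregation of value-action gap scores by group;
# the record schema ("value", "culture", "gap") is assumed.
from collections import defaultdict
from statistics import mean

def gap_by_group(records: list[dict], key: str) -> dict:
    """Average value-action gap per group (e.g., per culture or value)."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r["gap"])
    return {g: mean(v) for g, v in groups.items()}

records = [
    {"value": "honesty", "culture": "US", "gap": 0.2},
    {"value": "honesty", "culture": "JP", "gap": 0.4},
    {"value": "care",    "culture": "US", "gap": 0.1},
]
print(gap_by_group(records, "culture"))
print(gap_by_group(records, "value"))
```

Comparing the resulting per-group averages is what surfaces the cross-value and cross-cultural variation the results report.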
Gaps and Harms
Identification of significant gaps in model performance
Discussion of potential harms from misaligned results
Model Performance
Comparison of different large language models
Insights into the effectiveness of ValueActionLens in evaluating models
Conclusion
Implications
Importance of value alignment in AI systems
Recommendations for improving model performance
Future Directions
Ongoing research and development in ValueActionLens
Integration of human oversight in AI decision-making processes
Basic info
Categories: Computation and Language; Human-Computer Interaction; Artificial Intelligence
Insights
What are the significant findings regarding the variations in alignment across values, cultures, and social topics according to ValueActionLens?
What is ValueActionLens and what does it evaluate in large language models?
How does ValueActionLens help in identifying potential harms from misaligned results in language models?
What are the key components of ValueActionLens that contribute to its effectiveness in assessing model performance?