Faithful Chart Summarization with ChaTS-Pi

Syrine Krichene, Francesco Piccinno, Fangyu Liu, Julian Martin Eisenschlos · May 29, 2024

Summary

The paper presents CHATS-CRITIC, a reference-free metric for scoring the faithfulness of chart summaries, and CHATS-PI, a chart-to-summary pipeline built on it. CHATS-CRITIC uses an image-to-text model to recover the underlying table from the chart and a tabular entailment model to check each summary sentence against that table; its scores align with human ratings better than existing reference-based metrics. CHATS-PI integrates CHATS-CRITIC at inference time to repair candidate summaries by removing unsupported sentences and then rank them. Evaluated on two chart-to-summary datasets, CHATS-PI achieves state-of-the-art results. The paper also highlights the limitations of reference-based metrics and the importance of evaluating factual correctness in chart summarization.
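The flow described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the two learned models are replaced by hypothetical callables (`derender` for the image-to-text model, assumed here to return a table, and `entails` for the tabular entailment model), and the 0.5 threshold, the fraction-of-supported-sentences score, and all function names are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical stand-ins for the two learned components (assumptions):
#   Derender: chart image -> table recovered from the chart
#   Entails:  (table, sentence) -> probability that the table supports the sentence
Table = List[List[str]]
Derender = Callable[[object], Table]
Entails = Callable[[Table, str], float]


def split_sentences(summary: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (keeps decimals like 3.5 intact)."""
    parts = re.split(r"(?<=[.!?])\s+", summary.strip())
    return [p for p in parts if p]


@dataclass
class Critique:
    kept: List[str]  # sentences judged supported by the table
    score: float     # fraction of sentences judged supported (assumed scoring rule)


def score_summary(table: Table, summary: str, entails: Entails,
                  threshold: float = 0.5) -> Critique:
    """Critic step: check each sentence against the table and keep the supported ones."""
    sentences = split_sentences(summary)
    kept = [s for s in sentences if entails(table, s) >= threshold]
    score = len(kept) / len(sentences) if sentences else 0.0
    return Critique(kept, score)


def run_pipeline(chart: object, candidates: Sequence[str], derender: Derender,
                 entails: Entails, threshold: float = 0.5) -> str:
    """Pipeline step: repair every candidate, then return the best-scoring repaired summary."""
    if not candidates:
        raise ValueError("at least one candidate summary is required")
    table = derender(chart)  # recover the table once per chart
    critiques = [score_summary(table, c, entails, threshold) for c in candidates]
    best = max(critiques, key=lambda c: c.score)  # ties go to the earliest candidate
    return " ".join(best.kept)


# Toy components so the sketch runs without any model (illustration only).
def toy_derender(chart: object) -> Table:
    return [["Year", "Sales"], ["2020", "10"], ["2021", "15"]]


def toy_entails(table: Table, sentence: str) -> float:
    """Treat a sentence as supported if every number in it appears as a table cell."""
    cells = {cell for row in table for cell in row}
    return 1.0 if all(n in cells for n in re.findall(r"\d+", sentence)) else 0.0


if __name__ == "__main__":
    candidates = [
        "Sales were 10 in 2020. Sales reached 99 in 2021.",
        "Sales were 10 in 2020. Sales rose to 15 in 2021.",
    ]
    print(run_pipeline(None, candidates, toy_derender, toy_entails))
```

With the toy components, the first candidate loses its unsupported second sentence and scores 0.5, so the second candidate, whose sentences are all supported, is returned. A real setup would swap in trained models for `toy_derender` and `toy_entails`.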

Introduction
  • Background
    • Evolution of chart summarization tools
    • Importance of factual accuracy in generated text
  • Objective
    • Development of CHATS-CRITIC and CHATS-PI
    • Aim to improve factual accuracy and outperform existing metrics
Methodology
  • CHATS-CRITIC
    • Image-to-Text Model
      • Architecture and training process
    • Tabular Entailment Model
      • Model design and entailment evaluation
    • Performance Evaluation
      • Human ratings comparison and benchmarking
  • CHATS-PI: Pipeline Approach
    • Integration of CHATS-CRITIC
    • Refinement and ranking process
    • Effectiveness on Summaries
      • Dataset application and results
Results and Evaluation
  • Dataset and Metrics
    • Datasets used for testing (two chart-to-summary datasets)
    • Performance metrics (accuracy, faithfulness)
  • CHATS-PI Performance
    • State-of-the-art results achieved
    • Comparison with previous methods
Limitations and Discussion
  • Reference-based metrics' shortcomings
  • Importance of factual correctness in chart summarization
  • Challenges and future directions
Conclusion
  • Summary of key findings
  • Contributions of CHATS-CRITIC and CHATS-PI
  • Implications for chart summarization research and practice
Basic info
papers
computation and language
artificial intelligence
Insights
What are the two reference-free chart summarization tools discussed in the paper?
What is the role of CHATS-PI in the chart summarization process?
Which dataset(s) were used to assess the effectiveness of CHATS-CRITIC and CHATS-PI?
How does CHATS-CRITIC evaluate the faithfulness of chart summaries?