FairTranslate: An English-French Dataset for Gender Bias Evaluation in Machine Translation by Overcoming Gender Binarity

Fanny Jourdan, Yannick Chevalier, Cécile Favre · April 22, 2025

Summary

The FairTranslate dataset, presented at FAccT '25, evaluates gender bias in English-to-French machine translation, with a particular focus on non-binary gender. Its 2418 sentence pairs reveal biases in four LLMs and underscore the need for inclusive language use. Analysis indicates that more explicit prompting can reduce gender disparities, with the strongest gains for masculine forms, but does not fully achieve equity. Results for Gemma2-2B align with those of the other models.

Introduction
Background
Overview of the FairTranslate dataset
Context of FAccT '25 conference
Objective
To evaluate gender biases in machine translation from English to French, specifically focusing on non-binary issues
Method
Data Collection
Description of the sentence pairs included in the dataset
Selection criteria for sentence pairs emphasizing non-binary issues
Data Preprocessing
Methods used for preparing the dataset for analysis
Evaluation Framework
Criteria for assessing the performance of language models
Comparison of results across models, including Gemma2-2B
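To make the evaluation framework concrete, the sketch below shows one plausible way to score translations per gender category and quantify disparity. This is an illustration only, not the paper's actual evaluation code: the record fields (`gender`, `correct`) and the three category labels are assumptions.

```python
from collections import defaultdict

def gender_accuracy(records):
    """Compute translation accuracy per gender category.

    Each record is assumed to carry a gender label ('male', 'female',
    'inclusive') and a boolean indicating whether the model's French
    translation used the appropriate gendered forms.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for rec in records:
        totals[rec["gender"]] += 1
        hits[rec["gender"]] += bool(rec["correct"])
    return {g: hits[g] / totals[g] for g in totals}

# Toy example with fabricated labels, for illustration only.
records = [
    {"gender": "male", "correct": True},
    {"gender": "male", "correct": True},
    {"gender": "female", "correct": True},
    {"gender": "female", "correct": False},
    {"gender": "inclusive", "correct": False},
]
acc = gender_accuracy(records)
# Disparity gap: difference between best- and worst-served categories.
gap = max(acc.values()) - min(acc.values())
```

A per-category accuracy table plus a single gap statistic makes it easy to compare models (or prompting strategies) side by side: equity would mean the gap approaching zero, not just high average accuracy.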
Results
Bias Identification
Overview of biases found in four LLMs
Impact of Prompting
Analysis of how increased prompting affects gender disparities
Focus on the stronger bias reduction observed for masculine forms
Equity Assessment
Discussion on the extent to which increased prompting achieves equity
Conclusion
Implications
Implications of the findings for the field of machine translation
Future Work
Suggestions for further research to address identified biases
Recommendations
Recommendations for improving fairness in machine translation systems