Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation

Boxuan Lyu, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura · June 17, 2024

Summary

This paper investigates source-based Minimum Bayes Risk (sMBR) decoding for neural machine translation (NMT) as an alternative to the usual reliance on maximum a posteriori (MAP) decoding. sMBR uses synthetic sources generated through backward translation and a reference-free quality estimation metric as its utility function, making it more efficient than standard MBR. Experiments show that sMBR outperforms quality estimation reranking and is competitive with standard MBR, particularly in low-resource scenarios. Because it scores each hypothesis against several sources rather than a single one, it gives a more comprehensive estimate of hypothesis quality. The study highlights sMBR's promise for improving translation quality and efficiency, with sMBR-PP, which employs paraphrasing, showing the most significant gains. Future research should focus on improving efficiency and applying the approach more broadly.
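The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the candidate translations and the synthetic sources have already been generated, and that sMBR picks the hypothesis with the highest average utility over those sources. The names `smbr_select` and `toy_qe` are hypothetical, and `toy_qe` is a monolingual word-overlap score standing in for a real reference-free quality estimation metric, which would be a learned cross-lingual model.

```python
from typing import Callable, Sequence


def smbr_select(
    hypotheses: Sequence[str],
    sources: Sequence[str],
    qe_utility: Callable[[str, str], float],
) -> str:
    """Return the hypothesis with the highest mean utility over all sources."""
    if not hypotheses or not sources:
        raise ValueError("need at least one hypothesis and one source")

    def expected_utility(hyp: str) -> float:
        # Average the reference-free score of this hypothesis over every source.
        return sum(qe_utility(src, hyp) for src in sources) / len(sources)

    return max(hypotheses, key=expected_utility)


def toy_qe(source: str, hypothesis: str) -> float:
    """Hypothetical stand-in for a QE metric: word-level Jaccard overlap.

    Only meaningful for this monolingual toy example; a real QE metric
    compares a source and a translation in different languages.
    """
    s = set(source.lower().split())
    h = set(hypothesis.lower().split())
    union = s | h
    return len(s & h) / len(union) if union else 0.0


# Toy data: three "synthetic sources" and three candidate hypotheses.
sources = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is sitting on the mat",
]
hypotheses = [
    "the cat sat on the mat",
    "the dog ran away",
    "cat mat",
]

best = smbr_select(hypotheses, sources, toy_qe)
print(best)
```

With a single source in `sources`, the same function simply picks the hypothesis that scores best against that one source, which is the reranking setting the summary compares against.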

Key findings
