Leveraging Customer Feedback for Multi-modal Insight Extraction

Sandeep Sricharan Mukku, Abinesh Kanagarajan, Pushpendu Ghosh, Chetan Aggarwal · October 13, 2024

Summary

This paper introduces a novel multi-modal method that combines text and images to extract actionable insights from customer feedback. It proposes a latent space fusion technique and an image-text grounded text decoder, and it introduces a weakly-supervised data generation method for training. Evaluated on unseen data, the model outperforms baselines by 14 F1 points, showing that it effectively mines actionable insights from multi-modal customer feedback.
