Multi-Task Reward Learning from Human Ratings

Mingkang Wu, Devin White, Evelyn Rose, Vernon Lawhern, Nicholas R Waytowich, Yongcan Cao·June 10, 2025

Summary

A novel reinforcement learning method integrates multiple tasks to emulate human decision-making, using human ratings in reward-free environments. It infers a learnable reward function balancing classification and regression models, capturing human decision uncertainty. This approach outperforms existing rating-based RL methods and sometimes surpasses traditional RL, framing reward learning as a multi-task prediction problem. Key advancements include multi-task reward learning, focusing on human preferences and feedback, with techniques for optimization and performance enhancement.

Advanced features