Multi-Task Reward Learning from Human Ratings
Mingkang Wu, Devin White, Evelyn Rose, Vernon Lawhern, Nicholas R Waytowich, Yongcan Cao·June 10, 2025
Summary
A novel reinforcement learning method integrates multiple tasks to emulate human decision-making, using human ratings in reward-free environments. It infers a learnable reward function balancing classification and regression models, capturing human decision uncertainty. This approach outperforms existing rating-based RL methods and sometimes surpasses traditional RL, framing reward learning as a multi-task prediction problem. Key advancements include multi-task reward learning, focusing on human preferences and feedback, with techniques for optimization and performance enhancement.
Advanced features