Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective

Tianyang Duan, Zongyuan Zhang, Zheng Lin, Yue Gao, Ling Xiong, Yong Cui, Hongbin Liang, Xianhao Chen, Heming Cui, Dong Huang · January 7, 2025

Summary

DAPGD (Distribution-Aware Projected Gradient Descent) is an adversarial attack for evaluating the robustness of Deep Reinforcement Learning (DRL) agents. Rather than crafting perturbations that flip individual sampled actions, it measures how far a perturbed observation shifts the agent's entire policy distribution, using the Bhattacharyya distance between the clean and perturbed distributions as the attack objective. In robot navigation tasks, DAPGD outperforms baselines with an average 22.03% higher reward drop, underscoring how vulnerable DRL agents remain in real-world deployments where observation signals are noisy and inaccurate.
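To make the idea concrete, here is a minimal PyTorch sketch of a PGD-style attack that maximizes the Bhattacharyya distance between policy distributions. It assumes a diagonal-Gaussian policy exposed as `policy(obs) -> (mean, std)`; the function names, the closed-form Bhattacharyya distance for diagonal Gaussians, and the hyperparameters (`eps`, `alpha`, `steps`) are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def bhattacharyya_gaussian(mu1, std1, mu2, std2):
    """Bhattacharyya distance between two diagonal Gaussians, summed over action dims."""
    var_bar = 0.5 * (std1 ** 2 + std2 ** 2)
    term_mean = 0.125 * ((mu1 - mu2) ** 2 / var_bar).sum(-1)
    term_var = 0.5 * torch.log(var_bar / (std1 * std2)).sum(-1)
    return term_mean + term_var

def distribution_aware_pgd(policy, obs, eps=0.05, alpha=0.01, steps=10):
    """PGD in the L-inf ball around `obs`, ascending the Bhattacharyya distance
    between the clean and perturbed policy distributions (hypothetical sketch)."""
    # Clean policy parameters serve as a fixed reference; no gradients needed.
    mu_clean, std_clean = (t.detach() for t in policy(obs))

    # Random start inside the eps-ball, as is standard for PGD.
    delta = torch.zeros_like(obs).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        mu_adv, std_adv = policy(obs + delta)
        loss = bhattacharyya_gaussian(mu_clean, std_clean, mu_adv, std_adv).mean()
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # gradient ascent on the divergence
            delta.clamp_(-eps, eps)             # project back into the eps-ball
        delta.grad.zero_()
    # Depending on the environment, the perturbed observation may also need
    # clamping to the valid observation range.
    return (obs + delta).detach()
```

In contrast to attacks that target a single sampled action, the objective here depends on the whole distribution (both mean and standard deviation), so the perturbation can exploit shifts in the policy's uncertainty as well as in its most likely action.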

Advanced features