Tag: math.OC

Residuals-based Offline Reinforcement Learning

arXiv:2604.01378v1. Abstract: Offline reinforcement learning (RL) has received increasing attention for learning policies from previously collected data without interaction with the real …

Qing Zhu, Xian Yu

Delightful Distributed Policy Gradient

arXiv:2603.20521v1. Abstract: Distributed reinforcement learning trains on data from stale, buggy, or mismatched actors, producing actions with high surprisal (negative log-probability) under …

Ian Osband