Malliavin Calculus for Counterfactual Gradient Estimation in Adaptive Inverse Reinforcement Learning
arXiv:2604.01345v1 Announce Type: new Abstract: Inverse reinforcement learning (IRL) recovers the loss function of a forward learner from its observed responses adaptive IRL aims to …
Vikram Krishnamurthy, Luke Snow
2 views