MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference
arXiv:2602.15206v1
Abstract: Reward learning typically relies on a single feedback type or combines multiple feedback types using manually weighted loss terms. Currently, …