Meta-Reinforcement Learning with Self-Reflection for Agentic Search
arXiv:2603.11327v1 Announce Type: new Abstract: This paper introduces MR-Search, an in-context meta reinforcement learning (RL) formulation for agentic search with self-reflection. Instead of optimizing a …
Teng Xiao, Yige Yuan, Hamish Ivison, Huaisheng Zhu, Faeze Brahman, Nathan Lambert, Pradeep Dasigi, Noah A. Smith, Hannaneh Hajishirzi
3 views