Academic

Academic · 1 min

MemPO: Self-Memory Policy Optimization for Long-Horizon Agents

arXiv:2603.00680v1. Announce type: new. Abstract: Long-horizon agents face the challenge of growing context size during interaction with environment, which degrades the performance and stability. Existing …

Ruoran Li, Xinghua Zhang, Haiyang Yu, Shitong Duan, Xiang Li, Wenxin Xiang, Chonghua Liao, Xudong Guo, Yongbin Li, Jinli Suo
4 views
Academic · 1 min

Tracking Capabilities for Safer Agents

arXiv:2603.00991v1. Announce type: new. Abstract: AI agents that interact with the real world through tool calls pose fundamental safety challenges: agents might leak private information, …

Martin Odersky, Yaoyu Zhao, Yichen Xu, Oliver Bračevac, Cao Nguyen Pham
31 views
Academic · 1 min

CollabEval: Enhancing LLM-as-a-Judge via Multi-Agent Collaboration

arXiv:2603.00993v1. Announce type: new. Abstract: Large Language Models (LLMs) have revolutionized AI-generated content evaluation, with the LLM-as-a-Judge paradigm becoming increasingly popular. However, current single-LLM evaluation …

Yiyue Qian, Shinan Zhang, Yun Zhou, Haibo Ding, Diego Socolinsky, Yi Zhang
4 views