MemPO: Self-Memory Policy Optimization for Long-Horizon Agents
arXiv:2603.00680v1 Announce Type: new Abstract: Long-horizon agents face the challenge of context size that grows during interaction with the environment, which degrades performance and stability. Existing …