All Articles

Articles

Academic · 1 min

AE-LLM: Adaptive Efficiency Optimization for Large Language Models

arXiv:2603.20492v1 Announce Type: new Abstract: Large Language Models (LLMs) have achieved remarkable success across diverse applications, yet their deployment remains challenging due to substantial computational …

Kaito Tanaka, Masato Ito, Yuji Nishimura, Keisuke Matsuda, Aya Nakayama
8 views
Academic · 1 min

Delightful Distributed Policy Gradient

arXiv:2603.20521v1 Announce Type: new Abstract: Distributed reinforcement learning trains on data from stale, buggy, or mismatched actors, producing actions with high surprisal (negative log-probability) under …

Ian Osband
24 views
Academic · 1 min

Does This Gradient Spark Joy?

arXiv:2603.20526v1 Announce Type: new Abstract: Policy gradient computes a backward pass for every sample, even though the backward pass is expensive and most samples carry …

Ian Osband
15 views