Articles

Academic · 1 min

GASP: Guided Asymmetric Self-Play For Coding LLMs

arXiv:2603.15957v1. Abstract: Asymmetric self-play has emerged as a promising paradigm for post-training large language models, where a teacher continually generates questions for …

Swadesh Jana, Cansu Sancaktar, Tomáš Daniš, Georg Martius, Antonio Orvieto, Pavel Kolev
9 views
Academic · 1 min

Deriving Hyperparameter Scaling Laws via Modern Optimization Theory

arXiv:2603.15958v1. Abstract: Hyperparameter transfer has become an important component of modern large-scale training recipes. Existing methods, such as muP, primarily focus on …

Egor Shulgin, Dimitri von Rütte, Tianyue H. Zhang, Niccolò Ajroldi, Bernhard Schölkopf, Antonio Orvieto
9 views
Academic · 1 min

W2T: LoRA Weights Already Know What They Can Do

arXiv:2603.15990v1. Abstract: Each LoRA checkpoint compactly stores task-specific updates in low-rank weight matrices, offering an efficient way to adapt large language models …

Xiaolong Han, Ferrante Neri, Zijian Jiang, Fang Wu, Yanfang Ye, Lu Yin, Zehong Wang
8 views
Academic · 1 min

The Importance of Being Smoothly Calibrated

arXiv:2603.16015v1. Abstract: Recent work has highlighted the centrality of smooth calibration [Kakade and Foster, 2008] as a robust measure of calibration error. …

Parikshit Gopalan, Konstantinos Stavropoulos, Kunal Talwar, Pranay Tankala
33 views