GASP: Guided Asymmetric Self-Play For Coding LLMs
arXiv:2603.15957v1 (announce type: new)
Abstract: Asymmetric self-play has emerged as a promising paradigm for post-training large language models, where a teacher continually generates questions for …
Swadesh Jana, Cansu Sancaktar, Tomáš Daniš, Georg Martius, Antonio Orvieto, Pavel Kolev