Pre-training LLM without Learning Rate Decay Enhances Supervised Fine-Tuning
Kazuki Yano, Shun Kiyono, Sosuke Kobayashi, Sho Takase, Jun Suzuki
arXiv:2603.16127v1 Announce Type: new
Abstract: We investigate the role of learning rate scheduling in the large-scale pre-training of large language models, focusing on its influence …
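The title contrasts pre-training without learning rate decay against the usual decaying schedule. Below is a minimal sketch, assuming PyTorch; the model, step counts, and peak learning rate are illustrative placeholders, not the paper's actual configuration. It shows a linear-warmup-then-constant schedule (the "no decay" setting named in the title) next to the common warmup-plus-cosine-decay baseline.

```python
# Illustrative sketch only: placeholder model and hyperparameters,
# not the schedule or setup used in the paper.
import math

import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(16, 16)  # stand-in for an LLM
total_steps, warmup_steps, peak_lr = 10_000, 1_000, 3e-4

optimizer = torch.optim.AdamW(model.parameters(), lr=peak_lr)

def constant_after_warmup(step: int) -> float:
    """Linear warmup, then hold the peak LR (no decay)."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return 1.0  # multiplier of 1.0 keeps peak_lr for the rest of training

def cosine_after_warmup(step: int) -> float:
    """Linear warmup, then cosine decay to zero (the common baseline)."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

# Pick one schedule; the title's setting corresponds to the constant variant.
scheduler = LambdaLR(optimizer, lr_lambda=constant_after_warmup)

for step in range(total_steps):
    # ... forward/backward and optimizer.step() would go here ...
    scheduler.step()  # updates the LR according to the chosen schedule
```

`LambdaLR` multiplies the optimizer's base learning rate by the returned factor at each `scheduler.step()`, so swapping `constant_after_warmup` for `cosine_after_warmup` switches between the two regimes without touching the training loop.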