Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling
arXiv:2604.00489v1 Announce Type: new Abstract: Adapting pre-trained text Large Language Models (LLMs) into Speech Language Models (Speech LMs) via continual pretraining on speech data is …
Kazuki Yano, Jun Suzuki, Shinji Watanabe
2 views