Expressivity-Efficiency Tradeoffs for Hybrid Sequence Models
arXiv:2603.08859v1 Announce Type: new Abstract: Hybrid sequence models--combining Transformer and state-space model layers--seek to gain the expressive versatility of attention as well as the computational …