MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning
arXiv:2603.20586v1 (announce type: new)
Abstract: As long-context language modeling becomes increasingly important, the cost of maintaining and attending to large Key/Value (KV) caches grows rapidly, …
Dong Liu, Yanxuan Yu, Ben Lengerich, Ying Nian Wu