Academic · 1 min

FAAR: Format-Aware Adaptive Rounding for NVFP4

arXiv:2603.22370v1 · Announce Type: new
Abstract: Deploying large language models (LLMs) on edge devices requires extremely low-bit quantization. Ultra-low precision formats such as NVFP4 offer a …

Hanglin Li, Shuchang Tian, Chen Lin, Zhiyong Zhao, Kun Zhan
Academic · 1 min

Three Creates All: You Only Sample 3 Steps

arXiv:2603.22375v1 · Announce Type: new
Abstract: Diffusion models deliver high-fidelity generation but remain slow at inference time due to many sequential network evaluations. We find that …

Yuren Cai, Guangyi Wang, Zongqing Li, Li Li, Zhihui Liu, Songzhi Su