RUQuant: Towards Refining Uniform Quantization for Large Language Models
arXiv:2604.04013v1 (Announce Type: new)

Abstract: The increasing size and complexity of large language models (LLMs) have raised significant challenges in deployment efficiency, particularly under resource …