Articles

Academic · 1 min

On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs

arXiv:2602.12506v1. Abstract: Reinforcement learning (RL) fine-tuning has become a key technique for enhancing large language models (LLMs) on reasoning-intensive tasks, motivating its …

Rosie Zhao, Anshul Shah, Xiaoyu Zhu, Xinke Deng, Zhongyu Jiang, Yang Yang, Joerg Liebelt, Arnab Mondal
5 views
Academic · 1 min

AMPS: Adaptive Modality Preference Steering via Functional Entropy

arXiv:2602.12533v1. Abstract: Multimodal Large Language Models (MLLMs) often exhibit significant modality preference, which is a tendency to favor one modality over another. …

Zihan Huang, Xintong Li, Rohan Surana, Tong Yu, Rui Wang, Julian McAuley, Jingbo Shang, Junda Wu
12 views
Academic · 1 min

Block-Sample MAC-Bayes Generalization Bounds

arXiv:2602.12605v1. Abstract: We present a family of novel block-sample MAC-Bayes bounds (mean approximately correct). While PAC-Bayes bounds (probably approximately correct) typically give …

Matthias Frey, Jingge Zhu, Michael C. Gastpar
19 views