All Articles

Articles

Academic · 1 min

Thoth: Mid-Training Bridges LLMs to Time Series Understanding

arXiv:2603.01042v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable success in general-purpose reasoning. However, they still struggle to understand and reason about …

Jiafeng Lin, Yuxuan Wang, Jialong Wu, Huakun Luo, Zhongyi Pei, Jianmin Wang
5 views
Academic · 1 min

How RL Unlocks the Aha Moment in Geometric Interleaved Reasoning

arXiv:2603.01070v1 Announce Type: new Abstract: Solving complex geometric problems inherently requires interleaved reasoning: a tight alternation between constructing diagrams and performing logical deductions. Although recent …

Xiangxiang Zhang, Caijun Jia, Siyuan Li, Dingyu He, Xiya Xiong, Zheng Sun, Honghao He, Yuchen Wu, Bihui Yu, Linzhuang Sun, Cheng Tan, Jingxuan Wei
6 views
Academic · 1 min

CARE: Confounder-Aware Aggregation for Reliable LLM Evaluation

arXiv:2603.00039v1 Announce Type: new Abstract: LLM-as-a-judge ensembles are the standard paradigm for scalable evaluation, but their aggregation mechanisms suffer from a fundamental flaw: they implicitly …

Jitian Zhao, Changho Shin, Tzu-Heng Huang, Satya Sai Srinath Namburi GNVV, Frederic Sala
29 views