Articles

Academic · 1 min

CAMEL: Confidence-Gated Reflection for Reward Modeling

arXiv:2602.20670v1. Abstract: Reward models play a fundamental role in aligning large language models with human preferences. Existing methods predominantly follow two paradigms: …

Zirui Zhu, Hailun Xu, Yang Luo, Yong Liu, Kanchan Sarkar, Kun Xu, Yang You
Academic · 1 min

Exa-PSD: a new Persian sentiment analysis dataset on Twitter

arXiv:2602.20892v1. Abstract: Today, social networks such as Twitter are the most widely used platforms for communication. Analyzing this data has …

Seyed Himan Ghaderi, Saeed Sarbazi Azad, Mohammad Mehdi Jaziriyan, Ahmad Akbari
Academic · 1 min

The Art of Efficient Reasoning: Data, Reward, and Optimization

arXiv:2602.20945v1. Abstract: Large Language Models (LLMs) consistently benefit from scaled Chain-of-Thought (CoT) reasoning, but also suffer from heavy computational overhead. To address …

Taiqiang Wu, Zenan Zu, Bo Zhou, Ngai Wong