Academic

Academic · 1 min

Perturbation: A simple and efficient adversarial tracer for representation learning in language models

arXiv:2603.23821v1 Announce Type: new Abstract: Linguistic representation learning in deep neural language models (LMs) has been studied for decades, for both practical and theoretical reasons. …

Joshua Rozner, Cory Shain

21 views Mar 26

Academic · 1 min

PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay

arXiv:2603.23841v1 Announce Type: new Abstract: While Large Language Models (LLMs) are increasingly used as primary sources of information, their potential for political bias may impact …

Rohan Khetan, Ashna Khetan

6 views Mar 26

Academic · 1 min

Language Model Planners do not Scale, but do Formalizers?

arXiv:2603.23844v1 Announce Type: new Abstract: Recent work shows overwhelming evidence that LLMs, even those trained to scale their reasoning trace, perform unsatisfactorily when solving planning …

Owen Jiang, Cassie Huang, Ashish Sabharwal, Li Zhang

8 views Mar 26

Academic · 1 min

BeliefShift: Benchmarking Temporal Belief Consistency and Opinion Drift in LLM Agents

arXiv:2603.23848v1 Announce Type: new Abstract: LLMs are increasingly used as long-running conversational agents, yet every major benchmark evaluating their memory treats user information as static …

Praveen Kumar Myakala, Manan Agrawal, Rahul Manche

27 views Mar 26

Academic · 1 min

Self-Distillation for Multi-Token Prediction

arXiv:2603.23911v1 Announce Type: new Abstract: As Large Language Models (LLMs) scale up, inference efficiency becomes a critical bottleneck. Multi-Token Prediction (MTP) could accelerate LLM inference …

Guoliang Zhao, Ruobing Xie, An Wang, Shuaipeng Li, Huaibing Xie, Xingwu Sun

11 views Mar 26

Academic · 1 min

Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development

arXiv:2603.23937v1 Announce Type: new Abstract: Evidence-based medicine (EBM) is central to high-quality care, but remains difficult to implement in fast-paced primary care settings. Physicians face …

Zongliang Ji, Ziyang Zhang, Xincheng Tan, Matthew Thompson, Anna Goldenberg, Carl Yang, Rahul G. Krishnan, Fan Zhang

14 views Mar 26

Academic · 1 min

OmniACBench: A Benchmark for Evaluating Context-Grounded Acoustic Control in Omni-Modal Models

arXiv:2603.23938v1 Announce Type: new Abstract: Most testbeds for omni-modal models assess multimodal understanding via textual outputs, leaving it unclear whether these models can properly speak …

Seunghee Kim, Bumkyu Park, Kyudan Jung, Joosung Lee, Soyoon Kim, Jeonghoon Kim, Taeuk Kim, Hwiyeol Jo

7 views Mar 26

Academic · 1 min

Argument Mining as a Text-to-Text Generation Task

arXiv:2603.23949v1 Announce Type: new Abstract: Argument Mining (AM) aims to uncover the argumentative structures within a text. Previous methods require several subtasks, such as span identification, …

Masayuki Kawarada, Tsutomu Hirao, Wataru Uchida, Masaaki Nagata

8 views Mar 26

Academic · 1 min

From AI Assistant to AI Scientist: Autonomous Discovery of LLM-RL Algorithms with LLM Agents

arXiv:2603.23951v1 Announce Type: new Abstract: Discovering improved policy optimization algorithms for language models remains a costly manual process requiring repeated mechanism-level modification and validation. Unlike …

Sirui Xia, Yikai Zhang, Aili Chen, Siye Wu, Siyu Yuan, Yanghua Xiao

6 views Mar 26

Academic · 1 min

The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

arXiv:2603.23971v1 Announce Type: new Abstract: Developers and consumers increasingly choose reasoning language models (RLMs) based on their listed API prices. However, how accurately do these …

Lingjiao Chen, Chi Zhang, Yeye He, Ion Stoica, Matei Zaharia, James Zou

25 views Mar 26

Academic · 1 min

Grounding Arabic LLMs in the Doha Historical Dictionary: Retrieval-Augmented Understanding of Quran and Hadith

arXiv:2603.23972v1 Announce Type: new Abstract: Large language models (LLMs) have achieved remarkable progress in many language tasks, yet they continue to struggle with complex historical …

Somaya Eltanbouly, Samer Rashwani

20 views Mar 26

Academic · 1 min

CoCR-RAG: Enhancing Retrieval-Augmented Generation in Web Q&A via Concept-oriented Context Reconstruction

arXiv:2603.23989v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) has shown promising results in enhancing Q&A by incorporating information from the web and other external sources. …

Kaize Shi, Xueyao Sun, Qika Lin, Firoj Alam, Qing Li, Xiaohui Tao, Guandong Xu

5 views Mar 26
