Academic

Academic · 1 min

SimpleTool: Parallel Decoding for Real-Time LLM Function Calling

arXiv:2603.00030v1 (announce type: new). Abstract: LLM-based function calling enables intelligent agents to interact with external tools and environments, yet autoregressive decoding imposes a fundamental latency …

Xiaoxin Shi, Jiaxin Wan, Linkang Dong, Wei Jiang, Yue Liu, Zengfeng Huang
Academic · 1 min

Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

arXiv:2603.00296v1 (announce type: new). Abstract: Large reasoning models improve with more test-time computation, but often overthink, producing unnecessarily long chains-of-thought that raise cost without improving …

Xintong Li, Sha Li, Rongmei Lin, Hongye Jin, Linwei Li, Hejie Cui, Sarah Zhang, Chia-Yuan Chang, Kewei Cheng, Besnik Fetahu, Priyanka Nigam, Jingbo Shang, Bing Yin