Academic

Academic · 1 min

GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory

arXiv:2602.12316v1. Abstract: Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. However, existing AI safety benchmarks largely evaluate single …

Pepijn Cobben, Xuanqiang Angelo Huang, Thao Amelia Pham, Isabel Dahlgren, Terry Jingchen Zhang, Zhijing Jin
Academic · 1 min

AI Agents for Inventory Control: Human-LLM-OR Complementarity

arXiv:2602.12631v1. Abstract: Inventory control is a fundamental operations problem in which ordering decisions are traditionally guided by theoretically grounded operations research (OR) …

Jackie Baek, Yaopeng Fu, Will Ma, Tianyi Peng
Academic · 1 min

Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents

arXiv:2602.12662v1. Abstract: Large language models (LLMs) are increasingly deployed as autonomous agents for multi-turn decision-making tasks. However, current agents typically rely on …

Ruihan Yang, Fanghua Ye, Xiang We, Ruoqing Zhao, Kang Luo, Xinbo Xu, Bo Zhao, Ruotian Ma, Shanyi Wang, Zhaopeng Tu, Xiaolong Li, Deqing Yang, Linus