Academic


Academic · 1 min

Robust and Efficient Tool Orchestration via Layered Execution Structures with Reflective Correction

arXiv:2602.18968v1 · Announce Type: new · Abstract: Tool invocation is a core capability of agentic systems, yet failures often arise not from individual tool calls but from …

Tao Zhe, Haoyu Wang, Bo Luo, Min Wu, Wei Fan, Xiao Luo, Zijun Yao, Haifeng Chen, Dongjie Wang

5 views Mar 7

Academic · 1 min

When Do LLM Preferences Predict Downstream Behavior?

arXiv:2602.18971v1 · Announce Type: new · Abstract: Preference-driven behavior in LLMs may be a necessary precondition for AI misalignment such as sandbagging: models cannot strategically pursue misaligned …

Katarina Slama, Alexandra Souly, Dishank Bansal, Henry Davidson, Christopher Summerfield, Lennart Luettgau

3 views Mar 7

Academic · 1 min

How Far Can We Go with Pixels Alone? A Pilot Study on Screen-Only Navigation in …

arXiv:2602.18981v1 · Announce Type: new · Abstract: Modern 3D game levels rely heavily on visual guidance, yet the navigability of level layouts remains difficult to quantify. Prior …

Kaijie Xu, Mustafa Bugti, Clark Verbrugge

4 views Mar 7

Academic · 1 min

InfEngine: A Self-Verifying and Self-Optimizing Intelligent Engine for Infrared Radiation Computing

arXiv:2602.18985v1 · Announce Type: new · Abstract: Infrared radiation computing underpins advances in climate science, remote sensing and spectroscopy but remains constrained by manual workflows. We introduce …

Kun Ding, Jian Xu, Ying Wang, Peipei Yang, Shiming Xiang

4 views Mar 7

Academic · 1 min

Quantifying Automation Risk in High-Automation AI Systems: A Bayesian Framework for Failure Propagation and Optimal …

arXiv:2602.18986v1 · Announce Type: new · Abstract: Organizations across finance, healthcare, transportation, content moderation, and critical infrastructure are rapidly deploying highly automated AI systems, yet they lack …

Vishal Srivastava, Tanmay Sah

4 views Mar 7

Academic · 1 min

Benchmark Test-Time Scaling of General LLM Agents

arXiv:2602.18998v1 · Announce Type: new · Abstract: LLM agents are increasingly expected to function as general-purpose systems capable of resolving open-ended user requests. While existing benchmarks focus …

Xiaochuan Li, Ryan Ming, Pranav Setlur, Abhijay Paladugu, Andy Tang, Hao Kang, Shuai Shao, Rong Jin, Chenyan Xiong

12 views Mar 7

Academic · 1 min

Evaluating Large Language Models on Quantum Mechanics: A Comparative Study Across Diverse Models and Tasks

arXiv:2602.19006v1 · Announce Type: new · Abstract: We present a systematic evaluation of large language models on quantum mechanics problem-solving. Our study evaluates 15 models from five …

S. K. Rithvik

33 views Mar 7

Academic · 1 min

Agentic Problem Frames: A Systematic Approach to Engineering Reliable Domain Agents

arXiv:2602.19065v1 · Announce Type: new · Abstract: Large Language Models (LLMs) are evolving into autonomous agents, yet current "frameless" development--relying on ambiguous natural language without engineering blueprints--leads …

Chanjin Park (Seoul National University)

12 views Mar 7

Academic · 1 min

Asking the Right Questions: Improving Reasoning with Generated Stepping Stones

arXiv:2602.19069v1 · Announce Type: new · Abstract: Recent years have witnessed tremendous progress in enabling LLMs to solve complex reasoning tasks such as math and coding. As …

Hengyuan Hu, Tingchen Fu, Minqi Jiang, Alexander H Miller, Yoram Bachrach, Jakob Nicolaus Foerster

13 views Mar 7

Academic · 1 min

Defining Explainable AI for Requirements Analysis

arXiv:2602.19071v1 · Announce Type: new · Abstract: Explainable Artificial Intelligence (XAI) has become popular in the last few years. The Artificial Intelligence (AI) community in general, and …

Raymond Sheh, Isaac Monteath

4 views Mar 7

Academic · 1 min

Post-Routing Arithmetic in Llama-3: Last-Token Result Writing and Rotation-Structured Digit Directions

arXiv:2602.19109v1 · Announce Type: new · Abstract: We study three-digit addition in Meta-Llama-3-8B (base) under a one-token readout to characterize how arithmetic answers are finalized after cross-token …

Yao Yan

3 views Mar 7

Academic · 1 min

K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model

arXiv:2602.19128v1 · Announce Type: new · Abstract: Optimizing GPU kernels is critical for efficient modern machine learning systems yet remains challenging due to the complex interplay of …

Shiyi Cao, Ziming Mao, Joseph E. Gonzalez, Ion Stoica

6 views Mar 7


