All Articles

Articles

Academic · 1 min

Talk Freely, Execute Strictly: Schema-Gated Agentic AI for Flexible and Reproducible Scientific Workflows

arXiv:2603.06394v1 Announce Type: new Abstract: Large language models (LLMs) can now translate a researcher's plain-language goal into executable computation, yet scientific workflows demand determinism, provenance, …

Joel Strickland, Arjun Vijeta, Chris Moores, Oliwia Bodek, Bogdan Nenchev, Thomas Whitehead, Charles Phillips, Karl Tassenberg, Gareth Conduit, Ben Pellegrini
24 views
Academic · 1 min

DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality

arXiv:2603.05912v1 Announce Type: new Abstract: Search-augmented LLM agents can produce deep research reports (DRRs), but verifying claim-level factuality remains challenging. Existing fact-checkers are primarily designed …

Yukun Huang, Leonardo F. R. Ribeiro, Momchil Hardalov, Bhuwan Dhingra, Markus Dreyer, Venkatesh Saligrama
13 views
Academic · 1 min

On the Value of Tokeniser Pretraining in Physics Foundation Models

arXiv:2603.05598v1 Announce Type: cross Abstract: We investigate the impact of tokeniser pretraining on the accuracy and efficiency of physics emulation. Modern high-resolution simulations produce vast …

Hadi Sotoudeh, Payel Mukhopadhyay, Ruben Ohana, Michael McCabe, Neil D. Lawrence, Shirley Ho, Miles Cranmer
52 views
Academic · 1 min

Reasoning Models Struggle to Control their Chains of Thought

arXiv:2603.05706v1 Announce Type: new Abstract: Chain-of-thought (CoT) monitoring is a promising tool for detecting misbehaviors and understanding the motivations of modern reasoning models. However, if …

Chen Yueh-Han, Robert McCarthy, Bruce W. Lee, He He, Ian Kivlichan, Bowen Baker, Micah Carroll, Tomek Korbak
12 views
Academic · 1 min

RACAS: Controlling Diverse Robots With a Single Agentic System

arXiv:2603.05621v1 Announce Type: cross Abstract: Many robotic platforms expose an API through which external software can command their actuators and read their sensors. However, transitioning …

Dylan R. Ashley, Jan Przepi\'ora, Yimeng Chen, Ali Abualsaud, Nurzhan Yesmagambet, Shinkyu Park, Eric Feron, J\"urgen Schmidhuber
28 views