What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty
arXiv:2603.02491v1. Abstract: As artificial agents become increasingly capable, what internal structure is necessary for an agent to act competently under uncertainty? Classical results show that optimal control can be implemented using belief states or world models, but not that such representations are required. We prove quantitative "selection theorems" showing that low average-case regret on structured families of action-conditioned prediction tasks forces an agent to implement a predictive, structured internal state. Our results cover stochastic policies, partial observability, and evaluation under task distributions, without assuming optimality, determinism, or access to an explicit model. Technically, we reduce predictive modeling to binary "betting" decisions and show that regret bounds limit probability mass on suboptimal bets, enforcing the predictive distinctions needed to separate high-margin outcomes. In fully observed settings, this yields approximate recovery of the interventional transition kernel; under partial observability, it implies necessity of belief-like memory and predictive state, addressing an open question in prior world-model recovery work.
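The betting reduction named in the abstract can be made concrete with a toy calculation. The sketch below is an illustration only, not the paper's construction: the payoff scheme (a "yes" bet pays outcome minus threshold, a "no" bet pays the reverse) and the names `expected_regret` and `max_wrong_bet_mass` are assumptions introduced here. It shows why a regret bound caps the probability a stochastic agent can place on the worse side of a bet, and why that cap tightens as the margin grows.

```python
def expected_regret(p_true, threshold, prob_bet_yes):
    """Expected regret of a stochastic bettor on one binary bet.

    Assumed toy payoffs: betting 'yes' pays (outcome - threshold) and
    betting 'no' pays (threshold - outcome), where outcome is 1 with
    probability p_true and 0 otherwise.
    """
    value_yes = p_true - threshold   # expected payoff of betting 'yes'
    value_no = threshold - p_true    # expected payoff of betting 'no'
    best = max(value_yes, value_no)
    achieved = prob_bet_yes * value_yes + (1.0 - prob_bet_yes) * value_no
    return best - achieved


def max_wrong_bet_mass(regret_bound, margin):
    """Largest probability of the worse bet compatible with a regret bound.

    In this toy setup the worse bet loses 2 * margin in expectation,
    where margin = |p_true - threshold|, so
    regret = wrong_mass * 2 * margin.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    return min(1.0, regret_bound / (2.0 * margin))


# An agent that bets the worse side 10% of the time at margin 0.3.
regret = expected_regret(p_true=0.8, threshold=0.5, prob_bet_yes=0.9)
cap = max_wrong_bet_mass(regret_bound=regret, margin=0.3)
```

In this example the regret is 0.1 × 2 × 0.3 = 0.06, and inverting the relation recovers the 10% cap on wrong-bet mass. For a fixed regret bound the cap shrinks as the margin grows, which is the sense in which high-margin outcomes must be separated.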
Executive Summary
This article presents a series of 'selection theorems' showing that an agent with low average-case regret on structured families of action-conditioned prediction tasks must implement a predictive, structured internal state. The authors reduce predictive modeling to binary 'betting' decisions and show that a regret bound limits the probability mass an agent can place on suboptimal bets, which enforces the predictive distinctions needed to separate high-margin outcomes. The results cover stochastic policies, partial observability, and evaluation under task distributions. In fully observed settings they yield approximate recovery of the interventional transition kernel; under partial observability they imply that belief-like memory and predictive state are necessary.
Key Points
- ▸ The article presents a series of 'selection theorems' that demonstrate the necessity of a predictive, structured internal state for capable agents to act competently under uncertainty.
- ▸ The authors use a novel reduction of predictive modeling to binary 'betting' decisions to show that regret bounds enforce the predictive distinctions needed to separate high-margin outcomes.
- ▸ The results cover stochastic policies, partial observability, and evaluation under task distributions, without assuming optimality, determinism, or access to an explicit model.
Merits
Advances Our Understanding of Internal Structure
Classical results show that optimal control can be implemented with belief states or world models. This article goes further by showing that a predictive, structured internal state is required for an agent to achieve low average-case regret under uncertainty.
Methodological Innovation
The authors' reduction of predictive modeling to binary 'betting' decisions turns a regret bound into a limit on the probability mass an agent can place on suboptimal bets. That limit, in turn, enforces the predictive distinctions needed to separate high-margin outcomes.
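One way to see how reliable high-margin bets could translate into recovery of a model is to sweep the betting threshold. The sketch below is a simplified illustration and not the paper's proof: `make_toy_agent` and `recover_probability` are hypothetical names, and the agent is assumed to bet correctly whenever the margin is at least `margin` and arbitrarily otherwise, which is cruder than the stochastic policies the paper covers.

```python
import random


def make_toy_agent(p_true, margin, rng):
    """Hypothetical agent holding a hidden transition probability p_true.

    It bets 'yes' (True) when it judges p_true > threshold. The bet is
    assumed correct whenever |p_true - threshold| >= margin; closer
    thresholds carry no guarantee and are answered by a coin flip.
    """
    def bet(threshold):
        if abs(p_true - threshold) >= margin:
            return p_true > threshold
        return rng.random() < 0.5
    return bet


def recover_probability(bet, num_thresholds=200):
    """Estimate the hidden probability from bets on a uniform threshold grid.

    The fraction of thresholds answered 'yes' approximates p_true, with
    error at most the margin plus the grid spacing.
    """
    thresholds = [(k + 0.5) / num_thresholds for k in range(num_thresholds)]
    yes_count = sum(1 for c in thresholds if bet(c))
    return yes_count / num_thresholds


rng = random.Random(0)
agent = make_toy_agent(p_true=0.7, margin=0.05, rng=rng)
estimate = recover_probability(agent)
```

Under these assumptions the estimate lands within the margin (plus grid spacing) of the hidden probability, whatever the agent does on low-margin bets. Repeating the sweep for each state, action, and next state would give an approximate transition table in this toy setting.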
Generality and Scope
The results have a broad scope, covering stochastic policies, partial observability, and evaluation under task distributions, without assuming optimality, determinism, or access to an explicit model.
Demerits
Assumes Low Average-Case Regret
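The results are conditional: they apply to agents that achieve low average-case regret on structured families of action-conditioned prediction tasks. Agents that fall short of that bar, or that are evaluated on task families without this structure, lie outside the guarantees of the selection theorems.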
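What the low-regret assumption buys, and how quickly the guarantee weakens, can be seen from a Markov-style inequality. The derivation below is a sketch in notation assumed here rather than taken from the paper: tasks $\tau$ are drawn from a distribution $\mathcal{D}$, $q_\tau$ is the probability that the agent places the worse bet on task $\tau$, $\Delta_\tau \ge 0$ is the expected payoff gap between the better and worse bet, and $\varepsilon$ is the average-case regret bound.

```latex
% Average-case regret assumption (notation assumed for illustration):
\mathbb{E}_{\tau \sim \mathcal{D}}\bigl[\, q_\tau \, \Delta_\tau \,\bigr] \;\le\; \varepsilon .

% For any margin level \gamma > 0, since
% q_\tau \mathbf{1}\{\Delta_\tau \ge \gamma\} \le q_\tau \Delta_\tau / \gamma pointwise:
\mathbb{E}_{\tau \sim \mathcal{D}}\bigl[\, q_\tau \, \mathbf{1}\{\Delta_\tau \ge \gamma\} \,\bigr]
  \;\le\; \frac{1}{\gamma}\, \mathbb{E}_{\tau \sim \mathcal{D}}\bigl[\, q_\tau \, \Delta_\tau \,\bigr]
  \;\le\; \frac{\varepsilon}{\gamma} .
```

The average wrong-bet mass on tasks with gap at least $\gamma$ is therefore at most $\varepsilon / \gamma$. The bound is informative only when $\varepsilon$ is small relative to $\gamma$, which is why the guarantee concerns high-margin outcomes and loosens as regret grows.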
Does Not Address Human Decision-Making
The article's focus on artificial agents may overlook the importance of understanding human decision-making under uncertainty, which is a critical aspect of many real-world applications.
Expert Commentary
This article is a notable contribution to the study of artificial agents and decision-making under uncertainty. Its reduction of predictive modeling to binary 'betting' decisions connects regret bounds to limits on the probability mass placed on suboptimal bets, and hence to the predictive distinctions an agent must draw. Although the results apply only to agents with low average-case regret, they bear on both theory and practice: they address an open question in prior world-model recovery work and indicate what internal structure a competent agent must have.
Recommendations
- ✓ Future research should examine how the selection theorems degrade as average-case regret grows, and whether comparable necessity results can be obtained by other methods.
- ✓ The article's findings should be tested in real-world domains such as autonomous vehicles or healthcare to demonstrate the practical implications of the selection theorems.