Adaptive Theory of Mind for LLM-based Multi-Agent Coordination
arXiv:2603.16264v1 Announce Type: new Abstract: Theory of Mind (ToM) refers to the ability to reason about others' mental states, and higher-order ToM involves considering that …
Academic
arXiv:2603.15976v1 Announce Type: new Abstract: While large language models have significantly accelerated scientific code generation, comprehensively evaluating the generated code remains a major challenge. Traditional …
arXiv:2603.15960v1 Announce Type: new Abstract: The COVID-19 pandemic has placed immense strain on hospital systems worldwide, leading to critical capacity challenges. This research proposes a …
arXiv:2603.15667v1 Announce Type: new Abstract: Real-world phenomena often exhibit vagueness, partial truth, and incomplete information. To model such uncertainty in a mathematically rigorous way, many …
arXiv:2603.15653v1 Announce Type: new Abstract: Long-context handling remains a core challenge for language models: even with extended context windows, models often fail to reliably extract, …
arXiv:2603.15973v1 Announce Type: new Abstract: This paper contains the first formal proof that safety is non-compositional in the presence of conjunctive capability dependencies: two agents …
arXiv:2603.15857v1 Announce Type: new Abstract: Behavioral Foundation Models (BFMs) produce agents with the capability to adapt to any unknown reward or task. These methods, however, …
arXiv:2603.15633v1 Announce Type: new Abstract: Answering complex first-order logic (FOL) queries on knowledge graphs is essential for reasoning. Symbolic methods offer interpretability but struggle with …
arXiv:2603.15909v1 Announce Type: new Abstract: This Monte Carlo simulation examines how prompt engineering strategies shape the quality of large language model (LLM)-generated personality assessment items …
arXiv:2603.15936v1 Announce Type: new Abstract: ClinicalTrials.gov (CT.gov) is the largest publicly accessible registry of clinical studies, yet its registry-oriented architecture and heterogeneous adverse event (AE) …
arXiv:2603.15994v1 Announce Type: new Abstract: Retrieval-augmented generation stores all content indiscriminately, degrading accuracy as noise accumulates. Parametric approaches compress knowledge into weights, precluding selective updates. …
arXiv:2603.15888v1 Announce Type: new Abstract: With AsgardBench we aim to evaluate visually grounded, high-level action sequence generation and interactive planning, focusing specifically on plan adaptation …