Academic

Academic

Academic · 1 min

Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios

arXiv:2603.11214v1 Announce Type: new Abstract: We evaluate the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges-a 32-step corporate network attack and …

Linus Folkerts, Will Payne, Simon Inman, Philippos Giavridis, Joe Skinner, Sam Deverett, James Aung, Ekin Zorer, Michael Schmatz, Mahmoud Ghanem, John Wilkinson, Alan Steer, Vy Hong, Jessica Wang
8 views
Academic · 1 min

Markovian Generation Chains in Large Language Models

arXiv:2603.11228v1 Announce Type: new Abstract: The widespread use of large language models (LLMs) raises an important question: how do texts evolve when they are repeatedly …

Mingmeng Geng, Amr Mohamed, Guokan Shang, Michalis Vazirgiannis, Thierry Poibeau
17 views