Academic

Academic · 1 min

TelcoAgent-Bench: A Multilingual Benchmark for Telecom AI Agents

arXiv:2604.06209v1. Abstract: The integration of large language model (LLM) agents into telecom networks introduces new challenges related to intent recognition, tool execution, …

Lina Bariah, Brahim Mefgouda, Farbod Tavakkoli, Enrique Molero, Louis Powell, Merouane Debbah
Academic · 1 min

The Illusion of Stochasticity in LLMs

arXiv:2604.06543v1. Abstract: In this work, we demonstrate that reliable stochastic sampling is a fundamental yet unfulfilled requirement for Large Language Models (LLMs) …

Xiangming Gu, Soham De, Michalis Titsias, Larisa Markeeva, Petar Veličković, Razvan Pascanu
Academic · 1 min

When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't

arXiv:2604.06422v1. Abstract: Understanding when Vision-Language Models (VLMs) will behave unexpectedly, whether models can reliably predict their own behavior, and if models adhere …

Jonathan Nemitz, Carsten Eickhoff, Junyi Jessy Li, Kyle Mahowald, Michal Golovanevsky, William Rudman