Academic

Academic

Academic · 1 min

Confidence Should Be Calibrated More Than One Turn Deep

arXiv:2604.05397v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly applied in high-stakes domains such as finance, healthcare, and education, where reliable multi-turn interactions …

Zhaohan Zhang, Chengzhengxu Li, Xiaoming Liu, Chao Shen, Ziquan Liu, Ioannis Patras
45 views
Academic · 1 min

EpiBench: Benchmarking Multi-turn Research Workflows for Multimodal Agents

arXiv:2604.05557v1 Announce Type: new Abstract: Scientific research follows multi-turn, multi-step workflows that require proactively searching the literature, consulting figures and tables, and integrating evidence across …

Xuan Dong, Huanyang Zheng, Tianhao Niu, Zhe Han, Pengzhan Li, Bofei Liu, Zhengyang Liu, Guancheng Li, Qingfu Zhu, Wanxiang Che
57 views
Academic · 1 min

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

arXiv:2604.05242v1 Announce Type: new Abstract: Multi-bit watermarking has emerged as a promising solution for embedding imperceptible binary messages into Large Language Model (LLM)-generated text, enabling …

Jiahao Xu, Rui Hu, Olivera Kotevska, Zikai Zhang
39 views
Academic · 1 min

DQA: Diagnostic Question Answering for IT Support

arXiv:2604.05350v1 Announce Type: new Abstract: Enterprise IT support interactions are fundamentally diagnostic: effective resolution requires iterative evidence gathering from ambiguous user reports to identify an …

Vishaal Kapoor, Mariam Dundua, Sarthak Ahuja, Neda Kordjazi, Evren Yortucboylu, Vaibhavi Padala, Derek Ho, Jennifer Whitted, Rebecca Steinert
36 views
Academic · 1 min

Attribution Bias in Large Language Models

arXiv:2604.05224v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly used to support search and information retrieval, it is critical that they accurately …

Eliza Berman, Bella Chang, Daniel B. Neill, Emily Black
27 views
Academic · 1 min

AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery

arXiv:2604.05550v1 Announce Type: new Abstract: Artificial intelligence research increasingly depends on prolonged cycles of reproduction, debugging, and iterative refinement to achieve State-Of-The-Art (SOTA) performance, creating …

Yu Li, Chenyang Shao, Xinyang Liu, Ruotong Zhao, Peijie Liu, Hongyuan Su, Zhibin Chen, Qinglong Yang, Anjie Xu, Yi Fang, Qingbin Zeng, Tianxing Li, Jingbo Xu, Fengli Xu, Yong Li, Tie-Yan Liu
53 views