Academic

Academic

Academic · 1 min

ReportLogic: Evaluating Logical Quality in Deep Research Reports

arXiv:2602.18446v1 Announce Type: new Abstract: Users increasingly rely on Large Language Models (LLMs) for Deep Research, using them to synthesize diverse sources into structured reports …

Jujia Zhao, Zhaoxin Huan, Zihan Wang, Xiaolu Zhang, Jun Zhou, Suzan Verberne, Zhaochun Ren
6 views
Academic · 1 min

Prompt Optimization Via Diffusion Language Models

arXiv:2602.18449v1 Announce Type: new Abstract: We propose a diffusion-based framework for prompt optimization that leverages Diffusion Language Models (DLMs) to iteratively refine system prompts through …

Shiyu Wang, Haolin Chen, Liangwei Yang, Jielin Qiu, Rithesh Murthy, Ming Zhu, Zixiang Chen, Silvio Savarese, Caiming Xiong, Shelby Heinecke, Huan Wang
3 views
Academic · 1 min

Asymptotic Semantic Collapse in Hierarchical Optimization

arXiv:2602.18450v1 Announce Type: new Abstract: Multi-agent language systems can exhibit a failure mode where a shared dominant context progressively absorbs individual semantics, yielding near-uniform behavior …

Faruk Alpay, Bugra Kilictas
17 views
Academic · 1 min

The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder

arXiv:2602.18487v1 Announce Type: new Abstract: This paper introduces GLiNER-bi-Encoder, a novel architecture for Named Entity Recognition (NER) that harmonizes zero-shot flexibility with industrial-scale efficiency. While …

Ihor Stepanov, Mykhailo Shtopko, Dmytro Vodianytskyi, Oleksandr Lukashov
4 views
Academic · 1 min

Luna-2: Scalable Single-Token Evaluation with Small Language Models

arXiv:2602.18583v1 Announce Type: new Abstract: Real-time guardrails require evaluation that is accurate, cheap, and fast - yet today's default, LLM-as-a-judge (LLMAJ), is slow, expensive, and …

Vatsal Goel, Rishon Dsouza, Nikhil Ega, Amey Ramesh Rambatla, Rob Friel, Shuai Shao, Yash Sheth
3 views