Academic

Academic · 1 min

Learning When to Trust in Contextual Bandits

arXiv:2603.13356v1 Announce Type: new Abstract: Standard approaches to Robust Reinforcement Learning assume that feedback sources are either globally trustworthy or globally adversarial. In this paper, …

Majid Ghasemi, Mark Crowley

4 views Mar 17

Academic · 1 min

Optimizing LLM Annotation of Classroom Discourse through Multi-Agent Orchestration

arXiv:2603.13353v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly positioned as scalable tools for annotating educational data, including classroom discourse, interaction logs, and …

Bakhtawar Ahtisham, Kirk Vanacore, Rene F. Kizilcec

4 views Mar 17

Academic · 1 min

Prompt Complexity Dilutes Structured Reasoning: A Follow-Up Study on the Car Wash Problem

arXiv:2603.13351v1 Announce Type: new Abstract: In a previous study [Jo, 2026], STAR reasoning (Situation, Task, Action, Result) raised car wash problem accuracy from 0% to …

Heejin Jo

4 views Mar 17

Academic · 1 min

AutoTool: Automatic Scaling of Tool-Use Capabilities in RL via Decoupled Entropy Constraints

arXiv:2603.13348v1 Announce Type: new Abstract: Tool use represents a critical capability for AI agents, with recent advances focusing on leveraging reinforcement learning (RL) to scale …

Yirong Zeng, Xiao Ding, Yufei Liu, Yuxian Wang, Qunyao Du, Yutai Hou, Wu Ning, Haonan Song, Duyu Tang, Dandan Tu, Bing Qin, Ting Liu

11 views Mar 17

Academic · 1 min

DyACE: Dynamic Algorithm Co-evolution for Online Automated Heuristic Design with Large Language Model

arXiv:2603.13344v1 Announce Type: new Abstract: The prevailing paradigm in Automated Heuristic Design (AHD) typically relies on the assumption that a single, fixed algorithm can effectively …

Guidong Lu, Yiping Liu, Xiangxiang Zeng

4 views Mar 17

Academic · 1 min

Why Grokking Takes So Long: A First-Principles Theory of Representational Phase Transitions

arXiv:2603.13331v1 Announce Type: new Abstract: Grokking is the sudden generalization that appears long after a model has perfectly memorized its training data. Although this phenomenon …

Truong Xuan Khanh, Truong Quynh Hoa, Luu Duc Trung, Phan Thanh Duc

4 views Mar 17

Academic · 1 min

DOVA: Deliberation-First Multi-Agent Orchestration for Autonomous Research Automation

arXiv:2603.13327v1 Announce Type: new Abstract: Large language model (LLM) agents have demonstrated remarkable capabilities in tool use, reasoning, and code generation, yet single-agent systems exhibit …

Aaron Shen, Alfred Shen

5 views Mar 17

Academic · 1 min

Agent-Based User-Adaptive Filtering for Categorized Harassing Communication

arXiv:2603.13288v1 Announce Type: new Abstract: We propose an agent-based framework for personalized filtering of categorized harassing communication in online social networks. Unlike global moderation systems …

Zenefa Rahaman, Sandip Sen

11 views Mar 17

Academic · 1 min

Multi-hop Reasoning and Retrieval in Embedding Space: Leveraging Large Language Models with Knowledge

arXiv:2603.13266v1 Announce Type: new Abstract: As large language models (LLMs) continue to grow in size, their abilities to tackle complex tasks have significantly improved. However, …

Lihui Liu

12 views Mar 17

Academic · 1 min

Deep Convolutional Architectures for EEG Classification: A Comparative Study with Temporal Augmentation and Confidence-Based Voting

arXiv:2603.13261v1 Announce Type: new Abstract: Electroencephalography (EEG) classification plays a key role in brain-computer interface (BCI) systems, yet it remains challenging due to the low …

Aryan Patodiya, Hubert Cecotti

15 views Mar 17

Academic · 1 min

Distilling Deep Reinforcement Learning into Interpretable Fuzzy Rules: An Explainable AI Framework

arXiv:2603.13257v1 Announce Type: new Abstract: Deep Reinforcement Learning (DRL) agents achieve remarkable performance in continuous control but remain opaque, hindering deployment in safety-critical domains. Existing …

Sanup S. Araballi, Simon Khan, Chilukuri K. Mohan

4 views Mar 17

Academic · 1 min

When Alpha Breaks: Two-Level Uncertainty for Safe Deployment of Cross-Sectional Stock Rankers

arXiv:2603.13252v1 Announce Type: new Abstract: Cross-sectional ranking models are often deployed as if point predictions were sufficient: the model outputs scores and the portfolio follows …

Ursina Sanderink

8 views Mar 17

