Academic
A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development
arXiv:2603.04390v1 Announce Type: new Abstract: WebGIS development requires rigor, yet agentic AI frequently fails due to five large language model (LLM) limitations: context constraints, cross-session …
AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents
arXiv:2603.03290v1 Announce Type: cross Abstract: Long-horizon LLM agents require memory systems that remain accurate under fixed context budgets. However, existing systems struggle with two persistent …
One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models
arXiv:2603.03291v1 Announce Type: cross Abstract: Reward Models (RMs) are crucial for online alignment of language models (LMs) with human preferences. However, RM-based preference-tuning is vulnerable …
From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG
arXiv:2603.03292v1 Announce Type: cross Abstract: Large Language Models (LLMs) exhibit high reasoning capacity in medical question-answering, but their tendency to produce hallucinations and outdated knowledge …
Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory
arXiv:2603.03294v1 Announce Type: cross Abstract: Large Language Models show promise for agricultural advisory, yet vanilla models exhibit unsupported recommendations, generic advice lacking specific, actionable detail, …
Language Model Goal Selection Differs from Humans' in an Open-Ended Task
arXiv:2603.03295v1 Announce Type: cross Abstract: As large language models (LLMs) get integrated into human decision-making, they are increasingly choosing goals autonomously rather than only completing …
PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents
arXiv:2603.03296v1 Announce Type: cross Abstract: Long-term memory is essential for large language model (LLM) agents operating in complex environments, yet existing memory designs are either …
TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement
arXiv:2603.03297v1 Announce Type: cross Abstract: Test-time Training enables model adaptation using only test questions and offers a promising paradigm for improving the reasoning ability of …
TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation
arXiv:2603.03298v1 Announce Type: cross Abstract: Large Language Models (LLMs) have improved substantially in alignment, yet their behavior remains highly sensitive to prompt phrasing. This brittleness has …
From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings
arXiv:2603.03301v1 Announce Type: cross Abstract: The rapid adoption of large language models (LLMs) has created demand for faster responses and lower costs. Semantic caching, reusing …
Developing an AI Assistant for Knowledge Management and Workforce Training in State DOTs
arXiv:2603.03302v1 Announce Type: cross Abstract: Effective knowledge management is critical for preserving institutional expertise and improving the efficiency of workforce training in state transportation agencies. …