Academic

Academic · 1 min

Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks

arXiv:2603.11689v1 Announce Type: new Abstract: Frontier Multimodal Large Language Models (MLLMs) exhibit remarkable capabilities in Visual-Language Comprehension (VLC) tasks. However, they are often deployed as …

Mei Chee Leong, Ying Gu, Hui Li Tan, Liyuan Li, Nancy Chen

35 views Mar 13

Academic · 1 min

DocSage: An Information Structuring Agent for Multi-Doc Multi-Entity Question Answering

arXiv:2603.11798v1 Announce Type: new Abstract: Multi-document Multi-entity Question Answering inherently demands models to track implicit logic between multiple entities across scattered documents. However, existing Large …

Teng Lin, Yizhang Zhu, Zhengxuan Zhang, Yuyu Luo, Nan Tang

42 views Mar 13

Academic · 1 min

Mind the Sim2Real Gap in User Simulation for Agentic Tasks

arXiv:2603.11245v1 Announce Type: new Abstract: As NLP evaluation shifts from static benchmarks to multi-turn interactive settings, LLM-based simulators have become widely used as user proxies, …

Xuhui Zhou, Weiwei Sun, Qianou Ma, Yiqing Xie, Jiarui Liu, Weihua Du, Sean Welleck, Yiming Yang, Graham Neubig, Sherry Tongshuang Wu, Maarten Sap

39 views Mar 13

Academic · 1 min

Reversible Lifelong Model Editing via Semantic Routing-Based LoRA

arXiv:2603.11239v1 Announce Type: new Abstract: The dynamic evolution of real-world necessitates model editing within Large Language Models. While existing methods explore modular isolation or parameter-efficient …

Haihua Luo, Xuming Ran, Tommi Kärkkäinen, Zhonghua Chen, Jiangrong Shen, Qi Xu, Fengyu Cong

29 views Mar 13

Academic · 1 min

RewardHackingAgents: Benchmarking Evaluation Integrity for LLM ML-Engineering Agents

arXiv:2603.11337v1 Announce Type: new Abstract: LLM agents increasingly perform end-to-end ML engineering tasks where success is judged by a single scalar test metric. This creates …

Yonas Atinafu, Robin Cohen

64 views Mar 13

Academic · 1 min

Examining Users' Behavioural Intention to Use OpenClaw Through the Cognition-Affect-Conation Framework

arXiv:2603.11455v1 Announce Type: new Abstract: This study examines users' behavioural intention to use OpenClaw through the Cognition-Affect-Conation (CAC) framework. The research investigates how cognitive perceptions …

Yiran Du

47 views Mar 13

Academic · 1 min

CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges

arXiv:2603.11863v1 Announce Type: new Abstract: The saturation of high-quality pre-training data has shifted research focus toward evolutionary systems capable of continuously generating novel artifacts, leading …

Zi-Han Wang, Lam Nguyen, Zhengyang Zhao, Mengyue Yang, Chengwei Qin, Yujiu Yang, Linyi Yang

36 views Mar 13

Academic · 1 min

The Unlearning Mirage: A Dynamic Framework for Evaluating LLM Unlearning

arXiv:2603.11266v1 Announce Type: new Abstract: Unlearning in Large Language Models (LLMs) aims to enhance safety, mitigate biases, and comply with legal mandates, such as the …

Raj Sanjay Shah, Jing Huang, Keerthiram Murugesan, Nathalie Baracaldo, Diyi Yang

43 views Mar 13

Academic · 1 min

STAIRS-Former: Spatio-Temporal Attention with Interleaved Recursive Structure Transformer for Offline Multi-task Multi-agent Reinforcement Learning

arXiv:2603.11691v1 Announce Type: new Abstract: Offline multi-agent reinforcement learning (MARL) with multi-task datasets is challenging due to varying numbers of agents across tasks and the …

Jiwon Jeon, Myungsik Cho, Youngchul Sung

37 views Mar 13

Academic · 1 min

Temporal Text Classification with Large Language Models

arXiv:2603.11295v1 Announce Type: new Abstract: Languages change over time. Computational models can be trained to recognize such changes enabling them to estimate the publication date …

Nishat Raihan, Marcos Zampieri

43 views Mar 13

Academic · 1 min

Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing

arXiv:2603.11535v1 Announce Type: new Abstract: Token-choice Mixture-of-Experts (TC-MoE) routes each token to a fixed number of experts, limiting dynamic computation allocation and requiring auxiliary losses …

Hanchi Sun, Yixin Liu, Yonghui Wu, Lichao Sun

32 views Mar 13

Academic · 1 min

Deactivating Refusal Triggers: Understanding and Mitigating Overrefusal in Safety Alignment

arXiv:2603.11388v1 Announce Type: new Abstract: Safety alignment aims to ensure that large language models (LLMs) refuse harmful requests by post-training on harmful queries paired with …

Zhiyu Xue, Zimo Qi, Guangliang Liu, Bocheng Chen, Ramtin Pedarsani

60 views Mar 13
