Academic

Academic · 1 min

EviAgent: Evidence-Driven Agent for Radiology Report Generation

arXiv:2603.13956v1 Announce Type: new Abstract: Automated radiology report generation holds immense potential to alleviate the heavy workload of radiologists. Despite the formidable vision-language capabilities of …

Tuoshi Qi, Shenshen Bu, Yingfei Xiang, Zhiming Dai

18 views Mar 17

Academic · 1 min

Optimizing LLM Annotation of Classroom Discourse through Multi-Agent Orchestration

arXiv:2603.13353v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly positioned as scalable tools for annotating educational data, including classroom discourse, interaction logs, and …

Bakhtawar Ahtisham, Kirk Vanacore, Rene F. Kizilcec

20 views Mar 17

Academic · 1 min

Generate Then Correct: Single Shot Global Correction for Aspect Sentiment Quad Prediction

arXiv:2603.13777v1 Announce Type: new Abstract: Aspect-based sentiment analysis (ABSA) extracts aspect-level sentiment signals from user-generated text, supports product analytics, experience monitoring, and public-opinion tracking, and …

Shidong He, Haoyu Wang, Wenjie Luo

58 views Mar 17

Academic · 1 min

From Refusal Tokens to Refusal Control: Discovering and Steering Category-Specific Refusal Directions

arXiv:2603.13359v1 Announce Type: new Abstract: Language models are commonly fine-tuned for safety alignment to refuse harmful prompts. One approach fine-tunes them to generate categorical refusal …

Rishab Alagharu, Ishneet Sukhvinder Singh, Shaibi Shamsudeen, Zhen Wu, Ashwinee Panda

21 views Mar 17

Academic · 1 min

QuarkMedBench: A Real-World Scenario Driven Benchmark for Evaluating Large Language Models

arXiv:2603.13691v1 Announce Type: new Abstract: While Large Language Models (LLMs) excel on standardized medical exams, high scores often fail to translate to high-quality responses for …

Yao Wu, Kangping Yin, Liang Dong, Zhenxin Ma, Shuting Xu, Xuehai Wang, Yuxuan Jiang, Tingting Yu, Yunqing Hong, Jiayi Liu, Rianzhe Huang, Shuxin Zhao, Haiping Hu, Wen Shang, Jian Xu, Guanjun Jiang

35 views Mar 17

Academic · 1 min

Intelligent Materials Modelling: Large Language Models Versus Partial Least Squares Regression for Predicting Polysulfone Membrane …

arXiv:2603.13834v1 Announce Type: new Abstract: Predicting the mechanical properties of polysulfone (PSF) membranes from structural descriptors remains challenging due to extreme data scarcity typical of …

Dingding Cao, Mieow Kee Chan, Wan Sieng Yeo, Said Bey, Alberto Figoli

18 views Mar 17

Academic · 1 min

vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models

arXiv:2603.13966v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models are typically evaluated using per-benchmark scripts maintained independently by each model repository, leading to …

Suhwan Choi, Yunsung Lee, Yubeen Park, Chris Dongjoo Kim, Ranjay Krishna, Dieter Fox, Youngjae Yu

21 views Mar 17

Academic · 1 min

Benchmarking Large Language Models on Reference Extraction and Parsing in the Social Sciences and Humanities

arXiv:2603.13651v1 Announce Type: new Abstract: Bibliographic reference extraction and parsing are foundational for citation indexing, linking, and downstream scholarly knowledge-graph construction. However, most established evaluations …

Yurui Zhu, Giovanni Colavizza, Matteo Romanello

39 views Mar 17

Academic · 1 min

DyACE: Dynamic Algorithm Co-evolution for Online Automated Heuristic Design with Large Language Model

arXiv:2603.13344v1 Announce Type: new Abstract: The prevailing paradigm in Automated Heuristic Design (AHD) typically relies on the assumption that a single, fixed algorithm can effectively …

Guidong Lu, Yiping Liu, Xiangxiang Zeng

25 views Mar 17

Academic · 1 min

Widespread Gender and Pronoun Bias in Moral Judgments Across LLMs

arXiv:2603.13636v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to assess moral or ethical statements, yet their judgments may reflect social and …

Gustavo Lúcius Fernandes, Jeiverson C. V. M. Santos, Pedro O. S. Vaz-de-Melo

38 views Mar 17

Academic · 1 min

A Systematic Evaluation Protocol of Graph-Derived Signals for Tabular Machine Learning

arXiv:2603.13998v1 Announce Type: new Abstract: While graph-derived signals are widely used in tabular learning, existing studies typically rely on limited experimental setups and average performance …

Mario Heidrich, Jeffrey Heidemann, Rüdiger Buchkremer, Gonzalo Wandosell Fernández de Bobadilla

53 views Mar 17

Academic · 1 min

DeceptGuard: A Constitutional Oversight Framework For Detecting Deception in LLM Agents

arXiv:2603.13791v1 Announce Type: new Abstract: Reliable detection of deceptive behavior in Large Language Model (LLM) agents is an essential prerequisite for safe deployment in high-stakes …

Snehasis Mukhopadhyay

136 views Mar 17
