Academic

Academic · 1 min

MzansiText and MzansiLM: An Open Corpus and Decoder-Only Language Model for South African Languages

arXiv:2603.20732v1 Announce Type: new Abstract: Decoder-only language models can be adapted to diverse tasks through instruction finetuning, but the extent to which this generalizes at …

Anri Lombard, Simbarashe Mawere, Temi Aina, Ethan Wolff, Sbonelo Gumede, Elan Novick, Francois Meyer, Jan Buys

3 views Mar 24

Academic · 1 min

Code-MIE: A Code-style Model for Multimodal Information Extraction with Scene Graph and Entity Attribute Knowledge …

arXiv:2603.20781v1 Announce Type: new Abstract: With the rapid development of large language models (LLMs), more and more researchers have paid attention to information extraction based …

Jiang Liu, Ge Qiu, Hao Fei, Dongdong Xie, Jinbo Li, Fei Li, Chong Teng, Donghong Ji

3 views Mar 24

Academic · 1 min

The Anatomy of an Edit: Mechanism-Guided Activation Steering for Knowledge Editing

arXiv:2603.20795v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as knowledge bases, but keeping them up to date requires targeted knowledge editing …

Yuan Cao, Mingyang Wang, Hinrich Schütze

3 views Mar 24

Academic · 1 min

RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and …

arXiv:2603.20799v1 Announce Type: new Abstract: Reinforcement learning from verifiable rewards (RLVR) stimulates the thinking processes of large language models (LLMs), substantially enhancing their reasoning abilities …

Kaiyuan Li, Jing-Cheng Pang, Yang Yu

3 views Mar 24

Academic · 1 min

BenchBench: Benchmarking Automated Benchmark Generation

arXiv:2603.20807v1 Announce Type: new Abstract: Benchmarks are the de facto standard for tracking progress in large language models (LLMs), yet static test sets can rapidly …

Yandan Zheng, Haoran Luo, Zhenghong Lin, Wenjin Liu, Luu Anh Tuan

5 views Mar 24

Academic · 1 min

HiCI: Hierarchical Construction-Integration for Long-Context Attention

arXiv:2603.20843v1 Announce Type: new Abstract: Long-context language modeling is commonly framed as a scalability challenge of token-level attention, yet local-to-global information structuring remains largely implicit …

Xiangyu Zeng, Qi Xu, Yunke Wang, Chang Xu

2 views Mar 24

Academic · 1 min

Can ChatGPT Really Understand Modern Chinese Poetry?

arXiv:2603.20851v1 Announce Type: new Abstract: ChatGPT has demonstrated remarkable capabilities on both poetry generation and translation, yet its ability to truly understand poetry remains unexplored. …

Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao

2 views Mar 24

Academic · 1 min

SozKZ: Training Efficient Small Language Models for Kazakh from Scratch

arXiv:2603.20854v1 Announce Type: new Abstract: Kazakh, a Turkic language spoken by over 22 million people, remains underserved by existing multilingual language models, which allocate minimal …

Saken Tukenov

2 views Mar 24

Academic · 1 min

NoveltyAgent: Autonomous Novelty Reporting Agent with Point-wise Novelty Analysis and Self-Validation

arXiv:2603.20884v1 Announce Type: new Abstract: The exponential growth of academic publications has led to a surge in papers of varying quality, increasing the cost of …

Jiajun Hou, Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Xiaopeng Ke, Min Zhang

3 views Mar 24

Academic · 1 min

LLM Router: Prefill is All You Need

arXiv:2603.20895v1 Announce Type: new Abstract: LLMs often share comparable benchmark accuracies, but their complementary performance across task subsets suggests that an Oracle router--a theoretical selector …

Tanay Varshney, Annie Surla, Michelle Xu, Gomathy Venkata Krishnan, Maximilian Jeblick, David Austin, Neal Vaidya, Davide Onofrio

3 views Mar 24

Academic · 1 min

Mitigating Shortcut Reasoning in Language Models: A Gradient-Aware Training Approach

arXiv:2603.20899v1 Announce Type: new Abstract: Large language models exhibit strong reasoning capabilities, yet often rely on shortcuts such as surface pattern matching and answer memorization …

Hongyu Cao, Kunpeng Liu, Dongjie Wang, Yanjie Fu

3 views Mar 24

Academic · 1 min

The Hidden Puppet Master: A Theoretical and Real-World Account of Emotional Manipulation in LLMs

arXiv:2603.20907v1 Announce Type: new Abstract: As users increasingly turn to LLMs for practical and personal advice, they become vulnerable to being subtly steered toward hidden …

Jocelyn Shen, Amina Luvsanchultem, Jessica Kim, Kynnedy Smith, Valdemar Danry, Kantwon Rogers, Sharifa Alghowinem, Hae Won Park, Maarten Sap, Cynthia Breazeal

3 views Mar 24
