Academic

Academic · 1 min

IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse

arXiv:2603.12201v1 Announce Type: new Abstract: Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both …

Yushi Bai, Qian Dong, Ting Jiang, Xin Lv, Zhengxiao Du, Aohan Zeng, Jie Tang, Juanzi Li

11 views Mar 13

Academic · 1 min

CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks

arXiv:2603.12206v1 Announce Type: new Abstract: State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while maintaining …

Alexandre Le Mercier, Thomas Demeester, Chris Develder

15 views Mar 13

Academic · 1 min

Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration

arXiv:2603.12226v1 Announce Type: new Abstract: Despite interdisciplinary research leading to larger and longer-term impact, most work remains confined to single-domain academic silos. Recent AI-based approaches …

Priyanka Kargupta, Shuhaib Mehri, Dilek Hakkani-Tur, Jiawei Han

10 views Mar 13

Academic · 1 min

Comparison of Outlier Detection Algorithms on String Data

arXiv:2603.11049v1 Announce Type: new Abstract: Outlier detection is a well-researched and crucial problem in machine learning. However, there is little research on string data outlier …

Philip Maus

11 views Mar 13

Academic · 1 min

Structure-Aware Epistemic Uncertainty Quantification for Neural Operator PDE Surrogates

arXiv:2603.11052v1 Announce Type: new Abstract: Neural operators (NOs) provide fast, resolution-invariant surrogates for mapping input fields to PDE solution fields, but their predictions can exhibit …

Haoze Song, Zhihao Li, Mengyi Deng, Xin Li, Duyi Pan, Zhilu Lai, Wei Wang

11 views Mar 13

Academic · 1 min

Interventional Time Series Priors for Causal Foundation Models

arXiv:2603.11090v1 Announce Type: new Abstract: Prior-data fitted networks (PFNs) have emerged as powerful foundation models for tabular causal inference, yet their extension to time series …

Dennis Thumm, Ying Chen

24 views Mar 13

Academic · 1 min

Fingerprinting Concepts in Data Streams with Supervised and Unsupervised Meta-Information

arXiv:2603.11094v1 Announce Type: new Abstract: Streaming sources of data are becoming more common as the ability to collect data in real-time grows. A major concern …

Ben Halstead, Yun Sing Koh, Patricia Riddle, Mykola Pechenizkiy, Albert Bifet, Russel Pears

9 views Mar 13

Academic · 1 min

Graph Tokenization for Bridging Graphs and Transformers

arXiv:2603.11099v1 Announce Type: new Abstract: The success of large pretrained Transformers is closely tied to tokenizers, which convert raw input into discrete symbols. Extending these …

Zeyuan Guo, Enmao Diao, Cheng Yang, Chuan Shi

11 views Mar 13

Academic · 1 min

Task-Conditioned Routing Signatures in Sparse Mixture-of-Experts Transformers

arXiv:2603.11114v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (MoE) architectures enable efficient scaling of large language models through conditional computation, yet the routing mechanisms responsible for …

Mynampati Sri Ranganadha Avinash

10 views Mar 13

Academic · 1 min

Learning Tree-Based Models with Gradient Descent

arXiv:2603.11117v1 Announce Type: new Abstract: Tree-based models are widely recognized for their interpretability and have proven effective in various application domains, particularly in high-stakes domains. …

Sascha Marton

13 views Mar 13

Academic · 1 min

A Learning-Based Superposition Operator for Non-Renewal Arrival Processes in Queueing Networks

arXiv:2603.11118v1 Announce Type: new Abstract: The superposition of arrival processes is a fundamental yet analytically intractable operation in queueing networks when inputs are general non-renewal …

Eliran Sherzer

29 views Mar 13

Academic · 1 min

Group Resonance Network: Learnable Prototypes and Multi-Subject Resonance for EEG Emotion Recognition

arXiv:2603.11119v1 Announce Type: new Abstract: Electroencephalography (EEG)-based emotion recognition remains challenging in cross-subject settings due to severe inter-subject variability. Existing methods mainly learn subject-invariant features, but often under-exploit …

Renwei Meng

18 views Mar 13
