Academic

Academic · 1 min

MiCA Learns More Knowledge Than LoRA and Full Fine-Tuning

arXiv:2604.01694v1 · Announce Type: new · Abstract: Minor Component Adaptation (MiCA) is a novel parameter-efficient fine-tuning method for large language models that focuses on adapting underutilized subspaces …

Sten Rüdiger, Sebastian Raschka

3 views Apr 3

Academic · 1 min

Coupled Query-Key Dynamics for Attention

arXiv:2604.01683v1 · Announce Type: new · Abstract: Standard scaled dot-product attention computes scores from static, independent projections of the input. We show that evolving queries and keys …

Barak Gahtan, Alex M. Bronstein

5 views Apr 3

Academic · 1 min

Cognitive Energy Modeling for Neuroadaptive Human-Machine Systems using EEG and WGAN-GP

arXiv:2604.01653v1 · Announce Type: new · Abstract: Electroencephalography (EEG) provides a non-invasive insight into the brain's cognitive and emotional dynamics. However, modeling how these states evolve in …

Sriram Sattiraju, Vaibhav Gollapalli, Aryan Shah, Timothy McMahan

2 views Apr 3

Academic · 1 min

Label Shift Estimation With Incremental Prior Update

arXiv:2604.01651v1 · Announce Type: new · Abstract: An assumption often made in supervised learning is that the training and testing sets have the same label distribution. However, …

Yunrui Zhang, Gustavo Batista, Salil S. Kanhere

1 view Apr 3

Academic · 1 min

CRIT: Graph-Based Automatic Data Synthesis to Enhance Cross-Modal Multi-Hop Reasoning

arXiv:2604.01634v1 · Announce Type: new · Abstract: Real-world reasoning often requires combining information across modalities, connecting textual context with visual cues in a multi-hop process. Yet, most …

Junyoung Sung, Seungwoo Lyu, Minjun Kim, Sumin An, Arsha Nagrani, Paul Hongsuck Seo

1 view Apr 3

Academic · 1 min

Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models

arXiv:2604.01622v1 · Announce Type: new · Abstract: Diffusion language models (DLMs) enable parallel, non-autoregressive text generation, yet existing DLM mixture-of-experts (MoE) models inherit token-choice (TC) routing from …

Shuibai Zhang, Caspian Zhuang, Chihan Cui, Zhihan Yang, Fred Zhangzhi Peng, Yanxin Zhang, Haoyue Bai, Zack Jia, Yang Zhou, Guanhua Chen, Ming Liu

1 view Apr 3

Academic · 1 min

Pseudo-Quantized Actor-Critic Algorithm for Robustness to Noisy Temporal Difference Error

arXiv:2604.01613v1 · Announce Type: new · Abstract: In reinforcement learning (RL), temporal difference (TD) errors are widely adopted for optimizing value and policy functions. However, since the …

Taisuke Kobayashi

7 views Apr 3

Academic · 1 min

Training In-Context and In-Weights Mixtures Via Contrastive Context Sampling

arXiv:2604.01601v1 · Announce Type: new · Abstract: We investigate training strategies that co-develop in-context learning (ICL) and in-weights learning (IWL), and the ability to switch between them …

Deeptanshu Malu, Deevyanshu Malu, Aditya Nemiwal, Sunita Sarawagi

1 view Apr 3

Academic · 1 min

Learning from the Right Rollouts: Data Attribution for PPO-based LLM Post-Training

arXiv:2604.01597v1 · Announce Type: new · Abstract: Traditional RL algorithms like Proximal Policy Optimization (PPO) typically train on the entire rollout buffer, operating under the assumption that …

Dong Shu, Denghui Zhang, Jessica Hullman

10 views Apr 3

Academic · 1 min

Optimizing EEG Graph Structure for Seizure Detection: An Information Bottleneck and Self-Supervised Learning Approach

arXiv:2604.01595v1 · Announce Type: new · Abstract: Seizure detection from EEG signals is highly challenging due to complex spatiotemporal dynamics and extreme inter-patient variability. To model them, …

Lincan Li, Rikuto Kotoge, Xihao Piao, Zheng Chen, Yushun Dong

1 view Apr 3

Academic · 1 min

Variational LSTM with Augmented Inputs: Nonlinear Response History Metamodeling with Aleatoric and Epistemic Uncertainty

arXiv:2604.01587v1 · Announce Type: new · Abstract: Uncertainty propagation in high-dimensional nonlinear dynamic structural systems is pivotal in state-of-the-art performance-based design and risk assessment, where uncertainties from …

Manisha Sapkota, Min Li, Bowei Li

2 views Apr 3

Academic · 1 min

Thinking While Listening: Fast-Slow Recurrence for Long-Horizon Sequential Modeling

arXiv:2604.01577v1 · Announce Type: new · Abstract: We extend the recent latent recurrent modeling to sequential input streams. By interleaving fast, recurrent latent updates with self-organizational ability …

Shota Takashiro, Masanori Koyama, Takeru Miyato, Yusuke Iwasawa, Yutaka Matsuo, Kohei Hayashi

1 view Apr 3
