Tag: cs.CV




Academic · 1 min

When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't

arXiv:2604.06422v1 Announce Type: new Abstract: Understanding when Vision-Language Models (VLMs) will behave unexpectedly, whether models can reliably predict their own behavior, and if models adhere …

Jonathan Nemitz, Carsten Eickhoff, Junyi Jessy Li, Kyle Mahowald, Michal Golovanevsky, William Rudman

57 views Apr 9

Academic · 1 min

SubFLOT: Submodel Extraction for Efficient and Personalized Federated Learning via Optimal Transport

arXiv:2604.06631v1 Announce Type: new Abstract: Federated Learning (FL) enables collaborative model training while preserving data privacy, but its practical deployment is hampered by system and …

Zheng Jiang, Nan He, Yiming Chen, Lifeng Sun

73 views Apr 9

Academic · 1 min

Drifting Fields are not Conservative

arXiv:2604.06333v1 Announce Type: new Abstract: Drifting models generate high-quality samples in a single forward pass by transporting generated samples toward the data distribution using a …

Leonard Franz, Sebastian Hoffmann, Georg Martius

57 views Apr 9

Academic · 1 min

Bi-Level Optimization for Single Domain Generalization

arXiv:2604.06349v1 Announce Type: new Abstract: Generalizing from a single labeled source domain to unseen target domains, without access to any target data during training, remains …

Marzi Heidari, Hanping Zhang, Hao Yan, Yuhong Guo

47 views Apr 9

Academic · 1 min

Thinking Diffusion: Penalize and Guide Visual-Grounded Reasoning in Diffusion Multimodal Language Models

arXiv:2604.05497v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) are emerging as promising alternatives to autoregressive (AR) LLMs. Recently, this paradigm has been extended …

Keuntae Kim, Mingyu Kang, Yong Suk Choi

39 views Apr 8

Academic · 1 min

Part-Level 3D Gaussian Vehicle Generation with Joint and Hinge Axis Estimation

arXiv:2604.05070v1 Announce Type: new Abstract: Simulation is essential for autonomous driving, yet current frameworks often model vehicles as rigid assets and fail to capture part-level …

Shiyao Qian, Yuan Ren, Dongfeng Bai, Bingbing Liu

38 views Apr 8

Academic · 1 min

Learning What Matters: Dynamic Dimension Selection and Aggregation for Interpretable Vision-Language Reward Modeling

arXiv:2604.05445v1 Announce Type: new Abstract: Vision-language reward modeling faces a dilemma: generative approaches are interpretable but slow, while discriminative ones are efficient but act as …

Qiyuan Chen, Hongsen Huang, Jiahe Chen, Qian Shao, Jintai Chen, Hongxia Xu, Renjie Hua, Chuan Ren, Jian Wu

34 views Apr 8

Academic · 1 min

ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving

arXiv:2604.05378v1 Announce Type: new Abstract: Recent progress in vision-language-action (VLA) models has enabled language-conditioned driving agents to execute natural-language navigation commands in closed-loop simulation, yet …

Kaiser Hamid, Can Cui, Nade Liang

39 views Apr 8

Academic · 1 min

Training Without Orthogonalization, Inference With SVD: A Gradient Analysis of Rotation Representations

arXiv:2604.05414v1 Announce Type: new Abstract: Recent work has shown that removing orthogonalization during training and applying it only at inference improves rotation estimation in deep …

Chris Choy

48 views Apr 8

Academic · 1 min

Supervised Dimensionality Reduction Revisited: Why LDA on Frozen CNN Features Deserves a Second Look

arXiv:2604.03928v1 Announce Type: new Abstract: Effective ride-hailing dispatch requires anticipating demand patterns that vary substantially across time-of-day, day-of-week, season, and special events. We propose a …

Indar Kumar, Girish Karhana, Sai Krishna Jasti, Ankit Hemant Lade

40 views Apr 7

Academic · 1 min

LiME: Lightweight Mixture of Experts for Efficient Multimodal Multi-task Learning

arXiv:2604.02338v1 Announce Type: new Abstract: MoE-PEFT methods combine Mixture of Experts with parameter-efficient fine-tuning for multi-task adaptation, but require separate adapters per expert causing trainable …

Md Kowsher, Haris Mansoor, Nusrat Jahan Prottasha, Ozlem Garibay, Victor Zhu, Zhengping Ji, Chen Chen

34 views Apr 6

Academic · 1 min

From Broad Exploration to Stable Synthesis: Entropy-Guided Optimization for Autoregressive Image Generation

arXiv:2604.02355v1 Announce Type: new Abstract: Combining Chain-of-Thought (CoT) with Reinforcement Learning (RL) improves text-to-image (T2I) generation, yet the underlying interaction between CoT's exploration and RL's …

Han Song, Yucheng Zhou, Jianbing Shen, Yu Cheng

27 views Apr 6


