Academic

Academic · 1 min

PyVision-RL: Forging Open Agentic Vision Models via RL

arXiv:2602.20739v1 Announce Type: new Abstract: Reinforcement learning for agentic multimodal models often suffers from interaction collapse, where models learn to reduce tool usage and multi-turn …

Shitian Zhao, Shaoheng Lin, Ming Li, Haoquan Zhang, Wenshuo Peng, Kaipeng Zhang, Chen Wei
Academic · 1 min

Pipeline for Verifying LLM-Generated Mathematical Solutions

arXiv:2602.20770v1 Announce Type: new Abstract: With the growing popularity of Large Reasoning Models and their results in solving mathematical problems, it becomes crucial to measure …

Varvara Sazonova, Dmitri Shmelkin, Stanislav Kikot, Vasily Motolygin
Academic · 1 min

POMDPPlanners: Open-Source Package for POMDP Planning

arXiv:2602.20810v1 Announce Type: new Abstract: We present POMDPPlanners, an open-source Python package for empirical evaluation of Partially Observable Markov Decision Process (POMDP) planning algorithms. The …

Yaacov Pariente, Vadim Indelman
Academic · 1 min

Predicting Sentence Acceptability Judgments in Multimodal Contexts

arXiv:2602.20918v1 Announce Type: new Abstract: Previous work has examined the capacity of deep neural networks (DNNs), particularly transformers, to predict human sentence acceptability judgments, both …

Hyewon Jang, Nikolai Ilinykh, Sharid Loáiciga, Jey Han Lau, Shalom Lappin