AsgardBench - Evaluating Visually Grounded Interactive Planning Under Minimal Feedback
arXiv:2603.15888v1 Announce Type: new Abstract: With AsgardBench we aim to evaluate visually grounded, high-level action sequence generation and interactive planning, focusing specifically on plan adaptation …
Andrea Tupini, Lars Liden, Reuben Tan, Yu Wang, Jianfeng Gao
11 views