ItinBench: Benchmarking Planning Across Multiple Cognitive Dimensions with Large Language Models
arXiv:2603.19515v1. Abstract: Large language models (LLMs) with advanced cognitive capabilities are emerging as agents for various reasoning and planning tasks. Traditional evaluations …
Tianlong Wang, Pinqiao Wang, Weili Shi, Sheng Li