UniDial-EvalKit: A Unified Toolkit for Evaluating Multi-Faceted Conversational Abilities
arXiv:2603.23160v1 (Announce Type: new)
Abstract: Benchmarking AI systems in multi-turn interactive scenarios is essential for understanding their practical capabilities in real-world applications. However, existing evaluation …
Qi Jia, Haodong Zhao, Dun Pei, Xiujie Song, Shibo Wang, Zijian Chen, Zicheng Zhang, Xiangyang Zhu, Guangtao Zhai