UniDial-EvalKit: A Unified Toolkit for Evaluating Multi-Faceted Conversational Abilities
arXiv:2603.23160v1 Announce Type: new Abstract: Benchmarking AI systems in multi-turn interactive scenarios is essential for understanding their practical capabilities in real-world applications. However, existing evaluation …