Mitigating Selection Bias in Large Language Models via Permutation-Aware GRPO
arXiv:2603.21016v1 Announce Type: new Abstract: Large language models (LLMs) used for multiple-choice and pairwise evaluation tasks often exhibit selection bias due to non-semantic factors like …
Jinquan Zheng, Jia Yuan, Jiacheng Yao, Chenyang Gu, Pujun Zheng, Guoxiu He
3 views