Finnur Ágúst Ingimundarson, Steinunn Rut Friðriksdóttir, Bjarki Ármannsson, Iris Edda Nowenstein, Steinþór Steingrímsson

Who Benchmarks the Benchmarks? A Case Study of LLM Evaluation in Icelandic

arXiv:2603.16406v1. Abstract: This paper evaluates current Large Language Model (LLM) benchmarking for Icelandic, identifies problems, and calls for improved evaluation methods in …

Finnur Ágúst Ingimundarson, Steinunn Rut Friðriksdóttir, Bjarki Ármannsson, Iris Edda Nowenstein, Steinþór Steingrímsson