Academic

Academic

Academic · 1 min

RAT-Bench: A Comprehensive Benchmark for Text Anonymization

arXiv:2602.12806v1 Announce Type: new Abstract: Data containing personal information is increasingly used to train, fine-tune, or query Large Language Models (LLMs). Text is typically scrubbed …

Nata\v{s}a Kr\v{c}o, Zexi Yao, Matthieu Meeus, Yves-Alexandre de Montjoye
20 views
Academic · 1 min

ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark

arXiv:2602.12911v1 Announce Type: new Abstract: Code-switching (CS), which is when Vietnamese speech uses English words like drug names or procedures, is a common phenomenon in …

Tung X. Nguyen, Nhu Vo, Giang-Son Nguyen, Duy Mai Hoang, Chien Dinh Huynh, Inigo Jauregi Unanue, Massimo Piccardi, Wray Buntine, Dung D. Le
3 views
Academic · 1 min

ProbeLLM: Automating Principled Diagnosis of LLM Failures

arXiv:2602.12966v1 Announce Type: new Abstract: Understanding how and why large language models (LLMs) fail is becoming a central challenge as models rapidly evolve and static …

Yue Huang, Zhengzhe Jiang, Yuchen Ma, Yu Jiang, Xiangqi Wang, Yujun Zhou, Yuexing Hao, Kehan Guo, Pin-Yu Chen, Stefan Feuerriegel, Xiangliang Zhang
27 views