J

Jingyu Peng, Hongyu Chen, Jiancheng Dong, Maolin Wang, Wenxi Li, Yuchen Li, Kai Zhang, Xiangyu Zhao

Articles by Jingyu Peng, Hongyu Chen, Jiancheng Dong, Maolin Wang, Wenxi Li, Yuchen Li, Kai Zhang, Xiangyu Zhao

Academic · 1 min

MOSAIC: Composable Safety Alignment with Modular Control Tokens

arXiv:2603.16210v1 Announce Type: new Abstract: Safety alignment in large language models (LLMs) is commonly implemented as a single static policy embedded in model parameters. However, …

Jingyu Peng, Hongyu Chen, Jiancheng Dong, Maolin Wang, Wenxi Li, Yuchen Li, Kai Zhang, Xiangyu Zhao
5 views