Engineering Verifiable Modularity in Transformers via Per-Layer Supervision
arXiv:2603.18029v1
Announce Type: new
Abstract: Transformers resist surgical control. Ablating an attention head identified as critical for capitalization produces minimal behavioral change because distributed redundancy …
J. Clayton Kerce