Functorial Neural Architectures from Higher Inductive Types
arXiv:2603.16123v1

Abstract: Neural networks systematically fail at compositional generalization -- producing correct outputs for novel combinations of known parts. We show that this failure is architectural: compositional generalization is equivalent to functoriality of the decoder, and this perspective yields both guarantees and impossibility results. We compile Higher Inductive Type (HIT) specifications into neural architectures via a monoidal functor from the path groupoid of a target space to a category of parametric maps: path constructors become generator networks, composition becomes structural concatenation, and 2-cells witnessing group relations become learned natural transformations. We prove that decoders assembled by structural concatenation of independently generated segments are strict monoidal functors (compositional by construction), while softmax self-attention is not functorial for any non-trivial compositional task. Both results are formalized in Cubical Agda. Experiments on three spaces validate the full hierarchy: on the torus ($\mathbb{Z}^2$), functorial decoders outperform non-functorial ones by 2-2.7x; on $S^1 \vee S^1$ ($F_2$), the type-A/B gap widens to 5.5-10x; on the Klein bottle ($\mathbb{Z} \rtimes \mathbb{Z}$), a learned 2-cell closes a 46% error gap on words exercising the group relation.
Executive Summary
This article addresses compositional generalization in neural networks by leveraging Higher Inductive Type (HIT) specifications and functoriality. The authors show that compositional generalization is equivalent to functoriality of the decoder, and compile HIT specifications into neural architectures via a monoidal functor from the path groupoid of a target space to a category of parametric maps. Experiments on three spaces (the torus, $S^1 \vee S^1$, and the Klein bottle) confirm the predicted hierarchy: functorial decoders outperform non-functorial ones by factors of roughly 2 to 10, and a learned 2-cell closes a 46% error gap on the Klein bottle. The guarantees and the impossibility result for softmax self-attention are formalized in Cubical Agda. A minimal sketch of the compilation scheme follows.
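To make the compilation scheme concrete, here is a minimal PyTorch sketch under our own assumptions; all names (GeneratorNet, FunctorialDecoder, SEG_LEN) are hypothetical and the architectural details are illustrative, not the authors' implementation.

```python
# Minimal sketch (hypothetical names, not the authors' code): a functorial
# decoder for a space with loop generators, in the spirit of the paper's
# compilation scheme. Each path constructor gets an independent GeneratorNet;
# a word is decoded by concatenating independently generated segments with
# accumulated offsets, so
#   decode(w1 + w2) == concat(decode(w1), shifted decode(w2))
# holds by construction rather than by training.
import torch
import torch.nn as nn

SEG_LEN = 16  # samples per path segment (our assumption)
DIM = 2       # ambient coordinates, e.g. the two angles of the torus

class GeneratorNet(nn.Module):
    """One small network per path constructor (loop generator) of the HIT."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, SEG_LEN * DIM)
        )

    def forward(self) -> torch.Tensor:
        # Generated from a constant input: the segment is context-independent.
        return self.net(torch.ones(1)).view(SEG_LEN, DIM)

class FunctorialDecoder(nn.Module):
    """Decodes a word over generators by structural concatenation of segments.

    Composition of paths is plain concatenation with accumulated offsets, so
    the map word -> curve is a strict monoidal functor on words.
    """
    def __init__(self, generators: dict):
        super().__init__()
        self.generators = nn.ModuleDict(generators)

    def forward(self, word: str) -> torch.Tensor:
        pos = torch.zeros(DIM)
        points = [pos]
        for g in word:
            seg = self.generators[g]()                 # independent generation
            points.extend(pos + torch.cumsum(seg, 0))  # segment shifted to pos
            pos = points[-1]
        return torch.stack(points)  # (1 + len(word) * SEG_LEN, DIM) curve

# Usage: two generators 'a', 'b' (e.g. the torus Z^2). A word never seen in
# training still decodes correctly if each generator was learned correctly.
decoder = FunctorialDecoder({"a": GeneratorNet(), "b": GeneratorNet()})
curve = decoder("abab")
```

The point of the design is that compositionality comes from the wiring, not the weights: training only ever has to get the individual generators right, and every novel word follows for free.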
Key Points
- ▸ Compositional generalization is equivalent to functoriality of the decoder (the condition is spelled out after this list).
- ▸ HIT specifications can be compiled into neural architectures via a monoidal functor.
- ▸ Functorial decoders outperform non-functorial ones by factors of roughly 2 to 10 in experiments on three spaces.
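To make the first key point precise, here is one standard way to state the condition; the symbols $D$ and $\diamond$ are our notation, not necessarily the paper's. For a decoder $D$ from words over the path-groupoid generators to parametric maps, strict monoidal functoriality requires

$$D(\varepsilon) = \mathrm{id}, \qquad D(w_1 \cdot w_2) = D(w_1) \diamond D(w_2),$$

where $\cdot$ is composition of paths (concatenation of words) and $\diamond$ is the monoidal product in the category of parametric maps (structural concatenation of segments). Such a $D$ is determined on every word by its values on the generators, so correctness on the generators forces correctness on all novel combinations: compositional generalization by construction.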
Merits
Strength
The article provides a novel and theoretically sound treatment of compositional generalization, pairing formal guarantees (machine-checked in Cubical Agda) with 2-10x error reductions over non-functorial baselines in the reported experiments.
Demerits
Limitation
The article assumes a high level of mathematical sophistication, which may limit its accessibility to researchers without a strong background in category theory and type theory.
Expert Commentary
The article makes a substantive contribution by recasting compositional generalization as a categorical property of the decoder rather than a matter of training or data. Compiling HIT specifications through a monoidal functor yields constructive guarantees (decoders that are compositional by construction) alongside an impossibility result for softmax self-attention (illustrated numerically below), and the experiments bear out the predicted hierarchy on all three spaces. The main barrier to adoption is the mathematical background required, as noted under Limitation, but the results should interest researchers working on systematic generalization in both theory and practice.
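As a quick grounding of the impossibility claim, the following sketch (ours, not the paper's formal proof; a single attention head with identity projections) checks numerically that self-attention over a concatenated sequence differs from the concatenation of per-segment outputs.

```python
# Minimal numeric illustration: softmax self-attention is not functorial,
# because attending over a concatenated word mixes information across the
# seam, so in general attn(concat(x, y)) != concat(attn(x), attn(y)).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 4
x, y = torch.randn(3, d), torch.randn(2, d)  # embeddings of two "words"

def self_attention(t: torch.Tensor) -> torch.Tensor:
    # Single-head attention with identity Q/K/V projections, for simplicity.
    scores = t @ t.T / d ** 0.5
    return F.softmax(scores, dim=-1) @ t

joint = self_attention(torch.cat([x, y]))                  # decode composite
split = torch.cat([self_attention(x), self_attention(y)])  # compose the parts

print(torch.allclose(joint, split))  # False: the two sides disagree
print((joint - split).abs().max())   # nonzero seam-crossing discrepancy
```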
Recommendations
- ✓ Explore the applicability of the approach beyond the three target spaces studied, e.g. to richer groups and higher-dimensional HITs.
- ✓ Replicate the reported results and extend them to other domains and tasks.