
SymCircuit: Bayesian Structure Inference for Tractable Probabilistic Circuits via Entropy-Regularized Reinforcement Learning


Y. Sungtaek Ju

arXiv:2603.20392v1 Announce Type: new Abstract: Probabilistic circuit (PC) structure learning is hampered by greedy algorithms that make irreversible, locally optimal decisions. We propose SymCircuit, which replaces greedy search with a learned generative policy trained via entropy-regularized reinforcement learning. Instantiating the RL-as-inference framework in the PC domain, we show the optimal policy is a tempered Bayesian posterior, recovering the exact posterior when the regularization temperature is set inversely proportional to the dataset size. The policy is implemented as SymFormer, a grammar-constrained autoregressive Transformer with tree-relative self-attention that guarantees valid circuits at every generation step. We introduce option-level REINFORCE, restricting gradient updates to structural decisions rather than all tokens, yielding an SNR (signal to noise ratio) improvement and >10 times sample efficiency gain on the NLTCS dataset. A three-layer uncertainty decomposition (structural via model averaging, parametric via the delta method, leaf via conjugate Dirichlet-Categorical propagation) is grounded in the multilinear polynomial structure of PC outputs. On NLTCS, SymCircuit closes 93% of the gap to LearnSPN; preliminary results on Plants (69 variables) suggest scalability.
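The option-level REINFORCE idea in the abstract can be sketched as a masked policy-gradient surrogate loss, where only structural-decision positions contribute to the gradient. This is a minimal illustration under assumed token roles and baseline; the actual SymCircuit objective may differ:

```python
def option_level_reinforce_loss(logps, is_structural, reward, baseline):
    """Masked REINFORCE surrogate loss: -(R - b) * sum of log-probs at
    structural positions only. Parameter/leaf tokens are excluded from
    the sum, which shrinks gradient variance. Token roles and the
    scalar baseline here are illustrative assumptions."""
    adv = reward - baseline
    structural_logp = sum(lp for lp, m in zip(logps, is_structural) if m)
    return -adv * structural_logp

# Toy trajectory: positions 0 and 2 are structural decisions.
loss = option_level_reinforce_loss(
    logps=[-0.1, -2.3, -0.5, -1.2],
    is_structural=[True, False, True, False],
    reward=1.5,
    baseline=1.0,
)
# loss == -(1.5 - 1.0) * (-0.1 - 0.5) == 0.3
```

Differentiating this loss with respect to the policy parameters recovers the REINFORCE gradient restricted to the structural decisions, which is the mechanism the abstract credits for the SNR improvement.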

Executive Summary

This article proposes SymCircuit, a Bayesian structure inference method for probabilistic circuits (PCs) that replaces greedy search with a generative policy trained via entropy-regularized reinforcement learning (RL). By instantiating the RL-as-inference framework in the PC domain, SymCircuit recovers the exact Bayesian posterior over structures when the regularization temperature is set inversely proportional to the dataset size. The policy is implemented as SymFormer, a grammar-constrained autoregressive Transformer with tree-relative self-attention that guarantees valid circuits at every generation step. Restricting gradient updates to structural decisions (option-level REINFORCE) improves the signal-to-noise ratio (SNR) and yields a more than tenfold sample-efficiency gain on NLTCS, where SymCircuit closes 93% of the gap to LearnSPN; preliminary results on Plants (69 variables) suggest the approach scales. The article also introduces a three-layer uncertainty decomposition grounded in the multilinear polynomial structure of PC outputs.
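The tempered-posterior claim rests on a standard RL-as-inference identity, sketched below; the specific reward definition and the KL-to-prior form are our reading of the abstract, not details confirmed by the paper:

```latex
\pi^{*} \;=\; \arg\max_{\pi}\; \mathbb{E}_{s\sim\pi}\!\left[R(s)\right]
\;-\; \tau\,\mathrm{KL}\!\left(\pi \,\Vert\, p\right)
\quad\Longrightarrow\quad
\pi^{*}(s) \;\propto\; p(s)\,\exp\!\left(\frac{R(s)}{\tau}\right)
```

If the reward is the average log-likelihood, R(s) = (1/N) log p(D | s), this becomes π*(s) ∝ p(s) · p(D | s)^{1/(Nτ)}: a tempered posterior over structures that coincides with the exact Bayesian posterior at τ = 1/N, consistent with the abstract's "temperature set inversely proportional to the dataset size."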

Key Points

  • SymCircuit leverages entropy-regularized RL to learn a generative policy for PC structure inference.
  • SymFormer is a grammar-constrained autoregressive Transformer with tree-relative self-attention that guarantees valid circuits at every generation step.
  • SymCircuit's option-level REINFORCE improves SNR and yields a >10× sample-efficiency gain on NLTCS, where it closes 93% of the gap to LearnSPN.
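The grammar-constrained decoding idea, masking out tokens that would produce an invalid circuit so every prefix remains well-formed, can be sketched with a toy vocabulary. The token names, arities, internal-node budget, and uniform stand-in policy below are illustrative assumptions, not SymFormer's actual vocabulary or Transformer:

```python
import random

# Toy PC-builder vocabulary: internal nodes (SUM, PROD) open two child
# slots; LEAF closes one; END is legal only once no slots remain open.
ARITY = {"SUM": 2, "PROD": 2, "LEAF": 0}

def valid_next(slots):
    """Grammar mask: the set of tokens legal given `slots` open positions."""
    return ["END"] if slots == 0 else sorted(ARITY)

def sample_circuit(rng, max_internal=5):
    """Autoregressive decoding with hard grammar masking, so every
    generated sequence is a complete, valid circuit."""
    slots, tokens, internal = 1, [], 0   # root expects one node
    while True:
        choices = valid_next(slots)
        # Extra constraint to guarantee termination: once the internal-node
        # budget is spent, only LEAF may fill the remaining slots.
        if slots > 0 and internal >= max_internal:
            choices = ["LEAF"]
        tok = rng.choice(choices)        # stand-in for the learned policy
        tokens.append(tok)
        if tok == "END":
            return tokens
        slots += ARITY[tok] - 1          # consume one slot, open `arity` new ones
        if ARITY[tok] > 0:
            internal += 1

circuit = sample_circuit(random.Random(0))
```

Because invalid tokens receive zero probability at every step, no rejection or repair pass is needed, which is the property the abstract highlights.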

Merits

Strength in Scalability

Preliminary results on Plants (69 variables) suggest SymCircuit scales beyond the small NLTCS benchmark, on which it already closes 93% of the gap to LearnSPN.

Robust Uncertainty Decomposition

The three-layer uncertainty decomposition provides a robust framework for uncertainty estimation in PCs, grounded in the multilinear polynomial structure of PC outputs.
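The leaf-level layer of the decomposition relies on Dirichlet-Categorical conjugacy, which gives closed-form posterior moments at each leaf. A minimal sketch of that update (the prior and counts are illustrative; the paper's propagation of these moments through the circuit is not shown):

```python
def dirichlet_posterior_moments(alpha, counts):
    """Conjugate Dirichlet-Categorical update: with prior Dirichlet(alpha)
    and observed category counts, the posterior is Dirichlet(alpha + counts).
    Returns the closed-form posterior mean and variance per category."""
    post = [a + c for a, c in zip(alpha, counts)]
    total = sum(post)
    mean = [a / total for a in post]
    var = [a * (total - a) / (total**2 * (total + 1)) for a in post]
    return mean, var

# Uniform Dirichlet(1, 1) prior and 7-vs-3 observed counts (illustrative):
mean, var = dirichlet_posterior_moments([1.0, 1.0], [7, 3])
# posterior is Dirichlet(8, 4); mean[0] == 8/12
```

The variances computed here are what a leaf contributes to the overall uncertainty estimate; structural and parametric uncertainty are handled by the other two layers (model averaging and the delta method, respectively).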

Demerits

Complexity of SymFormer

The implementation of SymFormer as a grammar-constrained autoregressive Transformer with tree-relative self-attention may introduce complexity and computational overhead.

Limited Exploration of RL Techniques

The article primarily focuses on entropy-regularized RL, and it is unclear whether other RL techniques may be equally effective or more efficient for PC structure inference.

Expert Commentary

The article proposes a novel approach to PC structure inference using entropy-regularized RL that sidesteps the irreversible, locally optimal decisions of greedy algorithms. SymFormer's grammar-constrained decoding is a notable contribution: guaranteeing a valid circuit at every generation step removes any need to reject or repair malformed samples. The added architectural complexity and the narrow comparison against alternative RL techniques are potential drawbacks, and the headline numbers come from NLTCS, a small benchmark, while the Plants results remain preliminary. Even so, the reported SNR improvement and >10× sample-efficiency gain suggest that option-level credit assignment is a meaningful advance for PC structure learning.

Recommendations

  • Recommendation 1: Future research should explore the application of SymCircuit and SymFormer to various PC-based modeling tasks, such as computer vision and natural language processing.
  • Recommendation 2: The development of SymCircuit and SymFormer may benefit from comparison against other RL techniques, such as actor-critic algorithms or trust-region methods (e.g., PPO), to further improve the efficiency and scalability of PC structure inference.

Sources

Original: arXiv - cs.LG