
On the Reliability of AI Methods in Drug Discovery: Evaluation of Boltz-2 for Structure and Binding Affinity Prediction

Shunzhou Wan, Xibei Zhang, Xiao Xue, Peter V. Coveney · March 9, 2026

#physics.chem-ph #cs.AI

arXiv:2603.05532v1. Abstract: Despite continuing hype about the role of AI in drug discovery, no "AI-discovered drugs" have so far received regulatory approval. Here we assess one of the latest AI-based tools in this domain. The ability to rapidly predict protein-ligand structures and binding affinities is pivotal for accelerating drug discovery. Boltz-2, a recently developed biomolecular foundation model, aims to bridge the gap between AI efficiency and physics-based precision through a joint "co-folding" approach. In this study, we provide an extensive evaluation of Boltz-2 using two large-scale datasets: 16,780 compounds for 3CLPro and 21,702 compounds for TNKS2. We compare Boltz-2 predicted structures with traditional docking and binding affinities with binding free energies derived from the physics-based ESMACS protocol. Structural analysis reveals significant global RMSD variations, indicating that Boltz-2 predicts multiple protein conformations and ligand binding positions rather than a single converged pose. Energetic evaluations exhibit only weak to moderate correlations across the global datasets. Furthermore, a focused analysis of the top 100 compounds yields no significant correlation between the Boltz-2 predictions and the binding free energies from fine-grained ESMACS, alongside observed saturation differences in ligand structures. Our results show that while Boltz-2 offers substantial speed for initial screening, it lacks the energetic resolution required for lead identification. These findings highlight the necessity of employing physics-based methods for the reliability and refinement of AI-derived models.

Executive Summary

The article evaluates the reliability of Boltz-2, a biomolecular foundation model, in predicting protein-ligand structures and binding affinities for drug discovery. Using 16,780 compounds for 3CLPro and 21,702 compounds for TNKS2, the study compares Boltz-2 structures with traditional docking and Boltz-2 affinities with binding free energies from the physics-based ESMACS protocol. It reports significant global RMSD variations and only weak to moderate correlations between predicted affinities and ESMACS free energies, with no significant correlation among the top 100 compounds. The authors conclude that Boltz-2 offers speed for initial screening but lacks the energetic resolution required for lead identification, so physics-based methods remain necessary to refine AI-derived models.
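
To make the RMSD comparison concrete, the following is a minimal sketch of how a root-mean-square deviation between two poses of the same ligand could be computed. It is not the authors' procedure: the summary does not say which atoms were compared or how structures were aligned, so the sketch assumes the two coordinate sets list the same atoms in the same order and have already been superimposed. The function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate sets.

    Assumes both sets describe the same atoms in the same order and are
    already superimposed; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical ligand atom coordinates in angstroms, not data from the paper.
docked_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

print(f"RMSD = {rmsd(docked_pose, predicted_pose):.2f} angstroms")
```

A wide spread of such values across a compound library is what would suggest that the predicted poses do not converge on a single binding mode.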

Key Points

▸ Boltz-2 predicts multiple protein conformations and ligand binding positions rather than a single converged pose
▸ Energetic evaluations exhibit only weak to moderate correlations across the global datasets
▸ The model lacks the energetic resolution required for lead identification
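
The correlation findings above can be illustrated with a short sketch that compares a rank correlation over a whole dataset with the same correlation restricted to a top-ranked subset. This is an illustration under stated assumptions, not the paper's analysis: the summary does not say which correlation measure was used or how the top 100 compounds were selected. Here Spearman correlation is used, the subset is chosen by predicted score, and lower values are assumed to be more favourable for both quantities. All function names and numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def top_n_spearman(predicted, reference, n):
    """Spearman correlation over the n compounds with the lowest predicted score.

    Selecting by predicted score is an assumption made for this sketch.
    """
    best = sorted(range(len(predicted)), key=predicted.__getitem__)[:n]
    return spearman([predicted[i] for i in best], [reference[i] for i in best])


# Hypothetical scores for eight compounds, not data from the paper.
predicted = [-9.1, -8.7, -8.5, -7.9, -7.2, -6.8, -6.1, -5.5]
reference = [-40.2, -44.0, -38.5, -41.3, -30.1, -33.7, -25.0, -27.4]

print("all compounds:", round(spearman(predicted, reference), 3))
print("top 4 only:   ", round(top_n_spearman(predicted, reference, 4), 3))
```

Restricting the analysis to the best-scoring compounds narrows the range of values, so a correlation that looks acceptable globally can vanish in exactly the region that matters for lead identification.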

Merits

Speed and Efficiency

Boltz-2 offers substantial speed for initial screening, which makes it useful for rapidly evaluating large compound libraries.

Demerits

Limited Energetic Resolution

Boltz-2 does not reproduce physics-based binding free energies closely enough to rank candidate compounds, so its predictions still need refinement with physics-based methods. This limits its usefulness for lead identification.

Expert Commentary

The article's evaluation of Boltz-2 highlights the ongoing challenges in developing reliable AI models for drug discovery. While AI models can accelerate initial screening, they often lack the precision and accuracy required for lead identification. The study's findings emphasize the importance of combining AI models with physics-based methods to improve the accuracy and reliability of drug discovery. Furthermore, the article's results underscore the need for regulatory frameworks that address the unique challenges and opportunities presented by AI-driven drug discovery.

Recommendations

✓ Develop hybrid approaches that combine AI models with physics-based methods, for example AI for initial screening followed by physics-based free energy calculations for refinement
✓ Establish regulatory frameworks to address the challenges and opportunities presented by AI-driven drug discovery
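
One possible shape for the hybrid approach recommended above is a two-stage funnel: a fast score shortlists compounds and a slower, more accurate score re-ranks only the shortlist. The sketch below is a generic illustration, not a workflow described in the paper. The function `two_stage_screen`, its parameters, and the scores are hypothetical, and lower scores are assumed to be more favourable.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def two_stage_screen(
    compounds: Iterable[Hashable],
    fast_score: Callable[[Hashable], float],
    refine_score: Callable[[Hashable], float],
    keep: int,
) -> List[Tuple[Hashable, float]]:
    """Shortlist by a cheap score, then re-rank the shortlist by a costly one.

    Lower scores are treated as better in both stages (an assumption).
    """
    # Stage 1: keep the `keep` best compounds according to the fast score.
    shortlist = sorted(compounds, key=fast_score)[:keep]
    # Stage 2: evaluate the expensive score only on the shortlist.
    refined = [(compound, refine_score(compound)) for compound in shortlist]
    return sorted(refined, key=lambda pair: pair[1])


# Hypothetical scores standing in for an AI prediction and a physics-based
# free energy; these are not results from the paper.
fast = {"A": -9.0, "B": -8.5, "C": -7.0, "D": -6.0}
slow = {"A": -30.0, "B": -42.0, "C": -35.0, "D": -50.0}

ranked = two_stage_screen(fast, fast.get, slow.get, keep=3)
print(ranked)  # [('B', -42.0), ('C', -35.0), ('A', -30.0)]
```

In this toy data, compound D has the best refined score but never reaches the second stage because its fast score is poor. That is the risk a weak correlation between the two scores creates, and it argues for a generous shortlist size.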

Sources

arXiv - cs.AI

On the Reliability of AI Methods in Drug Discovery: Evaluation of Boltz-2 for Structure and Binding Affinity Prediction
