
Domain-informed explainable boosting machines for trustworthy lateral spread predictions

arXiv:2603.17175v1. Abstract: Explainable Boosting Machines (EBMs) provide transparent predictions through additive shape functions, enabling direct inspection of feature contributions. However, EBMs can learn non-physical relationships that reduce their reliability in natural hazard applications. This study presents a domain-informed framework to improve the physical consistency of EBMs for lateral spreading prediction. Our approach modifies learned shape functions based on domain knowledge. These modifications correct non-physical behavior while maintaining data-driven patterns. We apply the method to the 2011 Christchurch earthquake dataset and correct non-physical trends observed in the original EBM. The resulting model produces more physically consistent global and local explanations, with an acceptable tradeoff in accuracy (4-5%).

Cheng-Hsi Hsiao, Krishna Kumar, Ellen M. Rathje

Executive Summary

The paper presents a domain-informed framework that improves the physical consistency of Explainable Boosting Machines (EBMs) used to predict lateral spreading, a natural hazard application. The method modifies learned shape functions using domain knowledge, correcting non-physical behavior while preserving data-driven patterns. Applied to the 2011 Christchurch earthquake dataset, it yields more physically consistent global and local explanations at a reported accuracy cost of 4-5%. The study contributes to the development of trustworthy predictive models in high-stakes domains.

Key Points

  • The article proposes a domain-informed framework to enhance the physical consistency of EBMs in natural hazard applications.
  • The approach modifies learned shape functions based on domain knowledge to correct non-physical behavior.
  • The method is demonstrated on the 2011 Christchurch earthquake dataset, where it corrects non-physical trends in the original EBM at a reported accuracy cost of 4-5%.
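
The points above do not spell out how a shape function is modified. As a minimal sketch of one possible correction, assuming a shape function is stored as per-bin scores and that domain knowledge says it should not increase, the Python below projects the scores onto a non-increasing sequence with the pool-adjacent-violators algorithm (weighted isotonic regression). This is a generic, swapped-in technique and not necessarily the authors' procedure; the feature, scores, and bin counts are hypothetical.

```python
def pool_adjacent_violators(values, weights):
    """Weighted least-squares fit of a non-decreasing sequence to `values`."""
    blocks = []  # each block is [weighted mean, total weight, number of bins]
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Merge backwards while the last two blocks break the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, weight_b, count_b = blocks.pop()
            mean_a, weight_a, count_a = blocks.pop()
            total = weight_a + weight_b
            pooled = (mean_a * weight_a + mean_b * weight_b) / total
            blocks.append([pooled, total, count_a + count_b])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def enforce_non_increasing(scores, weights):
    """Closest non-increasing shape function, via PAV on the negated scores."""
    negated = [-s for s in scores]
    return [-v for v in pool_adjacent_violators(negated, weights)]


# Hypothetical per-bin scores (logit units) for a made-up feature such as
# distance to a river; the third bin rises above the second, which we ASSUME
# is non-physical for the purpose of this example.
scores = [1.0, 0.4, 0.6, -0.2]
bin_counts = [40, 30, 10, 20]  # hypothetical number of samples per bin

corrected = enforce_non_increasing(scores, bin_counts)
# The second and third bins are pooled to their weighted mean (about 0.45);
# the first and last bins keep their original scores.
print([round(s, 3) for s in corrected])
```

Bins that already respect the assumed trend are left unchanged, so only the violating region is altered and the rest of the learned pattern is preserved.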

Merits

Strength in Developing Trustworthy Predictive Models

The proposed framework addresses a critical limitation of EBMs in natural hazard applications, enabling the development of more trustworthy predictive models.

Improvement in Physical Consistency

The approach effectively corrects non-physical behavior in EBMs while maintaining data-driven patterns, leading to more physically consistent global and local explanations.
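
To make the distinction concrete: in an EBM, the global explanation for a feature is its shape function, and the local explanation for one prediction is the set of per-feature scores looked up from those functions. The Python below is a minimal sketch, assuming binned shape functions in logit units. The feature names, bin edges, scores, intercept, and site values are all hypothetical and are not taken from the Christchurch dataset.

```python
from bisect import bisect_right
from math import exp

# Hypothetical shape functions: feature -> (interior bin edges, per-bin scores).
# ORIGINAL has a bump in the third distance bin; CORRECTED flattens it.
ORIGINAL = {
    "distance_to_river_m": ([100.0, 300.0, 600.0], [1.0, 0.4, 0.6, -0.2]),
    "pga_g": ([0.35, 0.45], [-0.5, 0.1, 0.7]),
}
CORRECTED = {
    "distance_to_river_m": ([100.0, 300.0, 600.0], [1.0, 0.45, 0.45, -0.2]),
    "pga_g": ([0.35, 0.45], [-0.5, 0.1, 0.7]),
}
INTERCEPT = -0.3  # hypothetical baseline score in logit units


def local_explanation(terms, intercept, site):
    """Return per-feature contributions and the predicted probability."""
    contributions = {}
    for name, (edges, scores) in terms.items():
        # bisect_right gives the index of the bin that contains the value.
        contributions[name] = scores[bisect_right(edges, site[name])]
    logit = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + exp(-logit))
    return contributions, probability


site = {"distance_to_river_m": 450.0, "pga_g": 0.40}  # one hypothetical site
before, p_before = local_explanation(ORIGINAL, INTERCEPT, site)
after, p_after = local_explanation(CORRECTED, INTERCEPT, site)

print("before:", before, round(p_before, 3))
print("after: ", after, round(p_after, 3))
```

Because the model is additive, changing one shape function shifts only that feature's contribution; the other terms in the local explanation are untouched.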

Demerits

Potential Over-Reliance on Domain Knowledge

The approach depends on reliable domain knowledge, which may limit its applicability where expertise is scarce or the governing physics is uncertain.

Tradeoff between Accuracy and Physical Consistency

Enforcing physical consistency costs accuracy: the abstract reports a 4-5% reduction, which may not be acceptable in all applications.
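
An accuracy cost can be read as percentage points or as a relative change, and the text here does not say which. The following Python sketch, on toy labels invented for illustration (not results from the paper), computes both, alongside a simple count of trend violations in a binned shape function, assuming the expected trend is non-increasing.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def count_increases(scores):
    """Number of adjacent bin pairs where the score goes up."""
    return sum(b > a for a, b in zip(scores, scores[1:]))


# Toy held-out labels and predictions (1 = lateral spreading, 0 = none).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
pred_original = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
pred_modified = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

acc_original = accuracy(y_true, pred_original)
acc_modified = accuracy(y_true, pred_modified)
drop_points = 100.0 * (acc_original - acc_modified)
drop_relative = 100.0 * (acc_original - acc_modified) / acc_original

# Hypothetical shape functions before and after a correction.
violations_before = count_increases([1.0, 0.4, 0.6, -0.2])
violations_after = count_increases([1.0, 0.45, 0.45, -0.2])

print(f"accuracy drop: {drop_points:.1f} points ({drop_relative:.1f}% relative)")
print(f"trend violations: {violations_before} -> {violations_after}")
```

Reporting both quantities side by side makes it easier to judge whether a given loss in accuracy is worth the gain in physical consistency for a particular application.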

Expert Commentary

The proposed framework is a significant contribution to the development of trustworthy predictive models in natural hazard applications. By modifying learned shape functions based on domain knowledge, it corrects non-physical behavior in EBMs while maintaining data-driven patterns. Its dependence on domain knowledge and the reported 4-5% accuracy cost remain limitations to be addressed. The implications for natural hazard risk assessment and mitigation are substantial, and policy-makers should prioritize the development of more interpretable and explainable machine learning models.

Recommendations

  • Future research should investigate the applicability of the proposed framework to other machine learning models and domains.
  • Developers should prioritize the incorporation of domain knowledge and interpretability techniques into machine learning models to enhance their trustworthiness in high-stakes applications.
