AEGIS: An Operational Infrastructure for Post-Market Governance of Adaptive Medical AI Under US and EU Regulations
arXiv:2603.22322v1 Abstract: Machine learning systems deployed in medical devices require governance frameworks that ensure safety while enabling continuous improvement. Regulatory bodies including the US FDA and the European Union have introduced mechanisms such as the Predetermined Change Control Plan (PCCP) and Post-Market Surveillance (PMS) to manage iterative model updates without repeated submissions. This paper presents the AI/ML Evaluation and Governance Infrastructure for Safety (AEGIS), a governance framework applicable to any healthcare AI system. AEGIS comprises three modules (dataset assimilation and retraining, model monitoring, and conditional decision) that operationalize FDA PCCP and EU AI Act Article 43(4) provisions. We implement a four-category deployment decision taxonomy (APPROVE, CONDITIONAL APPROVAL, CLINICAL REVIEW, REJECT) with an independent PMS ALARM signal, enabling detection of the critical state in which no deployable model exists while the released model is simultaneously at risk. To illustrate how AEGIS can be instantiated across heterogeneous clinical contexts, we provide two examples: sepsis prediction from electronic health records and brain tumor segmentation from medical imaging. Both cases use an identical governance architecture, differing only in configuration. Across 11 simulated iterations on the sepsis example, AEGIS yielded 8 APPROVE, 1 CONDITIONAL APPROVAL, 1 CLINICAL REVIEW, and 1 REJECT decision, exercising all four categories. ALARM signals were co-issued at iterations 8 and 10, including the critical state in which no deployable model exists while the released model is simultaneously failing. AEGIS also detected drift before observable performance degradation. These results demonstrate that AEGIS translates regulatory change-control concepts into executable governance procedures, supporting safe continuous learning for adaptive medical AI across diverse clinical applications.
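The four-category decision taxonomy with its independent PMS ALARM signal can be sketched as follows. This is a minimal illustration, not the paper's implementation: the metric (AUC), threshold names, and threshold values are assumptions introduced here for concreteness.

```python
from enum import Enum

class Decision(Enum):
    APPROVE = "APPROVE"
    CONDITIONAL_APPROVAL = "CONDITIONAL APPROVAL"
    CLINICAL_REVIEW = "CLINICAL REVIEW"
    REJECT = "REJECT"

def decide(candidate_auc, released_auc, approve_floor=0.85, review_floor=0.80):
    """Map a candidate model's metric to one of the four categories.
    Thresholds are illustrative, not taken from the paper."""
    if candidate_auc >= approve_floor:
        return Decision.APPROVE
    if candidate_auc >= review_floor:
        # Near-miss candidates: deployable with conditions only if they
        # do not regress relative to the released model.
        if candidate_auc >= released_auc:
            return Decision.CONDITIONAL_APPROVAL
        return Decision.CLINICAL_REVIEW
    return Decision.REJECT

def pms_alarm(released_auc, alarm_floor=0.80):
    """Independent post-market surveillance signal on the *released* model."""
    return released_auc < alarm_floor

def critical_state(decision, alarm):
    """The state the abstract highlights: no deployable candidate exists
    while the released model is simultaneously failing."""
    return decision in (Decision.CLINICAL_REVIEW, Decision.REJECT) and alarm
```

Keeping the ALARM computation separate from the deployment decision is what makes the critical state detectable at all: a combined score could report "no deployable model" without noticing that the model currently in production is also failing.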
Executive Summary
This study presents AEGIS, a governance framework for adaptive medical AI systems under US and EU regulations. AEGIS operationalizes the Predetermined Change Control Plan (PCCP) and Post-Market Surveillance (PMS) mechanisms, enabling safe continuous learning. The framework comprises three modules: dataset assimilation and retraining, model monitoring, and conditional decision. AEGIS is demonstrated in two heterogeneous clinical contexts: sepsis prediction and brain tumor segmentation. The results show that AEGIS can translate regulatory change-control concepts into executable procedures that support safe AI deployment. However, the framework still requires replication and validation in real-world settings to establish its scalability and applicability.
Key Points
- ▸ AEGIS is a governance framework for adaptive medical AI systems under US and EU regulations.
- ▸ AEGIS operationalizes Predetermined Change Control Plan (PCCP) and Post-Market Surveillance (PMS) mechanisms.
- ▸ AEGIS comprises three modules: dataset assimilation and retraining, model monitoring, and conditional decision.
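The abstract's claim that drift was detected before observable performance degradation implies the monitoring module tracks input-distribution shift, not just outcome metrics. The paper does not name a specific statistic, so the Population Stability Index below is an illustrative assumption for how such a check could be sketched:

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference (training-time)
    feature distribution and a live post-deployment one.
    PSI > 0.2 is a commonly used drift-alert level (illustrative here)."""
    # Bin edges from quantiles of the reference data.
    edges = np.quantile(expected, np.linspace(0.0, 1.0, bins + 1))
    # Clip live data into the reference range so every value is binned.
    actual = np.clip(actual, edges[0], edges[-1])
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    # Avoid log(0) for empty bins.
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))
```

A monitoring module in this spirit would compute such a statistic per feature on each batch of post-deployment data and raise an ALARM candidate when it crosses a configured threshold, which is how drift can surface before labels arrive and performance metrics visibly degrade.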
Merits
Transparency and Replicability
The study provides a clear and detailed description of the AEGIS framework, allowing for easy replication and validation.
Applicability to Real-World Settings
AEGIS is exercised across two heterogeneous clinical contexts using an identical governance architecture, suggesting potential for widespread application.
Demerits
Reliance on Simulated Evaluation
The study's evaluation rests on simulated iterations, which may not capture the complexity of real-world healthcare settings or the scalability demands of production deployment.
Lack of Human Factors Consideration
The study focuses primarily on technical aspects, neglecting the importance of human factors in AI system deployment.
Expert Commentary
AEGIS presents a promising approach to addressing the governance challenges associated with adaptive medical AI systems. However, its applicability and scalability in real-world settings require further investigation. The study's reliance on simulated iterations and lack of human factors consideration may limit its generalizability. Nonetheless, the framework's ability to operationalize regulatory mechanisms and detect critical states makes it a valuable contribution to the ongoing debate around AI regulation and governance.
Recommendations
- ✓ Future studies should focus on replicating and validating AEGIS in real-world settings, incorporating human factors considerations and evaluating its scalability.
- ✓ Regulatory bodies and policymakers should consider incorporating AEGIS-like frameworks into their regulatory mechanisms, prioritizing ongoing surveillance and governance of adaptive medical AI systems.
Sources
Original: arXiv - cs.LG