Artificial Intelligence in Preeclampsia and Preterm Birth Prediction: A 2026 Evidence Appraisal

Creator: Cem Akaltun, MD
Published: 2026-05-26

Artificial intelligence and angiogenic biomarkers have produced tangible advances in preeclampsia prediction, but high internal performance has not yet translated into external generalisability or improved patient outcomes.

By Cem Akaltun, MD · Updated June 9, 2026 · ~12 min read

Preeclampsia and preterm birth remain leading contributors to maternal and neonatal morbidity and mortality. Early and accurate prediction of both conditions is critical for the timely initiation of preventive treatment (such as aspirin prophylaxis in preeclampsia) and for appropriate planning of the place and timing of delivery. In recent years, artificial intelligence (AI) and machine learning (ML) models, together with angiogenic biomarker tests, have advanced rapidly in this field. However, it is essential to distinguish what the technology has actually achieved from what it has not yet achieved. This article offers a measured appraisal of the most current 2025-2026 evidence, presenting conflicting findings side by side.

A regulatory turning point: sFlt-1/PlGF tests now in clinical use in the US

For many years, no FDA-cleared blood biomarker was available for preeclampsia risk assessment. This changed in 2023. The Thermo Fisher B·R·A·H·M·S sFlt-1/PlGF KRYPTOR test received FDA De Novo clearance with breakthrough device designation on 19 May 2023 (DEN220027), becoming the first FDA-cleared blood test in the US to assess the risk of progression to preeclampsia with severe features. The test was designed for hospitalised women with singleton pregnancies between 23 and 34+6 weeks of gestation who have hypertension. Labcorp launched the same KRYPTOR-based test for clinical service on 31 January 2024. Subsequently, the Roche Elecsys sFlt-1/PlGF test received FDA 510(k) clearance on 13 February 2025, intended to stratify hospitalised pregnant women with hypertensive disorders into low or high risk of developing severe preeclampsia within two weeks.

The basis for these clearances, the PRAECIS study (18 US hospitals, more than 700 pregnant women), showed that an sFlt-1/PlGF ratio cut-off of ≥ 40 predicted progression to preeclampsia with severe features within two weeks with 94% sensitivity, 75% specificity, 65% PPV and 96% NPV. In women with a ratio below 40, the probability of severe preeclampsia within two weeks was under 5%. The fact that 30% of the cohort was Black and 16% Hispanic strengthens the study from a diversity standpoint. The true clinical value of this test lies in its ability to rule out severe disease in the short term with a high negative predictive value — it is, in essence, a triage tool.
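The link between these four figures can be sketched with Bayes' theorem. The snippet below is a minimal illustration, not part of the PRAECIS analysis: the function names are hypothetical, the inputs are the rounded percentages quoted above, and the 3% prevalence in the last line is an assumed value chosen only to show how the same test behaves in a lower-risk population.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given event prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


def implied_prevalence(sensitivity, specificity, ppv):
    """Solve the PPV formula for the prevalence that would produce a given PPV."""
    false_pos_rate = 1 - specificity
    return ppv * false_pos_rate / (sensitivity * (1 - ppv) + ppv * false_pos_rate)


# Rounded figures quoted in the text: sensitivity 94%, specificity 75%, PPV 65%.
prevalence = implied_prevalence(0.94, 0.75, 0.65)
ppv, npv = predictive_values(0.94, 0.75, prevalence)
print(f"implied prevalence: {prevalence:.3f}, PPV: {ppv:.3f}, NPV: {npv:.3f}")

# Assumed 3% prevalence (hypothetical lower-risk setting, not from the study).
low_ppv, low_npv = predictive_values(0.94, 0.75, 0.03)
print(f"at 3% prevalence -> PPV: {low_ppv:.3f}, NPV: {low_npv:.4f}")
```

Under these rounded inputs the implied event rate is about 33%, and the NPV that follows is about 96%, which matches the reported value. At the assumed 3% prevalence the PPV falls to roughly 10% while the NPV rises further, which is why the test is better understood as a rule-out tool than as a rule-in tool.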

Predictive accuracy ≠ improved patient outcome

Although the sFlt-1/PlGF test predicts short-term severe preeclampsia with high accuracy, its clinical utility is debated. The multicentre PRECOG randomised controlled trial found that using the test did not reduce hospitalisation duration in suspected preeclampsia. In other words, better prediction does not automatically translate into a better management outcome.

First-trimester screening and aspirin: the most robust evidence base

The best-validated approach to preeclampsia prediction is the first-trimester competing-risks model developed by the Fetal Medicine Foundation (FMF). This model combines maternal factors, mean arterial pressure (MAP), uterine artery pulsatility index (UtA-PI) and placental growth factor (PlGF). Its fundamental importance was established by the ASPRE trial: among women selected as high-risk at 11-13+6 weeks, 150 mg of aspirin daily reduced the incidence of preterm preeclampsia to 1.6% (versus 4.3% with placebo) — an approximately 62% relative reduction. This is the cornerstone evidence underpinning the "first-trimester screening plus aspirin" paradigm.
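The arithmetic behind that relative reduction can be checked directly from the two incidence figures. This is a rough sketch using only the crude rates quoted above; the function name is hypothetical, and the number needed to treat is a derived illustration rather than a figure reported in this article.

```python
def treatment_effect(control_rate, treated_rate):
    """Return (relative risk reduction, absolute risk reduction, number needed to treat)."""
    absolute_reduction = control_rate - treated_rate
    relative_reduction = absolute_reduction / control_rate
    number_needed_to_treat = 1 / absolute_reduction
    return relative_reduction, absolute_reduction, number_needed_to_treat


# Preterm preeclampsia incidence quoted in the text: 4.3% placebo, 1.6% aspirin.
rrr, arr, nnt = treatment_effect(control_rate=0.043, treated_rate=0.016)
print(f"RRR: {rrr:.1%}, ARR: {arr:.1%}, NNT: {nnt:.0f}")
```

The crude rates give a relative reduction of just under 63%, close to the approximately 62% stated above; the small gap is consistent with rounding or with the trial reporting an adjusted estimate. The absolute reduction of 2.7 percentage points corresponds to roughly 37 screen-positive women treated to prevent one case of preterm preeclampsia.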

External validations of the FMF model are ongoing. In a Brazilian external validation, preterm preeclampsia occurred in 3.1% of cases; at a 1/100 cut-off, the detection rate was 71.4% and the area under the curve (AUC) was 0.818 (95% CI 0.773-0.863). Validations in US nulliparous populations continued through 2025. This level of performance (AUC ~0.80-0.82) may appear modest next to the high figures claimed in single-centre studies, but the model's consistency across different populations matters far more for clinical reliability.

Machine learning models: high internal performance, weak external transfer

The most critical reality in the ML literature is the chasm between internal performance and external generalisability. A meta-analysis published in the Journal of Medical Internet Research in January 2026, covering 26 studies and 31 ML models, laid this problem bare. Although the pooled AUC of 0.91 (95% CI 0.87-0.92) appears impressive, heterogeneity was extreme (I² > 99%). More importantly, only 6 studies performed external validation, and in these the pooled sensitivity dropped to 0.68 (prediction interval 0.25-0.94). The authors' conclusion is striking: the high AUC reflects "performance on the internal development dataset" rather than universal effectiveness. When assessed with the PROBAST tool, most studies carried an unclear or high risk of bias.
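To make the heterogeneity statistic concrete, the sketch below computes Cochran's Q and I² for a handful of study estimates. It is a simplified illustration: the four AUC values and their variances are invented for the example, the function name is hypothetical, and pooling on the raw AUC scale with fixed-effect weights is an assumption (a published meta-analysis would typically use a transformed scale and a random-effects model).

```python
def heterogeneity(effects, variances):
    """Return (inverse-variance pooled estimate, Cochran's Q, I-squared)."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    degrees_of_freedom = len(effects) - 1
    # I-squared: share of total variation attributed to between-study differences.
    i_squared = max(0.0, (q - degrees_of_freedom) / q) if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical study-level AUCs with small standard errors (0.01 each).
aucs = [0.95, 0.70, 0.88, 0.62]
variances = [0.01 ** 2] * len(aucs)

pooled, q, i2 = heterogeneity(aucs, variances)
print(f"pooled: {pooled:.3f}, Q: {q:.1f}, I2: {i2:.1%}")
```

When precisely estimated studies disagree this much, almost all of the observed variation is between-study variation, and I² approaches 100%. In that situation a single pooled AUC summarises a set of results that do not agree with each other, which is why the pooled figure above should not be read as the expected performance in a new setting.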

By contrast, dynamic short-term prediction using routine electronic health record (EHR) data offered a promising direction. In a study published in JAMA Network Open in March 2026, 58,839 pregnancies from three NewYork-Presbyterian hospitals were analysed. Using XGBoost with only routine data (blood pressure, laboratory values, demographics), the onset of preeclampsia within 1/2/4 weeks was predicted. Performance peaked at 34 weeks: AUC 0.863 in training and 0.808-0.834 in external validation. Because of the low prevalence of preeclampsia, PPV remained low at 90% sensitivity (approximately 0.02-0.06), but the NPV was ≥ 0.993. The significance of this study lies in its shift from static single-timepoint risk to rolling-window dynamic risk updating, providing lead-time aligned with clinical decision timing for late-onset and term preeclampsia. No significant fairness concern was detected across demographic subgroups.
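The difference between static and rolling-window prediction is easiest to see in how outcome labels are built. The following is a minimal sketch under stated assumptions, not the study's actual pipeline: the function name, field names and example weeks are hypothetical, and it assumes that each assessment is labelled positive for a horizon when onset falls after the assessment week and no later than that week plus the horizon.

```python
def rolling_window_labels(assessment_weeks, onset_week, horizons=(1, 2, 4)):
    """Label each assessment with whether onset occurs within each horizon (weeks).

    onset_week is None for pregnancies without preeclampsia. Assessments made
    at or after onset are skipped, since nothing is left to predict.
    """
    rows = []
    for week in assessment_weeks:
        if onset_week is not None and week >= onset_week:
            continue
        row = {"week": week}
        for horizon in horizons:
            row[f"within_{horizon}w"] = (
                onset_week is not None and week < onset_week <= week + horizon
            )
        rows.append(row)
    return rows


# Hypothetical pregnancy: assessed at 30, 32, 34 and 35 weeks, onset at 36 weeks.
for row in rolling_window_labels([30, 32, 34, 35], onset_week=36):
    print(row)
```

In this example the same pregnancy is negative for every horizon at 30 weeks, positive only for the 4-week horizon at 32 weeks, and positive for all three horizons at 35 weeks. A static model assigns one label per pregnancy; a rolling-window model is trained and evaluated on labels that change as gestation advances.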

Algorithmic bias and the risk of widening inequity

The least discussed yet most important risk of AI models is their potential to amplify inequity. An eclampsia prediction study published in AJOG Global Reports in April 2026, covering 3.6 million births using CDC data, brought this into sharp focus. ML models (logistic regression, random forest, LightGBM, XGBoost) achieved an AUC of only 0.64, whereas the ACOG checklist provided 81.4% detection but produced 68.9% false positives. More concerning still, the AUC was lower among Black, American Indian, foreign-born and low-income individuals. This finding suggests that deploying models into clinical use without population-specific external validation could deepen existing inequities.
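A population-specific check of this kind can be as simple as reporting error rates separately for each subgroup instead of one overall figure. The sketch below assumes binary predictions and a single grouping variable; the function name, group labels and counts are invented for illustration and are not taken from the CDC analysis.

```python
from collections import defaultdict


def rates_by_group(records):
    """Compute detection rate and false-positive rate per subgroup.

    records: iterable of (group, y_true, y_pred) with y_true and y_pred in {0, 1}.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, y_true, y_pred in records:
        if y_true:
            counts[group]["tp" if y_pred else "fn"] += 1
        else:
            counts[group]["fp" if y_pred else "tn"] += 1

    summary = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        summary[group] = {
            "detection_rate": c["tp"] / positives if positives else None,
            "false_positive_rate": c["fp"] / negatives if negatives else None,
        }
    return summary


# Hypothetical counts for two subgroups, A and B.
records = (
    [("A", 1, 1)] * 3 + [("A", 1, 0)] * 1 + [("A", 0, 1)] * 2 + [("A", 0, 0)] * 4
    + [("B", 1, 1)] * 2 + [("B", 1, 0)] * 2 + [("B", 0, 1)] * 3 + [("B", 0, 0)] * 3
)
for group, metrics in rates_by_group(records).items():
    print(group, metrics)
```

In this invented example group B has both a lower detection rate and a higher false-positive rate than group A, a pattern that an aggregate metric would hide. A real audit would also need confidence intervals, since subgroup samples are often small.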

Approach / Study	Performance (AUC / metric)	Context
sFlt-1/PlGF ≥ 40 (PRAECIS)	Sensitivity 94% / NPV 96%	Severe PE within 2 weeks, FDA basis
FMF first-trimester (Brazil validation)	AUC 0.818; detection 71.4%	Preterm PE, external validation
ML meta-analysis (JMIR 2026)	Internal AUC 0.91 → external sensitivity 0.68	26 studies, I² > 99%
Dynamic EHR (JAMA Netw Open 2026)	AUC 0.808-0.834 (external)	Peak at 34 weeks, NPV ≥ 0.993
CDC eclampsia (AJOG GR 2026)	AUC 0.64	3.6 M births, lower in subgroups

Preterm birth prediction: a stable burden, a persistent challenge

The burden of preterm birth is not declining. According to global estimates published in The Lancet in 2023, 13.4 million preterm births (9.9%) occurred in 2020, with no measurable decline recorded in any region over the preceding decade. Prediction models take various approaches: a logistic regression model developed from an early-pregnancy serum protein panel (proteomics) reported an AUC of 0.868 in an independent set, while LSTM-based deep learning approaches that model multi-timepoint cervical length and time-series data have been reported as the best-performing.

However, the most instructive finding in this area is a cautionary one: a gene expression model developed for spontaneous preterm birth achieved an AUC of 0.99 in training but fell to 0.54 in the test set and to 0.50, chance level, in external validation. This is a concrete example of why high single-centre performance cannot be taken as evidence of external validity.
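How a model can look strong in development and collapse to chance elsewhere can be reproduced with pure noise. The simulation below is an illustration only and does not reproduce the gene expression study: every feature is random, the cohort sizes and feature count are arbitrary assumptions, and "training" is reduced to picking the single feature with the highest AUC in a small development cohort.

```python
import random


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def noise_cohort(rng, n_per_class, n_features):
    """Balanced cohort whose features are pure Gaussian noise, unrelated to the labels."""
    labels = [1] * n_per_class + [0] * n_per_class
    features = [[rng.gauss(0, 1) for _ in labels] for _ in range(n_features)]
    return features, labels


rng = random.Random(0)
N_FEATURES = 500  # many candidate markers, as in high-dimensional omics data

dev_x, dev_y = noise_cohort(rng, n_per_class=20, n_features=N_FEATURES)
ext_x, ext_y = noise_cohort(rng, n_per_class=200, n_features=N_FEATURES)

# "Develop" the model: keep the feature that happens to look best in development.
dev_aucs = [auc(feature, dev_y) for feature in dev_x]
best = max(range(N_FEATURES), key=dev_aucs.__getitem__)

development_auc = dev_aucs[best]
external_auc = auc(ext_x[best], ext_y)
print(f"development AUC: {development_auc:.2f}, external AUC: {external_auc:.2f}")
```

With hundreds of candidate features and a small cohort, some feature will separate the classes well by chance alone, so the development AUC is optimistically high, while the same feature in an independent cohort sits near 0.5. More flexible models fitted to many features can overfit far more severely than this single-feature selection does.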

Guidelines and an honest framing of the limits

On the guideline side, NICE published DG49 in 2024 (an update of DG23), recommending four PlGF-based tests (two DELFIA Xpress assays, the Elecsys sFlt-1/PlGF ratio and Triage PLGF) to help rule in or rule out suspected preterm preeclampsia. In terms of cut-offs, alongside the rule-out threshold of a ratio ≤ 38 (PROGNOSIS) widely used in Europe, the ≥ 40 cut-off from the US clearance context (PRAECIS, severe PE within two weeks) has entered clinical practice.

An honest appraisal requires separating the proven from the unproven. What has been proven: first-trimester FMF screening plus aspirin reduces preterm preeclampsia; sFlt-1/PlGF tests rule out short-term severe preeclampsia with high NPV; and dynamic short-term prediction in late pregnancy using routine EHR data is technically feasible. What remains unproven or uncertain: ML models generalise poorly to external populations (pooled sensitivity dropping to 0.68, some models at chance level); whether predictive accuracy translates into clinical benefit on hard outcomes is uncertain (in PRECOG, testing did not reduce hospitalisation duration); prediction of late-onset and term preeclampsia is weak (AUC ~0.6-0.7), even though these forms make up the majority of cases and aspirin has limited effect on them; and because of low prevalence, PPV at 90% sensitivity is often below 10%, creating a risk of over-screening and unnecessary intervention.

Conclusion

Artificial intelligence and angiogenic biomarkers have made tangible progress in preeclampsia prediction over the past three years: FDA-cleared sFlt-1/PlGF tests have entered clinical use in the US, the first-trimester screening plus aspirin paradigm is supported by robust evidence, and dynamic prediction using routine data has become technically feasible. Yet the field deserves a realistic assessment of its maturity. The high internal performance seen in individual studies (AUC ~0.9) is often not preserved in external settings; predictive accuracy does not necessarily translate into a better patient outcome; and models risk amplifying inequity by performing worse in the most vulnerable population groups. In clinical practice, these tools, particularly high-NPV biomarker tests and first-trimester screening, are valuable decision-support elements. Until prospective multicentre validation, TRIPOD-AI-compliant reporting and benefit on hard clinical outcomes have been demonstrated, they should support the clinician's judgement, not supplant it.

References

Liu et al. Machine Learning Prediction Models for Preeclampsia: Systematic Review and Meta-Analysis. J Med Internet Res. 2026.
Li et al. Machine Learning for Dynamic and Short-Term Prediction of Preeclampsia Using Routine Clinical Data. JAMA Network Open. 2026.
Roche Diagnostics. Roche Elecsys sFlt-1/PlGF ratio for preeclampsia receives FDA 510(k) clearance. 2025.
Thermo Fisher / Preeclampsia Foundation. FDA Clearance of B·R·A·H·M·S sFlt-1/PlGF KRYPTOR Immunoassays (PRAECIS performance). 2023-2024.
Hackelöer et al. Internal and external validation of a machine learning algorithm to detect preeclampsia-related adverse outcomes in high-risk pregnancies. Pregnancy Hypertension. 2026.
Minsart et al. Eclampsia risk prediction across diverse U.S. populations: machine learning versus ACOG checklists. AJOG Global Reports. 2026.
Rios-Garcia et al. Prediction of Early-onset Preeclampsia Using Deep Learning: A Scoping Review. Pregnancy Hypertension. 2026.
PRECOG Study. sFlt-1/PlGF ratio use does not reduce hospitalisation duration in suspected preeclampsia: a multicentre randomised trial. Scientific Reports. 2025.
NICE. PLGF-based testing to help diagnose suspected preterm pre-eclampsia (DG49). 2024.
FMF external validation (Brazil). Performance of the first-trimester competing risks model for preeclampsia prediction. AJOG Global Reports. 2024.
Rethinking Risk Prediction in Preeclampsia: From Biomarkers to Mechanistic Phenotypes and Longitudinal Models. Int J Mol Sci. 2026.
Ohuma et al. National, regional, and global estimates of preterm birth in 2020. The Lancet. 2023.

Disclaimer: This content is for general informational and educational purposes only and does not substitute for medical advice, diagnosis, or treatment. The risk-prediction models mentioned are for research and/or decision-support purposes; diagnosis, prophylaxis, and treatment decisions must be made through the physician's evaluation. Consult your physician for decisions about your pregnancy.