Wearables, Remote Patient Monitoring and Artificial Intelligence: The 2026 Evidence Map
Wearables detect atrial fibrillation with high accuracy and remote monitoring reduces mortality in heart failure, but screening has not yet been shown to prevent stroke.
Smartwatches, smart rings and wearable sensors are no longer a technological curiosity; they have become clinical instruments that quietly track heart rhythm, oxygenation, sleep and, increasingly, blood pressure on the wrists of millions. At the core of these devices are artificial-intelligence algorithms that translate raw signals, chiefly photoplethysmography (PPG) and single-lead ECG, into actionable alerts. Between 2024 and 2026, observational promise began to be tested in randomized controlled trials (RCTs). This article aims to separate, without exaggeration, what has been proven from what has not.
Atrial Fibrillation Screening: New Randomized Evidence
Atrial fibrillation (AF) is one of the leading preventable causes of stroke and often goes undiagnosed because it is frequently paroxysmal and asymptomatic. For years, the reference point in this field was the 2019 observational Apple Heart Study, but its single-arm design and low notification yield (irregular-pulse notifications in only about 0.5% of participants) mean it can no longer serve as the backbone of current evidence.
The most important update of 2026 is the EQUAL randomized trial (van Steijn et al., JACC). In it, 437 patients aged 65 and older with elevated stroke risk (CHA₂DS₂-VASc ≥2 men / ≥3 women; median age 75, 46.7% women) were enrolled at two centers in the Netherlands and randomized to six months of Apple Watch monitoring (PPG plus single-lead ECG) or standard care. New-onset AF occurred in 9.6% (21/219) of the intervention group versus 2.3% (5/218) of controls (risk difference 7.3 percentage points; 95% CI 2.9-11.7; p=0.001; HR 4.40; 95% CI 1.66-11.66). Notably, a meaningful share of detected episodes were asymptomatic and captured only in the intervention arm. The accompanying editorial (Cheng & Chao, JACC) confirms that the trial increases detection but stresses that the clinical benefit of treating AF found this way must be demonstrated by ongoing outcome trials.
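The headline risk difference can be checked directly from the reported counts. The sketch below assumes a simple Wald (normal-approximation) interval; the source does not state which method the trial used, so agreement with the published interval is a plausibility check rather than a reproduction. The function name `risk_difference` is illustrative. The hazard ratio cannot be recomputed this way because it depends on event timing.

```python
import math

def risk_difference(events_a, n_a, events_b, n_b, z=1.959964):
    """Risk difference (arm a minus arm b) with a Wald confidence interval.

    Assumption: normal approximation; the trial's own method is not stated.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    # Standard error of a difference between two independent proportions
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se

# Counts reported for EQUAL: 21/219 (Apple Watch) vs 5/218 (standard care)
rd, lower, upper = risk_difference(21, 219, 5, 218)
print(f"RD = {rd * 100:.1f} pp (95% CI {lower * 100:.1f} to {upper * 100:.1f})")
# prints: RD = 7.3 pp (95% CI 2.9 to 11.7)
```

Under this assumption the counts reproduce the reported 7.3 percentage points and 2.9-11.7 interval.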
The second large RCT, GUARD-AF (ESC 2024), screened 11,905 participants across 149 U.S. primary-care sites with 14-day patch ECG. AF was diagnosed in 5% of the screening arm versus 3.3% with standard care (roughly a 52% relative increase), and anticoagulant initiation rose. However, there was no significant difference in stroke-related hospitalization (0.7% vs 0.6%); the trial was stopped early because of the COVID-19 pandemic, and the authors concluded that larger, longer studies are needed before routine screening can be recommended.
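A relative increase of about 52% sounds larger than the absolute gain it rests on. As a rough illustration, the sketch below derives the absolute difference, the relative increase and an approximate number needed to screen from the rounded percentages quoted above. The function name `screening_yield` is hypothetical, and because the inputs are rounded the outputs are indicative only.

```python
def screening_yield(p_screen, p_control):
    """Absolute difference, relative increase and number needed to screen.

    Inputs are proportions (0.05 for 5%). The number needed to screen is
    the reciprocal of the absolute difference in diagnosis rates.
    """
    abs_diff = p_screen - p_control
    rel_increase = abs_diff / p_control
    nns = 1 / abs_diff
    return abs_diff, rel_increase, nns

# Rounded GUARD-AF diagnosis rates quoted in the text: 5% vs 3.3%
abs_diff, rel_increase, nns = screening_yield(0.05, 0.033)
print(f"Absolute difference: {abs_diff * 100:.1f} pp")
print(f"Relative increase:   {rel_increase * 100:.0f}%")
print(f"Approx. screened per additional AF diagnosis: {nns:.0f}")
```

On these rounded figures, roughly 59 people are screened for each additional AF diagnosis, which is the scale to keep in mind when no stroke difference has been shown.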
The Backbone Message: Detection Up, Outcome Benefit Not Yet Shown
The two large randomized trials of 2024-2026 point the same way: screening with a smartwatch (EQUAL) or a patch ECG (GUARD-AF) markedly increases AF detection, but it has not yet been shown to reduce hard clinical endpoints such as stroke or death. Equating more detection with better outcomes would push beyond what today's evidence supports.
Diagnostic Accuracy: Excellent in the Lab, Nuanced in the Field
In selected populations the diagnostic performance of smartwatches is genuinely high. A recent systematic review and diagnostic meta-analysis (Barrera et al., JACC Advances 2025; PROSPERO-registered, 26 studies, 17,349 patients) reports a pooled sensitivity of 95% (95% CI 92-97), specificity of 97% (95% CI 94-98) and AUC of 0.97. A separate meta-analysis specific to the Apple Watch ECG (Shahid et al., JACC Advances 2025; QUADAS-2, 11 studies, 4,241 participants) similarly yields sensitivity 94.8%, specificity 95.0% and AUC 0.96. Another comparative review (Sibomana et al., BMC Cardiovasc Disord 2025) shows that both chest ECG patches and PPG watches are excellent and clinically comparable.
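One way to compare the two pooled estimates on a single scale is through likelihood ratios, which express how strongly a positive or negative result shifts the odds of AF. The sketch below applies the standard definitions to the pooled point estimates quoted above; the function name is illustrative, and the pooled confidence intervals are ignored for simplicity.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)   # odds multiplier after a positive result
    lr_neg = (1 - sensitivity) / specificity   # odds multiplier after a negative result
    return lr_pos, lr_neg

# Pooled point estimates quoted in the text
pooled = {
    "Barrera 2025 (smartwatches)": (0.95, 0.97),
    "Shahid 2025 (Apple Watch ECG)": (0.948, 0.95),
}

for label, (sens, spec) in pooled.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{label}: LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
```

A small change in specificity (97% versus 95%) moves the positive likelihood ratio considerably, which is why specificity matters so much once the test is applied to large, mostly healthy populations.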
Here lies a critical nuance that must be stated honestly: high specificity does not mean high positive predictive value (PPV). In a low-prevalence, asymptomatic population over 65, the base-rate (Bayesian) effect takes over. In EQUAL, the real-world PPV of records first labeled AF was only around 54%; in lower-prevalence community screening, modeled PPV can fall to 20-40% — meaning false positives may outnumber true positives. The 97% specificity under laboratory conditions is impressive, but when deciding on widespread screening, it is the field PPV that is decisive.
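The base-rate effect is easy to see with Bayes' rule. The sketch below uses the pooled sensitivity and specificity quoted earlier (95% and 97%); the prevalence values are illustrative assumptions chosen to span low-risk community screening through enriched clinic populations, not figures from any of the cited trials.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

SENS, SPEC = 0.95, 0.97  # pooled estimates quoted in the text

# Assumed prevalences of undiagnosed AF (hypothetical, for illustration only)
for prevalence in (0.01, 0.02, 0.05, 0.10):
    value = ppv(SENS, SPEC, prevalence)
    note = "false positives outnumber true positives" if value < 0.5 else ""
    print(f"prevalence {prevalence:>4.0%}: PPV {value:.1%} {note}")
```

With these inputs, a prevalence of 1-2% yields a PPV in roughly the 24-39% range, consistent with the 20-40% figure above, even though specificity stays fixed at 97%.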
Conflicting Evidence: PPG or ECG?
Which sensor is superior remains debated, and the sources should be presented side by side. Most meta-analyses (Barrera 2025; Sibomana 2025) find the two modalities clinically comparable. Some individual analyses have found PPG more sensitive than ECG — but this difference usually reflects the high variability of smartwatch ECGs; that is, it points less to an absolute superiority than to a lack of device and algorithm standardization. The lesson is not a modality war but a need for standardization.
Regulatory Clearances: The 2024-2026 Wave
The regulatory landscape has shifted rapidly. The Apple Watch Hypertension Notification feature received FDA clearance on 11 September 2025; using the optical heart sensor and machine learning, it flags signs of chronic high blood pressure (Series 9 and later, Ultra 2 and later). Importantly, this is not a diagnosis but a "talk to your doctor" alert. Earlier, the Apple Watch AF History feature was qualified within the FDA's Medical Device Development Tools program (2024), and the Sleep Apnea Notification received FDA clearance (2024). In an independent validation study (Vandenberk et al., npj Digital Medicine 2025), FibriCheck, which is based on smartphone-camera PPG, showed sensitivity 96.3%, specificity 99.3%, PPV 98.0% and NPV 99.8% against 12-lead ECG in 236 participants across five academic centers.
Remote Patient Monitoring: Hard-Outcome Evidence in Heart Failure
The strongest clinical evidence comes not from AF screening but from remote patient monitoring (RPM). A comprehensive meta-analysis in heart failure (De Lathauwer et al., Eur J Heart Fail 2025; 41 RCTs, 16,312 patients) reports that RPM reduces all-cause mortality (OR 0.81; 95% CI 0.69-0.95), about 19% lower odds, and heart-failure hospitalization (OR 0.78; 95% CI 0.70-0.87), about 22% lower odds. The most effective components were self-management and education modules and video consultation; a trial sequential analysis confirmed that the mortality benefit is robust. Caveat: the trial population (around 67-68 years) is younger than the typical hospitalized heart-failure patient, and heterogeneity is high.
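An odds ratio is not the same as a relative risk, and the gap widens as the outcome becomes more common. The sketch below converts the reported odds ratios into approximate risk ratios for a range of control-arm risks. Those baseline risks are assumptions for illustration; the source does not report the control-arm event rates.

```python
def odds_ratio_to_risk_ratio(odds_ratio, baseline_risk):
    """Risk ratio implied by an odds ratio at a given control-arm risk.

    Follows from treated odds = OR * control odds, converted back to risk.
    """
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)

REPORTED = {"all-cause mortality": 0.81, "HF hospitalization": 0.78}

# Hypothetical control-arm risks; not taken from the meta-analysis
for outcome, odds_ratio in REPORTED.items():
    for baseline_risk in (0.10, 0.20, 0.30):
        rr = odds_ratio_to_risk_ratio(odds_ratio, baseline_risk)
        print(f"{outcome}, baseline {baseline_risk:.0%}: "
              f"RR {rr:.2f} (relative risk reduction {1 - rr:.0%})")
```

Under these assumed baselines the relative risk reduction is somewhat smaller than the 19% and 22% implied by reading the odds ratios as risks, though the direction and rough size of the benefit are unchanged.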
Continuous Glucose Monitoring: Modest but Real
For non-insulin-treated type 2 diabetes, a meta-analysis of continuous glucose monitoring (CGM) (Jancev, Ferreira et al., JCEM 2024; 14 RCTs, 1,647 participants) found an improvement in HbA1c of −0.32% (95% CI −0.41 to −0.23) compared with fingerstick monitoring. This is statistically significant but clinically modest. Aronson's critical review (Diabetes Obes Metab 2025) questions larger-effect claims on methodological grounds and stresses that evidence for durability is limited. "Revolution" language is therefore overstated; the accurate framing is "a useful tool in selected patients."
| Domain | Evidence type | Key finding | Hard-outcome evidence? |
|---|---|---|---|
| AF screening (smartwatch, patch ECG) | RCT (EQUAL, GUARD-AF) | Detection ↑ (HR 4.40; +52%) | No; no stroke difference |
| AF diagnostic accuracy | SR/MA (26 studies) | Sensitivity 95%, AUC 0.97 | Not applicable; high in lab, field PPV ~54% |
| Remote monitoring (heart failure) | SR/MA/TSA (41 RCTs) | Death OR 0.81; admission OR 0.78 | Yes — mortality benefit robust |
| CGM (non-insulin T2DM) | SR/MA (14 RCTs) | HbA1c −0.32% | Modest, durability debated |
Equity and Bias: The Skin-Tone Problem
PPG is an optical technology and can be affected by skin pigmentation. A systematic review and meta-analysis (Al-Halawani et al., J Med Internet Res 2024) showed that in pulse oximetry the SpO₂ bias is larger in darker skin (1.27% vs 0.70% in lighter skin) and that occult hypoxemia may be missed more often in darker skin. By contrast, the same analysis found no clinically meaningful skin-tone bias in wearable heart-rate estimation. So the problem is critical mainly for oxygenation measurement; for rhythm and pulse it is less pronounced but not absent — FibriCheck data confirm a decline in signal quality in darker skin (correctable with human verification).
Artificial Intelligence, False Positives and Clinical Workload
As AI labels more data, the "noise" arriving at the clinician's desk also grows. An analysis presented at EHRA 2026 showed that even in AI-equipped implantable cardiac monitors, 32.9% of episodes were non-actionable and 30.6% indeterminate. On the positive side, a patient-led smartwatch RCT after ablation (Ahluwalia et al., JACC Advances 2026; 168 patients) showed that with a structured workflow only 1.9% of the hundreds of ECGs recorded over 12 months were forwarded for review, keeping the workload manageable — though about one-third of ECGs could not be classified. Two real risks stand out: automation bias (over-reliance on the AI label) and distribution shift (performance decline outside the population the device was trained on). Both must be taken seriously in clinical-use decisions, not just product brochures.
Conclusion
Wearables and AI have genuinely transformed cardiovascular monitoring — but the reading must be balanced. What is proven: these devices detect AF with high accuracy in selected populations (sensitivity ~95%, AUC ~0.97); smartwatch screening markedly increases new AF detection; remote monitoring reduces death and hospitalization in heart failure (41 RCTs, OR 0.81/0.78); and CGM provides a modest benefit in selected type 2 diabetes. What is not yet proven: AF screening has not been shown to reduce stroke or death; real-world PPV is low (~54%), bringing the risk of false-positive burden, anxiety, unnecessary testing and cost in widespread screening; hypertension and sleep-apnea notifications are not diagnostic; and external-validation gaps, distribution shift, skin-tone signal quality and automation bias remain unresolved. In the coming period, large outcome trials such as HEARTLINE will help close the gap between "we detected it" and "we improved the outcome." Until then, the right stance is not to reject the technology but to separate its promise from its evidence with rigor.
References
- van Steijn NJ, et al. Enhanced Detection and Prompt Diagnosis of Atrial Fibrillation Using Apple Watch (EQUAL Trial). J Am Coll Cardiol. 2026.
- Cheng W, Chao TF. Smartwatch-Integrated AF Detection: What the EQUAL Trial Tells Us (editorial). J Am Coll Cardiol. 2026.
- GUARD-AF: Reducing Stroke by Screening for Undiagnosed AF in the Elderly. ESC Congress 2024. 2024.
- Barrera N, et al. Accuracy of Smartwatches in the Detection of Atrial Fibrillation: SR & Diagnostic Meta-Analysis. JACC Advances. 2025.
- Shahid S, et al. Diagnostic Accuracy of Apple Watch ECG for Atrial Fibrillation: SR & Meta-Analysis. JACC Advances. 2025.
- Sibomana O, et al. Diagnostic Accuracy of ECG Smart Chest Patches vs PPG Smartwatches for AF: SR & MA. BMC Cardiovasc Disord. 2025.
- Apple. FDA clears Apple Watch hypertension notification feature. Diagnostics World / FDA. 2025.
- Vandenberk B, et al. FibriCheck Detection Capabilities for AF (FDA-AF): Multicenter Validation. npj Digital Medicine. 2025.
- De Lathauwer ILGB, et al. Remote Patient Monitoring in Heart Failure: Comprehensive Meta-Analysis. Eur J Heart Fail. 2025.
- Jancev M, Ferreira JP, et al. Continuous Glucose Monitoring in Type 2 Diabetes: SR & MA of RCTs. J Clin Endocrinol Metab. 2024.
- Aronson R, et al. CGM in Non-Insulin-Treated Type 2 Diabetes: Critical Review + Updated MA. Diabetes Obes Metab. 2025.
- Al-Halawani R, et al. Impact of Skin Pigmentation on Pulse Oximetry & Wearable Pulse Rate Accuracy: SR & MA. J Med Internet Res. 2024.
- Ahluwalia N, et al. Patient-Led Smartwatch ECG Follow-Up After AF Ablation (RCT). JACC Advances. 2026.