
A machine learning approach to identify patients at risk for long-term consequences after pulmonary embolism

  • Stephan Nopp
  • Clemens Spielvogel
  • Behnood Bikdeli
  • Ana Alberich-Conesa
  • Luis Hernández-Blasco
  • Mª Luisa Peris
  • Remedios Otero
  • David Jiménez
  • Manuel Monreal
  • Cihan Ay (Corresponding Author)
  • RIETE Investigators
  • Andris Skride (Member of the Working Group)

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Pulmonary embolism (PE) can result in long-term sequelae, such as post-PE syndrome, including persistent dyspnea and chronic thromboembolic pulmonary hypertension (CTEPH). Existing prediction tools for severe post-PE complications lack sensitivity and specificity. This study aimed to develop a machine learning model to identify patients at risk for long-term consequences after PE. Using data from the RIETE registry, the largest prospective international PE registry, we developed supervised machine learning models to identify patients at increased risk of CTEPH and post-PE syndrome. Our approach involved data preprocessing, model training via a random forest algorithm, and validation through Monte-Carlo cross-validation. The performance of the CTEPH prediction model was benchmarked against an existing score. Of the 57,981 PE patients in the RIETE registry, 5,217 were eligible for inclusion. Median age was 68 years, and 50.6% were men. Machine learning was based on 111 predictor variables, with 171 patients (3.3%) developing CTEPH. The CTEPH model demonstrated good performance with an AUC of 0.74 (95% CI: 0.73-0.75), significantly outperforming the existing CTEPH prediction score (AUC 0.57; 95% CI: 0.54-0.61). Additionally, 1,310 (25.1%) patients were defined as having post-PE syndrome six months after index PE. The post-PE syndrome model showed poorer performance with an AUC of 0.62 (95% CI: 0.61-0.62). Key predictor variables across both models included chest pain at presentation, PE location, troponin, side of clot, and dyspnea at presentation. Machine learning models show promise in predicting CTEPH but are less effective for post-PE syndrome. Future refinement, including integrating imaging data, is necessary to improve predictive performance and clinical utility.
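
The validation scheme named in the abstract, Monte-Carlo cross-validation scored by AUC, can be illustrated with a minimal sketch. This is not the authors' pipeline. The data are synthetic rather than RIETE records, the model is a placeholder centroid scorer rather than a random forest, and the stratified splitting, the number of repeats, the test fraction, and all function names are assumptions chosen for illustration.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_split(labels, test_fraction, rng):
    """One random train/test split, drawn per class so both parts contain both classes."""
    train, test = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test += idx[:n_test]
        train += idx[n_test:]
    return train, test


def fit_centroid_scorer(X, y):
    """Placeholder model (NOT a random forest): project onto the difference of class means."""
    n_feat = len(X[0])

    def centroid(cls):
        rows = [x for x, lab in zip(X, y) if lab == cls]
        return [mean(r[j] for r in rows) for j in range(n_feat)]

    w = [p - n for p, n in zip(centroid(1), centroid(0))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def monte_carlo_cv(X, y, n_repeats=100, test_fraction=0.2, seed=0):
    """Repeat: draw a fresh random split, fit on train, record the AUC on test."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        train, test = stratified_split(y, test_fraction, rng)
        scorer = fit_centroid_scorer([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([scorer(X[i]) for i in test], [y[i] for i in test]))
    return aucs


# Synthetic data (assumption): 500 cases, roughly 10% positive, one informative feature.
data_rng = random.Random(42)
y = [1 if data_rng.random() < 0.1 else 0 for _ in range(500)]
X = [[data_rng.gauss(1.0 if lab else 0.0, 1.0), data_rng.gauss(0.0, 1.0)] for lab in y]

aucs = monte_carlo_cv(X, y, n_repeats=50)
print(f"mean AUC over {len(aucs)} random splits: {mean(aucs):.2f}")
```

Unlike k-fold cross-validation, each repeat draws an independent random split, so a case may appear in several test sets or in none. Averaging the AUC across many splits gives a more stable estimate than any single split, which is presumably why the approach suits a rare outcome such as CTEPH (3.3% of patients).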

Original language: English
Article number: 32744
Journal: Scientific Reports
Volume: 15
Issue number: 1
Publication status: Published - 24 Sept 2025
Externally published: Yes

Keywords*

  • Humans
  • Pulmonary Embolism/complications
  • Male
  • Female
  • Aged
  • Machine Learning
  • Registries
  • Middle Aged
  • Hypertension, Pulmonary/etiology
  • Risk Factors
  • Prospective Studies
  • Risk Assessment/methods
  • Dyspnea/etiology

Field of Science*

  • 3.2 Clinical medicine

Publication Type*

  • 1.1. Scientific article indexed in Web of Science and/or Scopus database
