
Machine Learning Imputation in Healthcare

Published on 14/11/2025. Reading time: 5 minutes

 

Machine Learning for Missing Data Imputation in Healthcare Research: A Systematic Review of Methods and Applications

Authors: T. Porte, M. Swital, N. Sedmak, C. Bouvard, A. Gougeon, F. Roux, F. Mistretta, A. Lajoinie

Affiliations: RCTs Lyon & Laboratory of Biometry and Evolutionary Biology, UMR 5558 CNRS, University of Lyon 1

We are proud to share that our poster was presented at the ISPOR Europe 2025 congress in Glasgow. This work reviews machine learning approaches to handling missing data in healthcare studies.

 

Study Objective

Primary Objective

To investigate machine learning (ML) methods used to impute missing data in healthcare research, particularly in real-world data (RWD) studies. Missing data is a major source of bias and can compromise the validity of clinical analyses.

Specific Focus

To evaluate the performance of ML-based imputation techniques and their applicability compared to traditional methods in healthcare datasets.

Methodology

Design: Systematic literature review
Database: MEDLINE search
Period: Studies published since 2020
Selection: 166 studies initially selected → 7 studies included after full-text review

Key Findings

Key Analysis Points
  • ML-based imputation methods offer improved predictive accuracy and reliability
  • ML techniques are more robust and suitable for complex datasets than traditional methods
  • Challenges remain, especially in cases of high missingness
  • Further methodological refinement is needed for optimal clinical applications

Data Sources

Connected Medical Devices (n=2)
Clinical Registries (n=3)
Healthcare Databases (n=1)

ML Imputation Techniques

Multiple Imputation by Chained Equations (MICE)
Random Forest-based Methods
K-Nearest Neighbor Imputation
Advanced ML techniques
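
To make one of the techniques listed above concrete, here is a minimal, dependency-free sketch of K-nearest neighbor imputation. It is an illustration only, not the implementation used in any of the reviewed studies. The function name `knn_impute`, the patient columns (age, systolic blood pressure, BMI) and the values are hypothetical, and the sketch skips feature scaling, which a real analysis would normally apply before computing distances.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows that observe that column.

    Distance is Euclidean over the columns observed in both rows, rescaled by
    the share of columns compared so rows with few overlaps are not favoured.
    Simplification: features are not standardised here.
    """
    n_cols = len(rows[0])
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j in range(n_cols):
            if row[j] is not None:
                continue  # value is observed, nothing to impute
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue  # a neighbor must observe the target column
                shared = [c for c in range(n_cols)
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue  # no common columns to measure distance on
                sq = sum((row[c] - other[c]) ** 2 for c in shared)
                dist = math.sqrt(sq * n_cols / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = [value for _, value in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed

# Hypothetical records: [age, systolic blood pressure, BMI]
patients = [
    [50.0, 120.0, 22.0],
    [52.0, 124.0, 23.0],
    [70.0, 150.0, 30.0],
    [51.0, None, 22.5],   # blood pressure is missing
]
filled = knn_impute(patients, k=2)
print(filled[3])
```

The missing blood pressure is filled from the two most similar patients rather than from the overall column mean, which is the intuition behind neighbor-based imputation. In practice, an established library implementation (for example scikit-learn's `KNNImputer`) would usually be preferred over a hand-written version.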

Demonstrated Benefits

• Improved predictive accuracy
• Enhanced reliability over traditional methods
• Better handling of complex datasets
• Reduced bias from missing data

Performance Assessment

• Comparison with traditional imputation
• Predictive model validation
• Robustness testing
• Clinical applicability evaluation
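
One common way to compare imputation methods and test robustness is to hide values that are actually known, impute them, and measure the error against the truth at several missingness rates. The sketch below illustrates that idea under stated assumptions: values are hidden completely at random (real healthcare data is often not missing this way), the data are synthetic, and `mean_impute` and `masked_rmse` are hypothetical helper names. It is not the evaluation protocol of the reviewed studies.

```python
import math
import random

def mean_impute(rows):
    """Traditional baseline: replace each None with its column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fallback of 0.0 if a column is entirely missing (arbitrary choice).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def masked_rmse(complete_rows, imputer, missing_rate, seed=0):
    """Hide a random share of known cells, impute them, return the RMSE."""
    rng = random.Random(seed)
    cells = [(i, j) for i, r in enumerate(complete_rows) for j in range(len(r))]
    n_hidden = max(1, int(len(cells) * missing_rate))
    hidden = rng.sample(cells, n_hidden)
    masked = [list(r) for r in complete_rows]
    for i, j in hidden:
        masked[i][j] = None
    imputed = imputer(masked)
    sq_err = sum((imputed[i][j] - complete_rows[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(sq_err / n_hidden)

# Synthetic, fully observed data: [age, systolic blood pressure]
data_rng = random.Random(42)
data = [[data_rng.gauss(60, 10), data_rng.gauss(130, 15)] for _ in range(200)]

# Score the same imputer at increasing missingness rates.
for rate in (0.1, 0.3, 0.5):
    print(rate, round(masked_rmse(data, mean_impute, rate), 2))
```

Any imputer with the same interface (a function that takes rows containing `None` and returns completed rows) can be passed in place of `mean_impute`, so an ML-based method and a traditional baseline can be scored on identical masks.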

Conclusion and Perspectives

Machine learning for missing data imputation is a promising approach to enhance the robustness of predictive models in healthcare. While ML methods generally outperformed traditional techniques in the reviewed studies, results should be interpreted with caution when missingness is high. Continued research is essential to refine these methods and ensure reliable clinical applications.
