PIDgeon: pipeline for processing cytometry data after PID screening with generation of patient reports

saeyslab/PIDgeon

PIDgeon

Code used to generate the PIDgeon pipeline.

Design of an automated computational cytometry pipeline

An automated computational cytometry pipeline was designed. This work included the development and validation of a FlowSOM-based model for automated population identification, followed by the training and validation of predictive models for lymphoid-PID diagnosis. Finally, the pipeline generates interpretable reports for patients with new PID screening requests in a clinical context.

Overview figure: man/figures/Overview.PNG

Establishing a standardized dataset for training and validation

The computational pipeline was designed, trained and validated using 4 independent patient cohorts: a training dataset and 3 multi-center validation datasets.

Design and validation of a reference FlowSOM tree

The first part of the PIDgeon pipeline involved the development and validation of a reference FlowSOM tree using healthy control blood samples.

The validation of this reference FlowSOM tree included:

  • Comparison of the FlowSOM-based cell counts from the healthy control files with well-established age-matched healthy control ranges.
  • Correlation analysis on the training data: the patient samples were preprocessed and mapped onto the reference FlowSOM tree, and features were extracted. The FlowSOM-based features of these patients were then correlated with the cell counts obtained from conventional analysis through manual gating.
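
The second validation step can be illustrated with a minimal sketch. This is not the PIDgeon code (the pipeline builds on FlowSOM, whereas this example is plain Python). It assumes a hypothetical two-node reference grid (`codebook`), made-up marker values and made-up manual gating percentages. Each cell is assigned to its nearest node, the percentage of cells per node is used as a feature, and that feature is compared with the manual value using a Pearson correlation. The actual pipeline may use a different correlation measure and different features.

```python
import math

def nearest_node(cell, codebook):
    """Index of the codebook vector closest to the cell (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - w) ** 2 for c, w in zip(cell, codebook[i])),
    )

def cluster_percentages(cells, codebook):
    """Percentage of cells assigned to each node of the reference grid."""
    counts = [0] * len(codebook)
    for cell in cells:
        counts[nearest_node(cell, codebook)] += 1
    return [100.0 * n / len(cells) for n in counts]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical reference grid with two nodes in a two-marker space.
codebook = [(0.0, 0.0), (5.0, 5.0)]

# Hypothetical preprocessed cells per patient (made-up values).
samples = {
    "patient_1": [(0.1, 0.2), (4.9, 5.1), (5.2, 4.8), (0.3, -0.1)],
    "patient_2": [(0.0, 0.1), (5.1, 5.0), (4.7, 5.3), (5.0, 4.9)],
    "patient_3": [(0.2, 0.0), (-0.1, 0.1), (0.1, 0.3), (5.0, 5.0)],
}

# Hypothetical manual gating result (%) for the population matching node 1.
manual_pct = {"patient_1": 48.0, "patient_2": 77.0, "patient_3": 26.0}

features = {pid: cluster_percentages(cells, codebook) for pid, cells in samples.items()}
patients = sorted(samples)
r = pearson([features[p][1] for p in patients], [manual_pct[p] for p in patients])
print(f"Correlation between mapped and manual percentages: {r:.3f}")
```

A high correlation on real data would indicate that the automated feature tracks the manually gated population. With only three toy patients, the number printed here has no statistical meaning.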

Training and validation of predictive models

The second part of the PIDgeon pipeline involved the design and optimization of a diagnostic model tailored to identify lymphoid-PID during the early diagnostic PID workup and to categorize lymphoid-PID based on the IUIS classification. Both a non-hierarchical 6-class model and a 3-step 6-class hierarchical model were trained using the training dataset.
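
As a rough illustration of how a stepwise model differs from a flat one, the sketch below chains three decision steps that together resolve six classes. Everything in it is an assumption made for illustration: the class names (`class_A` to `class_F`), the feature names, the thresholds and the way the classes are split across the steps are hypothetical and are not taken from the PIDgeon models.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
Step = Callable[[Features], str]

CONTINUE = "continue"  # marker meaning "hand the sample to the next step"

def step_1(features: Features) -> str:
    # Hypothetical first step: split off one class, pass the rest on.
    return "class_A" if features["feature_1"] < 1.0 else CONTINUE

def step_2(features: Features) -> str:
    # Hypothetical second step: split off a second class.
    return "class_B" if features["feature_2"] < 1.0 else CONTINUE

def step_3(features: Features) -> str:
    # Hypothetical last step: choose among the four remaining classes.
    candidates = {
        "class_C": features["feature_3"],
        "class_D": features["feature_4"],
        "class_E": features["feature_5"],
        "class_F": features["feature_6"],
    }
    return max(candidates, key=candidates.get)

def predict_hierarchical(features: Features, steps: List[Step]) -> Tuple[str, List[str]]:
    """Run the steps in order and stop at the first one that returns a final class."""
    path = []
    for step in steps:
        label = step(features)
        path.append(label)
        if label != CONTINUE:
            return label, path
    raise ValueError("the last step must return a final class")

steps = [step_1, step_2, step_3]
early = {"feature_1": 0.5, "feature_2": 3.0, "feature_3": 0.1,
         "feature_4": 0.9, "feature_5": 0.2, "feature_6": 0.3}
late = dict(early, feature_1=2.0)

print(predict_hierarchical(early, steps))
print(predict_hierarchical(late, steps))
```

In a cascade like this, every step sees only the samples that earlier steps passed on. A flat 6-class model instead makes a single decision over all classes at once.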

The clinical utility of PIDgeon as a flow-based PID screening tool was validated using independent multi-center datasets collected in 4 EuroFlow centers (Salamanca, Prague, Leiden and Ghent), all following the EuroFlow standard operating procedures.

SHAP analysis and force plots

To allow for in-depth interpretation of the hierarchical model and to gain insight into which features had the most impact on the prediction at each step of the model, SHAP values were computed for the Ghent validation dataset. These SHAP values indicate how much each feature contributed to a patient's prediction.

The diagnoses made by the predictive model can also be inspected at the level of the individual patient using force plots. These plots show which features pushed the predictive model towards a certain diagnosis and thereby provide immunophenotypical information about the patient sample.
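
To make the idea behind SHAP values and force plots concrete, the sketch below computes exact Shapley values for a toy scoring function by enumerating all feature subsets. It is an illustration only: `toy_model`, the feature names and the baseline are hypothetical, and the SHAP values in PIDgeon were presumably computed with a dedicated library on the trained models rather than by brute force. The property a force plot visualizes is that the baseline output plus all feature contributions equals the prediction for the patient.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values; absent features are replaced by their baseline value."""
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(v):
    # Hypothetical score with one interaction term (not a PIDgeon model).
    return 0.5 * v[0] - 0.25 * v[1] + 0.1 * v[0] * v[2]

names = ["feature_1", "feature_2", "feature_3"]  # hypothetical feature names
patient = [2.0, 4.0, 1.0]
baseline = [1.0, 1.0, 0.0]

phi = shapley_values(toy_model, patient, baseline)
base_value = toy_model(baseline)
prediction = toy_model(patient)

# Text version of a force plot: largest contributions first.
for name, contribution in sorted(zip(names, phi), key=lambda kv: -abs(kv[1])):
    direction = "raises" if contribution > 0 else "lowers"
    print(f"{name}: {contribution:+.3f} ({direction} the score)")
print(f"baseline {base_value:.3f} + contributions {sum(phi):+.3f} = prediction {prediction:.3f}")
```

Exhaustive enumeration grows exponentially with the number of features, so it is only practical for a handful of them. Libraries for SHAP values rely on model-specific or sampling-based algorithms instead.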

Generation of patient reports

The final aim of PIDgeon is to generate fast and interpretable reports for new patients with suspected PID for whom flow-based PID screening is requested. To this end, a fast and easily adaptable module was built within the pipeline that allows new patients' data files to be uploaded and a patient-centred report to be generated.
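
What a patient-centred summary might contain can be sketched in a few lines. The function below is a hypothetical illustration and not the PIDgeon report module: the function name `build_report`, its arguments, the class and feature names and the text layout are all assumptions. It combines a predicted class with the features that contributed most to the prediction.

```python
def build_report(patient_id, probabilities, contributions, top_n=3):
    """Return a plain-text summary of one prediction (hypothetical layout)."""
    predicted = max(probabilities, key=probabilities.get)
    lines = [
        f"PID screening report for {patient_id}",
        f"Predicted class: {predicted} (probability {probabilities[predicted]:.2f})",
        "Most influential features:",
    ]
    # Rank features by the absolute size of their contribution.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    for name, value in ranked[:top_n]:
        direction = "towards" if value > 0 else "away from"
        lines.append(f"  {name}: {value:+.2f} ({direction} {predicted})")
    return "\n".join(lines)

# Made-up inputs for illustration only.
report = build_report(
    "patient_001",
    {"class_A": 0.7, "class_B": 0.3},
    {"feature_1": 0.4, "feature_2": -0.6, "feature_3": 0.1},
    top_n=2,
)
print(report)
```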
