Disease Course Sequence Subtyping with Subtype and Stage Inference

by Alex Young

[Figure: Disease Course Sequence Subtyping: Subtype and Stage Inference (sustain.png)]

Subtype and Stage Inference (SuStaIn) is a generalisation of event-based modelling that adds clustering to discover multiple data-driven sequences of disease progression using cross-sectional data.

The software:

  • constructs a subtype model of a chronic, progressive disease consisting of multiple pathophysiological cascades (fine-grained temporal sequences of events);

  • stages and subtypes individuals within the model, representing cumulative abnormality along each subtype progression sequence;

  • does all of this probabilistically, without predefined biomarker cutpoints.
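To make the idea of probabilistic subtyping and staging concrete, here is a minimal toy sketch. It is not the pySuStaIn implementation: the function name `assign_subtype_and_stage`, the likelihood table, and the subtype fractions are all hypothetical, and a uniform prior over stages is assumed. Given the likelihood of one individual's data under each (subtype, stage) pair, it picks the most probable subtype and then the most probable stage within that subtype.

```python
def assign_subtype_and_stage(likelihood, fractions):
    """Toy probabilistic assignment of one individual.

    likelihood[c][k] is p(data | subtype c, stage k) (hypothetical values);
    fractions[c] is the assumed share of the population in subtype c.
    A uniform prior over stages is assumed.
    """
    # Joint probability of (subtype, stage), up to a normalising constant.
    joint = [[f * p for p in row] for f, row in zip(fractions, likelihood)]
    total = sum(sum(row) for row in joint)
    posterior = [[v / total for v in row] for row in joint]

    # Marginalise over stages to get the probability of each subtype.
    prob_subtype = [sum(row) for row in posterior]
    ml_subtype = max(range(len(prob_subtype)), key=prob_subtype.__getitem__)

    # Stage probabilities conditional on the most probable subtype.
    stage_probs = [v / prob_subtype[ml_subtype] for v in posterior[ml_subtype]]
    ml_stage = max(range(len(stage_probs)), key=stage_probs.__getitem__)

    return ml_subtype, prob_subtype[ml_subtype], ml_stage, stage_probs[ml_stage]


# Two subtypes, three stages; all numbers are made up for illustration.
example_likelihood = [[0.01, 0.20, 0.05],
                      [0.02, 0.03, 0.01]]
example_fractions = [0.6, 0.4]
subtype, p_subtype, stage, p_stage = assign_subtype_and_stage(
    example_likelihood, example_fractions)
print(subtype, round(p_subtype, 3), stage, round(p_stage, 3))
```

No biomarker cutpoint appears anywhere in the sketch: each assignment comes with a probability, so uncertain individuals can be identified rather than forced into a category.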


The software for classical SuStaIn is distributed via the UCL POND group’s GitHub account.

The software should run across operating systems; specific requirements, such as Python package versions, are detailed in each repository.

Usage

The pySuStaIn package includes user-friendly functions to perform key operations in the Disease Course Sequencing pipeline:

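pySuStaIn.ZscoreSustain.run_sustain_algorithm(...)

Fits the subtype model to multimodal biomarker data expressed as z-scores and returns the estimated progression sequences together with subtype and stage probabilities for each individual. (Mixture-model likelihoods, for example those based on Kernel Density Estimation, belong to the separate pySuStaIn.MixtureSustain class rather than to ZscoreSustain.)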

pySuStaIn.ZscoreSustain._plot_sustain_model

Plotting tools for visualizing model outputs.
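The following is a hedged sketch of how the fitting step might be called, not an authoritative recipe. The helpers `zscore_against_controls` and `fit_zscore_sustain` are hypothetical names introduced here, and the settings (z-score events at 1, 2 and 3, a maximum z-score of 5, 25 start points, up to 3 subtypes) are illustrative assumptions. The positional argument order for `ZscoreSustain` reflects our understanding of the package and should be checked against the repository for the version you install.

```python
import os
from statistics import mean, stdev


def zscore_against_controls(values, control_values, higher_is_abnormal=True):
    """Z-score one biomarker relative to controls so larger means more abnormal."""
    mu = mean(control_values)
    sigma = stdev(control_values)
    # Flip the sign for biomarkers that decrease as the disease progresses.
    sign = 1.0 if higher_is_abnormal else -1.0
    return [sign * (v - mu) / sigma for v in values]


def fit_zscore_sustain(zdata, biomarker_labels, output_folder="sustain_output"):
    """Fit ZscoreSustain to z-scored data (rows: individuals, columns: biomarkers)."""
    # Imported lazily so the helper above works without these packages installed.
    import numpy as np
    import pySuStaIn

    zdata = np.asarray(zdata, dtype=float)
    n_biomarkers = zdata.shape[1]

    # Illustrative settings only (assumptions, not recommendations).
    Z_vals = np.tile([1.0, 2.0, 3.0], (n_biomarkers, 1))  # z-score events
    Z_max = np.full(n_biomarkers, 5.0)                    # maximum z-score
    os.makedirs(output_folder, exist_ok=True)

    model = pySuStaIn.ZscoreSustain(
        zdata, Z_vals, Z_max, biomarker_labels,
        25,             # N_startpoints
        3,              # N_S_max (maximum number of subtypes)
        int(1e5),       # N_iterations_MCMC
        output_folder,
        "example",      # dataset_name
        False,          # use_parallel_startpoints
    )
    # Returns sequence samples, subtype fractions, and per-individual
    # subtype/stage assignments with their probabilities.
    return model.run_sustain_algorithm()


# The z-scoring helper runs on its own with made-up values.
patient_z = zscore_against_controls([4.0, 2.0], [1.0, 2.0, 3.0])
```

Keeping the z-scoring separate from the model fit makes it easy to check the sign convention of each biomarker before committing to a long MCMC run.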

Tutorial(s)

We have developed an introductory tutorial to help you understand Disease Course Sequence Subtyping, and we plan to add an example using real data from a publicly available dataset.

Tutorial 1: SuStaIn and simulated data

This introduction to Subtype and Stage Inference is a walkthrough where you will fit a subtype model using the pySuStaIn software on simulated data.

Go to the tutorial (30 minutes, cross-sectional data)

Tutorial 2: SuStaIn and real data

This planned walkthrough involves fitting a subtype model using the pySuStaIn software on real data.

The data will probably come from ADNI and will not be provided here.

Tutorial link will go here