Example Workflow

MicheleBortol edited this page Apr 25, 2021 · 14 revisions

SIMPLI's example workflow and dataset

Each installation of SIMPLI includes a self-contained example/test dataset and all the metadata and configuration files required for its analysis.

SIMPLI's example data and configuration

The example dataset provided with SIMPLI consists of two Imaging Mass Cytometry (IMC) images of normal colon mucosa. The images were obtained by ablating two regions of interest (ROIs) from two different FFPE blocks from two individuals who underwent surgery for the removal of colorectal cancers.

Imaging mass cytometry antibody panel

The two example images contain one channel for each of the metal-conjugated antibodies in the following panel, holding the intensities measured for that antibody.

| Metal | Marker | Target Cells / Features |
| --- | --- | --- |
| Ir191 | DNA1 | All nucleated cells |
| Sm152 | CD45 | All leukocytes |
| Yb173 | CD45RO | T cells |
| Er166 | CD45RA | T cells |
| Er170 | CD3 | T cells |
| Dy162 | CD8a | T cells |
| Ho165 | PD1 | T cells |
| Gd156 | CD4 | T cells, Macrophages |
| Gd155 | FoxP3 | T cells |
| Yb171 | CD27 | B cells, T cells |
| Dy161 | CD20 | B cells |
| Pr141 | IgA | B cells |
| Tm169 | IgM | B cells |
| Tb159 | CD68 | Macrophages |
| Nd146 | CD16 | Macrophages |
| Lu175 | CD11c | Macrophages, Dendritic cells |
| Nd150 | PDL1 | Macrophages, Dendritic Cells |
| Nd148 | PanKeratin | Epithelial Cells |
| Gd158 | eCadherin | Epithelial Cells |
| Er168 | Ki67 | Proliferating cells |
| Nd143 | Vimentin | Stromal cells |
| Dy164 | CD34 | Endothelial cells |
| Nd142 | SMA | Smooth muscle |
| Yb176 | CollagenIV | Basement membrane cells |
| Yb174 | CAMK4 | Various |
| Sm154 | VEGFc | Various |
| Sm147 | IFNA5 | Various |

Raw image files

The raw image files used in SIMPLI's example workflow are two .txt IMC acquisition files, each containing one Region Of Interest (ROI). These files are available in SIMPLI's repository and in your local SIMPLI installation folder: SIMPLI/test/raw_data/.

Example Metadata and Configuration

The metadata and configuration files required for running an analysis of the example dataset are stored at SIMPLI/test/ and include:

Running the example analysis

The most recent version of SIMPLI can be downloaded and run directly from this repository with:

nextflow run https://github.com/ciccalab/SIMPLI -profile test

Alternatively, the example analysis can be run from an existing installation of SIMPLI with:

nextflow main.nf -profile test

Analysis of two normal colon mucosa samples with SIMPLI.

SIMPLI Analysis Steps

A) Raw image processing

The first step in the example analysis workflow is the preprocessing of raw images, which consists of 3 processes:

A.1) Image Extraction

In this process, tiff files are extracted from the raw imaging mass cytometry acquisition data.

Inputs and parameters:

Outputs:

  • Images: Uncompressed single channel 16 bit tiff files (one for each of the 27 selected channels) ($test_output/Images/Raw/sample_name/sample_name-label-raw.tiff)
  • Metadata:
    • Metadata for all images from both samples: $test_output/Images/Raw/raw_tiff_metadata.csv
    • By-sample metadata for the raw images is also output at:
      $test_output/Images/Raw/sample_name/sample_name-raw_tiff_metadata.csv
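
As a rough illustration of the naming pattern above, the following sketch builds the expected raw image paths for one sample. The function name raw_tiff_path, the sample name and the label values are hypothetical and chosen only for this example; only the path pattern comes from the list above.

```python
from pathlib import PurePosixPath


def raw_tiff_path(output_dir: str, sample_name: str, label: str) -> PurePosixPath:
    """Build the path of one raw single-channel tiff, following the pattern
    <output>/Images/Raw/<sample_name>/<sample_name>-<label>-raw.tiff."""
    return (PurePosixPath(output_dir) / "Images" / "Raw" / sample_name
            / f"{sample_name}-{label}-raw.tiff")


# Hypothetical sample name and labels, for illustration only.
paths = [raw_tiff_path("test_output", "sample_1", label) for label in ("CD3", "CD45")]
for path in paths:
    print(path)
```

The same approach can be adapted to the Normalized and Preprocessed folders described in the following sections.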

A.2) Image normalisation

This process performs 99th percentile normalisation of the raw tiff images generated in the Image extraction process.
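
The idea behind 99th percentile normalisation can be sketched in plain Python as follows. This is not SIMPLI's own code: the function names are hypothetical, and the choices to interpolate linearly between ranks, to clip values above the 99th percentile and to rescale to the 16 bit range are assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    rank = (len(ordered) - 1) * q / 100.0
    low, high = math.floor(rank), math.ceil(rank)
    fraction = rank - low
    return ordered[low] + (ordered[high] - ordered[low]) * fraction


def normalise_99th(image, max_value=65535):
    """Rescale a 2D image (list of rows) so its 99th percentile maps to max_value.

    Assumption: values above the 99th percentile are clipped to max_value.
    """
    flat = [pixel for row in image for pixel in row]
    p99 = percentile(flat, 99)
    if p99 == 0:
        # An empty channel stays empty.
        return [[0 for _ in row] for row in image]
    return [[min(max_value, round(pixel / p99 * max_value)) for pixel in row]
            for row in image]


# Made-up 2x2 image with one very bright outlier pixel.
normalised = normalise_99th([[0, 10], [20, 1000]])
```

Scaling by a high percentile rather than by the maximum makes the result less sensitive to a few extremely bright pixels.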

Inputs and parameters:

Outputs:

  • Normalised Images: Images (uncompressed 16 bit tiff) can be output in two different formats:
    • single channel tiff files (one for each of the selected channels) ($output_folder/Images/Normalized/sample_name/sample_name-label-normalized.tiff)
    • .ome.tiff files (one per sample, the order of channels is the same as in the channel_metadata file). (output_folder/Images/Normalized/sample_name/sample_name-ALL-normalized.ome.tiff)
  • Metadata:
    • Metadata for all images from all samples: $output_folder/Images/Normalized/normalized_tiff_metadata.csv
    • By-sample metadata for the normalised images is also output at:
      • test_output/Images/Normalized/sample_name/sample_name-normalized_tiff_metadata.csv in long format.
      • test_output/Images/Normalized/sample_name/sample_name-normalized_tiff_metadata.csv in CellProfiler4 compatible wide format.

A.3) Image thresholding and masking

This process performs the image preprocessing that generates the final images used as input for the pixel-based or the cell-based analysis. Its input images are those generated in the Image normalisation process.

Inputs and parameters:

In this example for each marker we:

  1. Generate a mask without background noise by thresholding with the Threshold CellProfiler4 module.
  2. Apply the mask to the normalised image to remove its background noise with the MaskImage CellProfiler4 module.
  3. Save the resulting image as an uncompressed 16 bit single channel tiff file with the SaveImages CellProfiler4 module.
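
The first two steps above can be sketched in plain Python as follows. This only illustrates the threshold-then-mask idea and is not a reproduction of the CellProfiler4 modules: the function names, the fixed threshold value and the rule that pixels strictly above the threshold are kept are all assumptions.

```python
def threshold_mask(image, threshold):
    """Return a binary mask: True where the pixel is above the threshold."""
    return [[pixel > threshold for pixel in row] for row in image]


def apply_mask(image, mask):
    """Keep pixels inside the mask and set everything else to 0."""
    return [[pixel if keep else 0 for pixel, keep in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


# Made-up 2x2 normalised image and a hypothetical threshold.
normalised = [[120, 4000], [30000, 15]]
mask = threshold_mask(normalised, threshold=100)
preprocessed = apply_mask(normalised, mask)
print(preprocessed)
```

Pixels that survive the mask keep their original intensity, so the preprocessed image can still be used for intensity measurements downstream.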

Outputs:

  • Preprocessed Images: (uncompressed 16 bit single-channel tiff)
    test_output/Images/Preprocessed/sample_name/sample_name-label-Preprocessed.tiff
  • Metadata:
    • Metadata for all images from all samples: $output_folder/Images/Preprocessed/preprocessed_tiff_metadata.csv
    • By-sample metadata for the preprocessed images is also output at:
      • test_output/Images/Preprocessed/sample_name/sample_name-preprocessed_metadata.csv in long format.
      • test_output/Images/Preprocessed/sample_name-cp4-preprocessed_metadata.csv in CellProfiler4 compatible wide format.

B) Pixel-based analysis

The pixel-based approach implemented in SIMPLI enables the quantification of pixels which are positive for a specific marker or combination of markers. These marker-positive areas can be normalised over the area of the whole image, or over the area of an image mask defined by combining any of the input images with logical operators.

B.1) Measurement of marker-positive areas

This process measures the areas of interest and normalises them on the selected image masks according to the input metadata. The input images for this process are derived from the images generated in the image thresholding and masking process.

Inputs and parameters:

  • preprocessed_metadata_file with the tiff image metadata.
  • area_measurements_metadata = /test/metadata/marker_area_metadata.csv Path to the area_measurements_metadata file.

In this example analysis we measure the area of each marker normalised over the area of the ROI. We also measure the following combinations of markers, corresponding to different T cell phenotypes, normalised over the area of the T cell population defined by CD3. Each entry lists the measured area followed, after the comma, by the area it is normalised over:

  • CD3 & CD45RA,CD3 = Naive T cells.
  • CD3 & CD8a,CD3 = CD8+ T cells.
  • CD3 & CD4 & !CD8a,CD3 = CD4+CD8- T cells.

We are also measuring the following normalised areas:

  • CD68 & CD16,CD68 = CD16+ Macrophages (macrophages are defined as CD68+ areas).
  • Vimentin,PanKeratin | eCadherin = Vimentin positive areas overlapping epithelial areas (we expect to see very little to no overlap).
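
A minimal sketch of how one of these entries could be evaluated on binary masks is shown below. It is not SIMPLI's implementation: the helper names are hypothetical, the tiny images are made up, and treating every pixel with intensity above 0 in a preprocessed image as marker-positive is an assumption.

```python
def positive(image):
    """Binary mask of marker-positive pixels (assumed here: intensity > 0)."""
    return [[pixel > 0 for pixel in row] for row in image]


def combine(op, *masks):
    """Combine masks pixel by pixel with a function of booleans."""
    return [[op(*pixels) for pixels in zip(*rows)] for rows in zip(*masks)]


def area(mask):
    """Area of a mask in pixels."""
    return sum(sum(row) for row in mask)


# Tiny made-up 2x3 preprocessed images.
cd3 = positive([[5, 7, 0], [9, 3, 0]])
cd4 = positive([[1, 0, 0], [2, 4, 6]])
cd8a = positive([[0, 0, 0], [8, 0, 0]])

# "CD3 & CD4 & !CD8a" normalised over "CD3".
cd4_t = combine(lambda a, b, c: a and b and not c, cd3, cd4, cd8a)
print(area(cd4_t), area(cd3), area(cd4_t) / area(cd3))
```

A normalising mask built with the "|" operator, such as PanKeratin | eCadherin, could be obtained the same way with `combine(lambda a, b: a or b, ...)`.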

Outputs: The area measurements are saved in test_output/area_measurements.csv.

All areas are in pixel².

B.2) Pixel-based analysis visualisation

Generate boxplots comparing the distributions of normalised marker-positive areas between 2 categories of samples. The input data for this process is derived from the area measurements generated in the measurement of marker-positive areas process.

Inputs and parameters:

  • sample_metadata_file with the metadata of all samples used in the analysis.
  • area_measurements_file Path to the area_measurements_file.

FDR is calculated using the number of different marker values for each value of main_marker.
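
The grouping described above can be sketched as follows. The text does not name the correction procedure or the statistical test that produces the p-values, so the use of the Benjamini-Hochberg procedure here is an assumption, as are the function names and the made-up p-values. The only point taken from the text is that the number of tests is counted separately for each main_marker.

```python
from collections import defaultdict


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def fdr_by_main_marker(results):
    """results: list of (main_marker, marker, p_value) tuples.

    The correction is applied separately within each main_marker, so the
    number of tests equals the number of markers for that main_marker.
    """
    groups = defaultdict(list)
    for main_marker, marker, p_value in results:
        groups[main_marker].append((marker, p_value))
    corrected = {}
    for main_marker, entries in groups.items():
        adjusted = benjamini_hochberg([p for _, p in entries])
        for (marker, _), fdr in zip(entries, adjusted):
            corrected[(main_marker, marker)] = fdr
    return corrected


# Made-up p-values, for illustration only.
example = fdr_by_main_marker([
    ("CD3", "CD3 & CD8a", 0.02),
    ("CD3", "CD3 & CD45RA", 0.01),
    ("CD68", "CD68 & CD16", 0.03),
])
```

With this grouping, a main_marker with a single associated marker keeps its p-value unchanged, while main_markers with several markers are corrected for that number of comparisons only.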

Outputs: The boxplots are saved in test_output/Plots/Area_Plots/Boxplots/, where a separate folder is created for each main_marker. Each folder contains a pdf file (test_output/Plots/Area_Plots/Boxplots/main_marker/main_marker_area_boxplots.pdf) with one boxplot for each value of marker associated with that main_marker.

C) Cell-based analysis

The cell-based analysis aims to investigate the qualitative and quantitative cell representation within the imaged tissue through (1) cell segmentation, (2) cell phenotyping by unsupervised clustering and expression thresholding, and (3) spatial analysis of cell densities (homotypic spatial analysis) and distances (heterotypic spatial analysis).

C.1) Cell segmentation

Generate the single-cell data in .csv format and the cell masks in tiff format. The input data for this process is derived from the images generated in the image thresholding and masking process.

Inputs and parameters:

In this example we:

  1. Generate an image corresponding to our cell membranes with the ImageMath CellProfiler4 module, by adding the following channels:
    • CD45
    • Pan-Keratin
    • E-Cadherin
  2. Identify the nuclei with the IdentifyPrimaryObjects CellProfiler4 module.
  3. Expand the nuclei annotations using the membrane image with the IdentifySecondaryObjects CellProfiler4 module to obtain the cells.
  4. Generate the cell masks with the ConvertObjectsToImage CellProfiler4 module.
  5. Measure the intensity of each marker in our panel (from the preprocessed images without background noise) with the MeasureObjectIntensity CellProfiler4 module.
  6. Measure the size/shape parameters of each cell with the MeasureObjectSizeShape CellProfiler4 module.
  7. Save the cell mask images with the SaveImages CellProfiler4 module.
  8. Export the single-cell data with all measurements to a .csv file (compatible with Excel) with the ExportToSpreadsheet CellProfiler4 module.

Outputs:

  • Single cell data:
    • Single cell data for all samples: test_output/Segmentation/unannotated_cells.csv
    • Single cell data for each sample separately: test_output/Segmentation/sample_name/sample_name-Cells.csv
  • Cell masks:
    Cell masks in uint16 tiff format: test_output/Segmentation/sample_name/sample_name-Cell_Mask.tiff. Each cell is assigned a unique identity number from 1 to 2^16-1 (65535). All the pixels belonging to a given cell have their value set to its identity number. Pixels not belonging to any cell are set to 0.
    These images are compatible with several other tools for downstream analysis including:
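
Separately from the tools mentioned above, the cell mask encoding itself can be read with very little code. The following sketch counts the cells in a mask and measures their areas; the function name and the tiny mask are made up for illustration, and loading the tiff file into a list of rows is left out.

```python
from collections import Counter


def cell_sizes(mask):
    """Map each cell identity number to its area in pixels.

    Pixels with value 0 do not belong to any cell and are skipped.
    """
    counts = Counter(pixel for row in mask for pixel in row if pixel != 0)
    return dict(counts)


# Made-up 3x4 mask with two cells (1 and 2) on a 0 background.
mask = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
]
sizes = cell_sizes(mask)
print(len(sizes), sizes)
```

Because each cell has its own identity number, the number of distinct non-zero values in the mask equals the number of segmented cells.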