
Analysis

MicheleBortol edited this page Jan 27, 2021 · 30 revisions

1. Raw data extraction output

Raw images are extracted from the raw data according to the raw_metadata_file and tiff_type parameters. They can have the following formats:

  • .tiff: single channel tiff files, one for each of the selected channels.
  • .ome.tiff: one file; the order of channels is the same as in the metadata file.

The images have an associated metadata file: raw_tiff_metadata.csv
These images and their metadata are saved at: $output_folder/Images/Raw/

2. Image preprocessing and Pixel analysis output

2.1 Image preprocessing output

Normalized tiff images

Normalized tiff images are produced according to the raw_metadata_file and tiff_type parameters. They can have the following formats:

  • .tiff: single channel tiff files, one for each of the selected channels.
  • .ome.tiff: one file; the order of channels in the .ome.tiff file is the same as in the metadata file.

The images have an associated metadata file: normalized_tiff_metadata.csv
These images and their metadata are saved at: $output_folder/Images/Normalized/

Preprocessed tiff images

Tiff images preprocessed with CellProfiler4:

  • Single channel tiff files, as specified in the CellProfiler4 cp4_preprocessing_cppipe pipeline.
    The images are named according to this convention: SAMPLE-LABEL-Preprocessed.tiff

The images have an associated metadata file: preprocessed_tiff_metadata.csv
These images and their metadata are saved at: $output_folder/Images/Preprocessed/

2.2 Pixel Analysis output

Area measurements table

The area measurements are saved in $output_folder/area_measurements.csv. The file has the following columns:

  • sample_name
  • main_marker = combination of markers used to normalize the marker area.
  • marker = combination of markers measured.
  • area = Area positive for the marker combination of markers.
  • main_marker_area = Area positive for the main_marker combination of markers.
  • total_ROI_area = Total area of the ROI.
  • percentage = area / main_marker_area * 100.

All areas are in pixel².
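As a sanity check on the column definitions above, the percentage column can be recomputed from the area and main_marker_area columns. The following is a minimal sketch using only the Python standard library. The column names are the ones listed above; the sample names, marker names, and values are invented for illustration and do not come from the pipeline.

```python
import csv
import io

# Hypothetical excerpt of area_measurements.csv (all values invented for illustration).
EXAMPLE_CSV = """sample_name,main_marker,marker,area,main_marker_area,total_ROI_area,percentage
S1,DAPI,CD45,250,1000,4000,25.0
S1,DAPI,CD3,500,1000,4000,50.0
"""

def recompute_percentages(handle):
    """Return (sample_name, marker, percentage) tuples recomputed from the area columns."""
    results = []
    for row in csv.DictReader(handle):
        main_area = float(row["main_marker_area"])
        # Guard against division by zero when the main marker has no positive area.
        if main_area > 0:
            pct = float(row["area"]) / main_area * 100
        else:
            pct = float("nan")
        results.append((row["sample_name"], row["marker"], pct))
    return results

rows = recompute_percentages(io.StringIO(EXAMPLE_CSV))
```

With a real output, the same function can be called on the file opened from $output_folder/area_measurements.csv instead of the in-memory example.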

Area measurements plots (Optional)

These boxplots are generated only if there is a comparison metadata column with 2 different categories of samples and at least one sample per category.
For each valid comparison metadata column, one pdf file is produced for each main_marker. The pdf contains one boxplot for each marker associated with that main_marker. The FDR is calculated using the Benjamini-Hochberg procedure for all the markers associated with a main_marker.
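For readers unfamiliar with the Benjamini-Hochberg procedure, the sketch below shows the textbook computation of adjusted p-values for one family of tests (here, all the markers associated with one main_marker). It is an illustration of the general procedure, not the pipeline's own code, and the p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping the adjusted
    # values monotone by taking a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values, one per marker tested against the same main_marker.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each raw p-value is scaled by the number of tests divided by its rank, so the correction becomes stricter as more markers are tested within the same family.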

3. Cell level analysis output

3.1 Cell Segmentation output

Cell masks

Cell masks in uint16 tiff format. Each cell is assigned a unique identity number from 1 to 2¹⁶ - 1 (65535). All the pixels belonging to a given cell have their value set to its identity number. Pixels not belonging to any cell are set to 0. These images are compatible with several other tools for downstream analysis.

The cell masks are saved in $output_folder/Segmentation/SAMPLE/SAMPLE-Cell_Mask.tiff
The CellProfiler4 measurements by sample are saved in $output_folder/Segmentation/SAMPLE/SAMPLE-Cells.csv
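The mask encoding described above (0 for background, one identity number per cell) makes simple per-cell queries straightforward. The sketch below counts the pixels of each cell in a small toy mask written as nested lists; the mask values are invented, and loading a real tiff file is deliberately left out.

```python
from collections import Counter

# Toy 4x5 mask standing in for a SAMPLE-Cell_Mask.tiff image (values invented).
mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 2, 2],
    [3, 0, 0, 2, 2],
]

MAX_CELL_ID = 2**16 - 1  # largest value a uint16 pixel can hold

def cell_areas(mask_rows):
    """Count pixels per cell identity number, ignoring background pixels (0)."""
    counts = Counter(value for row in mask_rows for value in row if value != 0)
    for cell_id in counts:
        if not 1 <= cell_id <= MAX_CELL_ID:
            raise ValueError(f"cell id {cell_id} is outside the uint16 range")
    return dict(counts)

areas = cell_areas(mask)
```

With a real mask, the nested list would be replaced by an array read with an image library of your choice (for example `tifffile.imread`, assuming that package is installed), and the keys of the result correspond to the ObjectNumber values of the cells.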

3.2 Cell Type level output

Cell Type level table

The cell type level table is a .csv table with a row for each cell and the following annotations:

  • ImageNumber: CellProfiler4 specific image identifier.
  • ObjectNumber: Unique identity number from 1 to 2¹⁶ - 1 (65535), matches the corresponding pixels in the cell masks.
  • Metadata_sample_name
  • CellProfiler4 area shape measurements (optional): Can be included if the user plans to use them for downstream analysis.
  • CellProfiler4 marker intensity measurements: Per-cell intensity values for the measured markers.
  • cell_type: Name used to identify the cell type during the analysis.
  • CellName: Cell identity string in the form: Metadata_sample_name_ObjectNumber

The exact set of fields and their order depend on the CellProfiler4 pipeline params.cp4_segmentation_cppipe
The cell type level table is saved at: $output_folder/annotated_cells.csv
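Because CellName is built from Metadata_sample_name and ObjectNumber, it can be used to link a row of the table back to a sample and to a cell in that sample's mask. The helpers below are a minimal sketch assuming the two parts are joined by a single underscore, as the form Metadata_sample_name_ObjectNumber suggests; the function names and the sample name are hypothetical.

```python
def make_cell_name(sample_name, object_number):
    """Build a CellName string from a sample name and an ObjectNumber."""
    return f"{sample_name}_{object_number}"

def split_cell_name(cell_name):
    """Recover (sample_name, object_number) from a CellName string."""
    # Split on the LAST underscore, since sample names may contain underscores.
    sample_name, object_number = cell_name.rsplit("_", 1)
    return sample_name, int(object_number)

name = make_cell_name("patient_01", 42)
parts = split_cell_name(name)
```

Splitting from the right keeps the round trip correct even when the sample name itself contains underscores.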

Cell Type level plots

The cell type level plots are saved in $output_folder/Plots/Cell_Type_Plots/ and they are divided into:

  • Barplots: $output_folder/Plots/Cell_Type_Plots/Barplots/barplots.pdf
    A .pdf file with barplots with the proportions of all cell types + unassigned cells in:

    • Each sample: one bar per sample.
    • Each category (optional): one bar per category, only for comparison metadata columns with 2 categories.
  • Overlays: $output_folder/Plots/Cell_Type_Plots/Overlays/

    • One overlay-SAMPLE.tiff image per sample. Each cell is colored by cell type according to the color specified in the cell types metadata file.
    • overlay_legend.pdf: legend mapping each cell type to its color.
  • Boxplots (Optional): $output_folder/Plots/Cell_Type_Plots/Boxplots/
    For each comparison metadata column with 2 categories, a .pdf file is produced with one boxplot for each cell type + unassigned cells. The FDR is calculated with the Benjamini-Hochberg procedure.
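The per-sample proportions shown in the barplots can be reproduced from the cell type level table. The sketch below works on a few invented rows reduced to the Metadata_sample_name and cell_type columns. The cell type names and the label "UNASSIGNED" for unassigned cells are assumptions made for this example, not values confirmed by the pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical rows from annotated_cells.csv, reduced to the two columns used here.
cells = [
    {"Metadata_sample_name": "S1", "cell_type": "T_cell"},
    {"Metadata_sample_name": "S1", "cell_type": "T_cell"},
    {"Metadata_sample_name": "S1", "cell_type": "B_cell"},
    {"Metadata_sample_name": "S1", "cell_type": "UNASSIGNED"},  # assumed label
    {"Metadata_sample_name": "S2", "cell_type": "T_cell"},
    {"Metadata_sample_name": "S2", "cell_type": "UNASSIGNED"},  # assumed label
]

def cell_type_proportions(rows):
    """Return {sample: {cell_type: fraction of that sample's cells}}."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["Metadata_sample_name"]][row["cell_type"]] += 1
    return {
        sample: {ctype: n / sum(per_type.values()) for ctype, n in per_type.items()}
        for sample, per_type in counts.items()
    }

proportions = cell_type_proportions(cells)
```

The fractions within each sample sum to 1, which corresponds to one full bar per sample in the barplot.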

3.3 Cell clustering level output (Optional)

For the cell clustering level output to be produced, at least one cell type must be selected for clustering.

Cell clustering level table

The Cell clustering level table is a .csv table with a row for each cell in the cell types that underwent clustering and the following annotations:

  • CellName: Cell identity string in the form: Metadata_sample_name_ObjectNumber
  • Metadata_sample_name
  • Clustering resolution columns: res-RESOLUTION-ids for each clustered cell type. Clusters are numbered from 0; the same numbering is used in the plots.
  • ImageNumber: CellProfiler4 specific image identifier.
  • ObjectNumber: Unique identity number from 1 to 2¹⁶ - 1 (65535), matches the corresponding pixels in the cell masks.
  • CellProfiler4 area shape measurements (optional): Can be included if the user plans to use them for downstream analysis.
  • CellProfiler4 marker intensity measurements: Per-cell intensity values for the measured markers.
  • cell_type: Name used to identify the cell type during the analysis.

The exact set of fields and their order depend on the CellProfiler4 pipeline params.cp4_segmentation_cppipe
The annotated cell table is saved at: $output_folder/Cell_Clusters/clustered_cells.csv
The same data in .csv and .RData format (Seurat object) is saved separately by cell type in: $output_folder/Cell_Clusters/CELL_TYPE
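The clustering resolution columns can be located by name and summarized per cluster. The sketch below assumes the columns are literally named res-<number>-ids (for example res-0.5-ids), following the res-RESOLUTION-ids pattern above; the real header may differ, and the header and rows used here are invented.

```python
import re
from collections import Counter

# Assumed naming pattern for the clustering resolution columns.
RES_COLUMN = re.compile(r"^res-(?P<resolution>[0-9.]+)-ids$")

def resolution_columns(header):
    """Map each clustering resolution (as a string) to its column name."""
    found = {}
    for name in header:
        match = RES_COLUMN.match(name)
        if match:
            found[match.group("resolution")] = name
    return found

def cluster_percentages(rows, column):
    """Percentage of cells in each cluster, out of all cells in the given rows."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {cluster: 100 * n / total for cluster, n in counts.items()}

# Hypothetical header and rows from clustered_cells.csv.
header = ["CellName", "Metadata_sample_name", "res-0.5-ids", "res-1-ids", "cell_type"]
rows = [
    {"res-0.5-ids": "0"},
    {"res-0.5-ids": "0"},
    {"res-0.5-ids": "1"},
    {"res-0.5-ids": "0"},
]

columns = resolution_columns(header)
percentages = cluster_percentages(rows, columns["0.5"])
```

Restricting the rows to one sample and one clustered cell type before calling cluster_percentages gives the per-sample values of the kind shown in the cluster boxplots.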

Cell cluster level plots

The cell cluster level plots are saved in $output_folder/Plots/Cell_Cluster_Plots/ and they are divided into:

  • UMAPs: $output_folder/Plots/Cell_Cluster_Plots/CELL_TYPE/UMAPs/
    For each clustering resolution a .pdf file with UMAP plots colored by:

  • Boxplots (Optional): $output_folder/Plots/Cell_Type_Plots/Boxplots/
    For each comparison metadata column with 2 categories:
    For each level of resolution a .pdf file is produced. The file contains:
    - Heatmap: showing for each cluster the expression of the markers used for the clustering.
    - Boxplots: one for each cluster, with the percentage of cells belonging to that cluster out of the total cells in the clustered cell type. The FDR is calculated using the Benjamini-Hochberg procedure for all clusters.

  • Heatmaps (Optional): If there is no comparison metadata column with 2 categories:
    For each level of resolution a .pdf file is produced containing a heatmap showing for each cluster the expression of the markers used for the clustering.
