Usage: ONTraC <--NN-dir NN_DIR> <--GNN-dir GNN_DIR> <--NT-dir NT_DIR> <--meta-input META_INPUT>
[--n-cpu N_CPU] [--n-neighbors N_NEIGHBORS] [--n-local N_LOCAL] [--device DEVICE] [--epochs EPOCHS] [--patience PATIENCE]
[--min-delta MIN_DELTA] [--min-epochs MIN_EPOCHS] [--batch-size BATCH_SIZE] [-s SEED] [--lr LR] [--hidden-feats HIDDEN_FEATS]
[--n-gcn-layers N_GCN_LAYERS] [-k K] [--modularity-loss-weight MODULARITY_LOSS_WEIGHT] [--purity-loss-weight PURITY_LOSS_WEIGHT]
[--regularization-loss-weight REGULARIZATION_LOSS_WEIGHT] [--beta BETA] [--trajectory-construct TRAJECTORY_CONSTRUCT]
All steps of ONTraC including niche network construction, GNN, and niche construction.
Options:
--version show program's version number and exit
-h, --help show this help message and exit
IO:
--NN-dir=NN_DIR Directory for niche network outputs.
--GNN-dir=GNN_DIR Directory for the GNN output.
--NT-dir=NT_DIR Directory for the niche trajectory output.
--meta-input=META_INPUT
Meta data file in csv format. Each row is a cell. The first column should be the cell name with column name
Cell_ID. Coordinates (x, y) and sample should be included. Cell type is required for cell-level data.
--preprocessing-dir=PREPROCESSING_DIR
This options will be deprecated from v3.0. Please use --NN-dir instead.
--NTScore-dir=NTSCORE_DIR
This options will be deprecated from v3.0. Please use --NT-dir instead.
-d DATASET, --dataset=DATASET
This options will be deprecated from v3.0. Please use --meta-input instead.
Niche Network Construction:
--n-cpu=N_CPU Number of CPUs used for parallel computing in dataset preprocessing. Default is 4.
--n-neighbors=N_NEIGHBORS
Number of neighbors used for kNN graph construction. It should be less than the number of cells in each
sample. Default is 50.
--n-local=N_LOCAL Specifies the nth closest local neighbors used for gaussian distance normalization. It should be less than
the number of cells in each sample. Default is 20.
Options for GNN training:
--device=DEVICE Device for training. We support cpu and cuda now. Auto select if not specified.
--epochs=EPOCHS Number of maximum epochs for training. Default is 1000.
--patience=PATIENCE
Number of epochs wait for better result. Default is 100.
--min-delta=MIN_DELTA
Minimum delta for better result. Default is 0.001
--min-epochs=MIN_EPOCHS
Minimum number of epochs for training. Default is 50. Set to 0 to disable.
--batch-size=BATCH_SIZE
Batch size for training. Default is 0 for whole dataset.
-s SEED, --seed=SEED
Random seed for training. Default is random.
--lr=LR Learning rate for training. Default is 0.03.
--hidden-feats=HIDDEN_FEATS
Number of hidden features. Default is 4.
--n-gcn-layers=N_GCN_LAYERS
Number of GCN layers. Default is 2.
-k K, --k-clusters=K
Number of niche clusters. Default is 6.
--modularity-loss-weight=MODULARITY_LOSS_WEIGHT
Weight for modularity loss. Default is 0.3.
--purity-loss-weight=PURITY_LOSS_WEIGHT
Weight for purity loss. Default is 300.
--regularization-loss-weight=REGULARIZATION_LOSS_WEIGHT
Weight for regularization loss. Default is 0.1.
--beta=BETA Beta value control niche cluster assignment matrix. Default is 0.03.
Options for niche trajectory:
--trajectory-construct=TRAJECTORY_CONSTRUCT
Method to construct the niche trajectory. Default is 'BF' (brute-force). A faster alternative is 'TSP'.
Previouly named as createDataSet.
Usage: ONTraC_NN <--NN-dir NN_DIR> <--meta-input META_INPUT>
[--n-cpu N_CPU] [--n-neighbors N_NEIGHBORS] [--n-local N_LOCAL]
Create niche network and calculate features (normalized cell type composition). (Step 1 of ONTraC)
Options:
--version show program's version number and exit
-h, --help show this help message and exit
IO:
--NN-dir=NN_DIR Directory for niche network outputs.
--meta-input=META_INPUT
Meta data file in csv format. Each row is a cell. The first column should be the cell name with column name
Cell_ID. Coordinates (x, y) and sample should be included. Cell type is required for cell-level data.
--preprocessing-dir=PREPROCESSING_DIR
This options will be deprecated from v3.0. Please use --NN-dir instead.
-d DATASET, --dataset=DATASET
This options will be deprecated from v3.0. Please use --meta-input instead.
Niche Network Construction:
--n-cpu=N_CPU Number of CPUs used for parallel computing in dataset preprocessing. Default is 4.
--n-neighbors=N_NEIGHBORS
Number of neighbors used for kNN graph construction. It should be less than the number of cells in each
sample. Default is 50.
--n-local=N_LOCAL Specifies the nth closest local neighbors used for gaussian distance normalization. It should be less than
the number of cells in each sample. Default is 20.
Usage: ONTraC_GNN <--NN-dir NN_DIR> <--GNN-dir GNN_DIR> [--device DEVICE]
[--epochs EPOCHS] [--patience PATIENCE] [--min-delta MIN_DELTA] [--min-epochs MIN_EPOCHS] [--batch-size BATCH_SIZE]
[-s SEED] [--lr LR] [--hidden-feats HIDDEN_FEATS] [--n-gcn-layers N_GCN_LAYERS] [-k K]
[--modularity-loss-weight MODULARITY_LOSS_WEIGHT] [--purity-loss-weight PURITY_LOSS_WEIGHT]
[--regularization-loss-weight REGULARIZATION_LOSS_WEIGHT] [--beta BETA]
Graph Neural Network (GNN). The core algorithm of ONTraC. (Step 2/3 of ONTraC)
Options:
--version show program's version number and exit
-h, --help show this help message and exit
IO:
--NN-dir=NN_DIR Directory for niche network outputs.
--GNN-dir=GNN_DIR Directory for the GNN output.
--preprocessing-dir=PREPROCESSING_DIR
This options will be deprecated from v3.0. Please use --NN-dir instead.
Options for GNN training:
--device=DEVICE Device for training. We support cpu and cuda now. Auto select if not specified.
--epochs=EPOCHS Number of maximum epochs for training. Default is 1000.
--patience=PATIENCE
Number of epochs wait for better result. Default is 100.
--min-delta=MIN_DELTA
Minimum delta for better result. Default is 0.001
--min-epochs=MIN_EPOCHS
Minimum number of epochs for training. Default is 50. Set to 0 to disable.
--batch-size=BATCH_SIZE
Batch size for training. Default is 0 for whole dataset.
-s SEED, --seed=SEED
Random seed for training. Default is random.
--lr=LR Learning rate for training. Default is 0.03.
--hidden-feats=HIDDEN_FEATS
Number of hidden features. Default is 4.
--n-gcn-layers=N_GCN_LAYERS
Number of GCN layers. Default is 2.
-k K, --k-clusters=K
Number of niche clusters. Default is 6.
--modularity-loss-weight=MODULARITY_LOSS_WEIGHT
Weight for modularity loss. Default is 0.3.
--purity-loss-weight=PURITY_LOSS_WEIGHT
Weight for purity loss. Default is 300.
--regularization-loss-weight=REGULARIZATION_LOSS_WEIGHT
Weight for regularization loss. Default is 0.1.
--beta=BETA Beta value control niche cluster assignment matrix. Default is 0.03.
Previouly named as NicheTrajectory.
Usage: ONTraC_NT <--NN-dir NN_DIR> <--GNN-dir GNN_DIR> <--NT-dir NT_DIR>
[--trajectory-construct TRAJECTORY_CONSTRUCT]
ONTraC_NT: construct niche trajectory for niche cluster and project the NT score to each cell. (Step 4 of ONTraC)
Options:
--version show program's version number and exit
-h, --help show this help message and exit
IO:
--NN-dir=NN_DIR Directory for niche network outputs.
--GNN-dir=GNN_DIR Directory for the GNN output.
--NT-dir=NT_DIR Directory for the niche trajectory output.
--preprocessing-dir=PREPROCESSING_DIR
This options will be deprecated from v3.0. Please use --NN-dir instead.
--NTScore-dir=NTSCORE_DIR
This options will be deprecated from v3.0. Please use --NT-dir instead.
Options for niche trajectory:
--trajectory-construct=TRAJECTORY_CONSTRUCT
Method to construct the niche trajectory. Default is 'BF' (brute-force). A faster alternative is 'TSP'.
Usage: ONTraC_analysis [--NN-dir NN_DIR] [--GNN-dir GNN_DIR] [--NT-dir NT_DIR] [-o OUTPUT]
[--meta-input META_INPUT] [-l LOG] [-r REVERSE] [-s SAMPLE] [--scale-factor SCALE_FACTOR]
[--suppress-cell-type-composition] [--suppress-niche-cluster-loadings] [--suppress-niche-trajectory]
ONTraC_analysis: analysis of ONTraC results
Options:
--version show program's version number and exit
-h, --help show this help message and exit
IO:
--NN-dir=NN_DIR Directory for niche network outputs.
--GNN-dir=GNN_DIR Directory for the GNN output.
--NT-dir=NT_DIR Directory for the niche trajectory output.
-o OUTPUT, --output=OUTPUT
Directory for analysis output.
--meta-input=META_INPUT
Meta data file in csv format. Each row is a cell. The first column should be the cell name with column name
Cell_ID. Coordinates (x, y) and sample should be included. Cell type is required for cell-level data.
-l LOG, --log=LOG Log file.
--preprocessing-dir=PREPROCESSING_DIR
This options will be deprecated from v3.0. Please use --NN-dir instead.
--NTScore-dir=NTSCORE_DIR
This options will be deprecated from v3.0. Please use --NT-dir instead.
-d DATASET, --dataset=DATASET
This options will be deprecated from v3.0. Please use --meta-input instead.
Visualization options:
-r, --reverse Reverse the NT score during visualization.
-s, --sample Plot each sample separately.
--scale-factor=SCALE_FACTOR
Scale factor control the size of spatial-based plots.
Suppress options:
--suppress-cell-type-composition
Skip the cell type composition visualization. It would be useful when the number of cell types is large.
--suppress-niche-cluster-loadings
Skip the niche cluster loadings visualization. It would be useful when the number of clusters or sample
size is large.
--suppress-niche-trajectory
Skip the niche trajectory related visualization.
Previouly named as ONTraC_GP.
Usage: ONTraC_GT <--NN-dir NN_DIR> <--GNN-dir GNN_DIR> <--NT-dir NT_DIR> [--device DEVICE]
[--epochs EPOCHS] [--patience PATIENCE] [--min-delta MIN_DELTA] [--min-epochs MIN_EPOCHS] [--batch-size BATCH_SIZE]
[-s SEED] [--lr LR] [--hidden-feats HIDDEN_FEATS] [--n-gcn-layers N_GCN_LAYERS] [-k K]
[--modularity-loss-weight MODULARITY_LOSS_WEIGHT] [--purity-loss-weight PURITY_LOSS_WEIGHT]
[--regularization-loss-weight REGULARIZATION_LOSS_WEIGHT] [--beta BETA] [--trajectory-construct TRAJECTORY_CONSTRUCT]
ONTraC_GT: GNN and Niche Trajectory
Options:
--version show program's version number and exit
-h, --help show this help message and exit
IO:
--NN-dir=NN_DIR Directory for niche network outputs.
--GNN-dir=GNN_DIR Directory for the GNN output.
--NT-dir=NT_DIR Directory for the niche trajectory output.
--preprocessing-dir=PREPROCESSING_DIR
This options will be deprecated from v3.0. Please use --NN-dir instead.
--NTScore-dir=NTSCORE_DIR
This options will be deprecated from v3.0. Please use --NT-dir instead.
Options for GNN training:
--device=DEVICE Device for training. We support cpu and cuda now. Auto select if not specified.
--epochs=EPOCHS Number of maximum epochs for training. Default is 1000.
--patience=PATIENCE
Number of epochs wait for better result. Default is 100.
--min-delta=MIN_DELTA
Minimum delta for better result. Default is 0.001
--min-epochs=MIN_EPOCHS
Minimum number of epochs for training. Default is 50. Set to 0 to disable.
--batch-size=BATCH_SIZE
Batch size for training. Default is 0 for whole dataset.
-s SEED, --seed=SEED
Random seed for training. Default is random.
--lr=LR Learning rate for training. Default is 0.03.
--hidden-feats=HIDDEN_FEATS
Number of hidden features. Default is 4.
--n-gcn-layers=N_GCN_LAYERS
Number of GCN layers. Default is 2.
-k K, --k-clusters=K
Number of niche clusters. Default is 6.
--modularity-loss-weight=MODULARITY_LOSS_WEIGHT
Weight for modularity loss. Default is 0.3.
--purity-loss-weight=PURITY_LOSS_WEIGHT
Weight for purity loss. Default is 300.
--regularization-loss-weight=REGULARIZATION_LOSS_WEIGHT
Weight for regularization loss. Default is 0.1.
--beta=BETA Beta value control niche cluster assignment matrix. Default is 0.03.
Options for niche trajectory:
--trajectory-construct=TRAJECTORY_CONSTRUCT
Method to construct the niche trajectory. Default is 'BF' (brute-force). A faster alternative is 'TSP'.
A detailed explanation for some parameters is listed below.
The training process will terminated if the model does not improve after the number of epochs set by this parameter.
The model will be considered improved if the total loss decreases by the propotion set by this parameter.
hidden-feats
The number of niche features after GCN processing (step2).
The modularity loss controls the spatial smoothness of niche clusters.
The purity loss controls the purity of cell type composition within each niche cluster.
The regularization loss controls the balance of niche cluster sizes. The higher this weight is set, the more equal the size of each niche cluster.
The beta value of the softmax function used in generating niche cluster assignment matrix.