This repository releases the implementation code for the CAE-Transformer model. The detailed structure of the framework is available on arXiv. The provided code is a slightly different version of the proposed framework in terms of the number of layers, transformer heads, and hyperparameters; the overall structure of the framework and its implementation are, however, the same.
The aim is to provide further insight into the implementation of the different parts of the proposed pipeline for those interested in developing a similar framework. Note that this implementation is designed to work with a specific in-house dataset and will not run on your system without setting the required paths and modifying the configuration to match your dataset structure. In what follows, the functionality of each file in this repository is explained. You are welcome to adopt this implementation, partially or fully, for your project or research work.
CAE-Transformer is a predictive transformer-based framework developed to predict the invasiveness of lung cancer, more specifically Lung Adenocarcinoma (LUAC). The CAE-Transformer utilizes a Convolutional Auto-Encoder (CAE) to automatically extract informative features from CT slices, which are then fed to a modified transformer model to capture global inter-slice relations. We performed several experiments on an in-house dataset of 114 pathologically proven Sub-Solid Nodules (SSNs); the obtained results demonstrate the superiority of the CAE-Transformer over histogram/radiomics-based models, such as the model proposed in this paper, as well as its DL-based counterparts.
The following list outlines the step-by-step process taking place in the training and testing stages of the proposed CAE-Transformer framework.
- All CT images are passed to a Lung Region Segmentation module to obtain the lung areas and discard unimportant components in the CT images.
The following files are used in this step:
- lung_segmentation_module.py : provides the necessary functions and classes
- segmentation_main.py : the main file to perform the segmentation and save the outputs
The segmentation module is adopted from the lungmask repository and can be installed using the following command:
pip install git+https://github.com/JoHof/lungmask
Make sure torch is installed on your system; otherwise, the lungmask module cannot be used. See https://pytorch.org for installation instructions.
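For reference, here is a minimal sketch of applying lungmask to a CT volume (the file path is hypothetical, and the exact invocation in segmentation_main.py may differ):

import SimpleITK as sitk
from lungmask import mask

# Read a CT volume in any format SimpleITK supports (path is hypothetical).
input_image = sitk.ReadImage("path/to/ct_volume.mhd")

# Apply the pre-trained segmentation model; returns a numpy array in which
# 0 marks the background and nonzero labels mark the two lungs.
segmentation = mask.apply(input_image)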
- All images are resized from the original size of (512, 512) to (256, 256) and normalized using min-max normalization. The resizing and normalization functions are available in the utils.py file.
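A minimal sketch of these two preprocessing operations, assuming OpenCV and NumPy (the actual implementations live in utils.py and may differ in detail):

import cv2
import numpy as np

def resize_slice(img, size=(256, 256)):
    # Downsample a (512, 512) CT slice to (256, 256).
    return cv2.resize(img, size, interpolation=cv2.INTER_AREA)

def min_max_normalize(img):
    # Linearly scale intensities to the [0, 1] range.
    img = img.astype(np.float32)
    return (img - img.min()) / (img.max() - img.min() + 1e-8)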
- The extracted lung regions are then fed to a Convolutional Auto-Encoder (CAE) to produce slice-level feature maps in an unsupervised fashion.
The CAE is first pre-trained on the LIDC-IDRI dataset and then fine-tuned on the target dataset (our in-house dataset of 114 patients). The following files are used for pre-training, fine-tuning, and saving the outputs of the CAE model:
- read_lidc_annotations.py : reads the cases and their corresponding annotations in the LIDC-IDRI dataset and saves the slices with evidence of a lung nodule to be used as the pre-training data.
This code is written based on the pylidc library, whose official documentation is available at https://pylidc.github.io/ (a minimal usage sketch appears after this list).
- CAE.py : provides the functions and classes to implement the Convolutional Auto-Encoder.
- pretrain_cae.py : the main file for pre-training the model
- fine_tune_cae.py : the main file for fine-tuning the model
- cae_save_outputs-sequential.py : saves the outputs of the CAE model as sequential data. Each sequence contains the feature maps generated for the slices with evidence of a nodule in each patient.
Note: To comply with the input size requirements of the subsequent modules, all sequences are zero-padded to a fixed size of (25, 256), where 25 is the maximum number of slices per patient and 256 is the number of features extracted from each slice.
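As a reference for the pylidc-based reading step, here is a minimal query sketch (read_lidc_annotations.py is more involved; this only illustrates the library's basic pattern):

import pylidc as pl

# Iterate over all LIDC-IDRI scans; each scan carries the annotations
# provided by up to four radiologists.
for scan in pl.query(pl.Scan).all():
    volume = scan.to_volume()      # full CT volume as a numpy array
    for ann in scan.annotations:   # pylidc Annotation objects
        bbox = ann.bbox()          # slices bounding the annotated nodule
        nodule_region = volume[bbox]

And a sketch of the zero-padding described in the note above, assuming each patient's sequence is an array of 256-dimensional slice features:

import numpy as np

def pad_sequence(features, max_slices=25, feat_dim=256):
    # Zero-pad (or truncate) a (num_slices, 256) sequence to (25, 256).
    padded = np.zeros((max_slices, feat_dim), dtype=np.float32)
    n = min(len(features), max_slices)
    padded[:n] = features[:n]
    return padded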
- The following files are used to implement and train the transformer-based classifier:
- transformers.py : implements the transformer class
- train_transformer.py : the main file for training the classification model and saving the best model
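A minimal Keras sketch of the classifier's overall shape, assuming TensorFlow 2.4+ for the built-in MultiHeadAttention layer (the layer sizes, head counts, and pooling/output heads are illustrative; transformers.py defines the actual model):

import tensorflow as tf
from tensorflow.keras import layers

def transformer_block(x, num_heads=4, ff_dim=128, dropout=0.1):
    # Self-attention across the 25 slice positions captures global
    # inter-slice relations; residual connections and layer norm as usual.
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=64)(x, x)
    x = layers.LayerNormalization(epsilon=1e-6)(x + layers.Dropout(dropout)(attn))
    ff = layers.Dense(ff_dim, activation="relu")(x)
    ff = layers.Dense(x.shape[-1])(ff)
    return layers.LayerNormalization(epsilon=1e-6)(x + layers.Dropout(dropout)(ff))

inputs = tf.keras.Input(shape=(25, 256))   # zero-padded CAE feature sequences
x = transformer_block(inputs)
x = layers.GlobalAveragePooling1D()(x)     # pool over the slice dimension
outputs = layers.Dense(1, activation="sigmoid")(x)  # invasiveness probability
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])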
The implementation relies on the following environment and packages:
- Tested with tensorflow-gpu 2 and keras-gpu 2.2.4 on an NVIDIA GeForce RTX 3090
- Python 3.7
- PyTorch (tested with 1.4.0 and 1.5.1)
- PyDicom 1.4.2
- SimpleITK
- lungmask
- pylidc
- OpenCV
- Scikit-learn
- Pandas
- OS
- Numpy
- Matplotlib
If you find this implementation and the related paper useful in your research, please consider citing:
@article{Heidarian2021,
  archivePrefix = {arXiv},
  arxivId = {2110.08721},
  author = {Heidarian, Shahin and Afshar, Parnian and Oikonomou, Anastasia and Plataniotis, Konstantinos N. and Mohammadi, Arash},
  eprint = {2110.08721},
  month = {oct},
  title = {{CAE-Transformer: Transformer-based Model to Predict Invasiveness of Lung Adenocarcinoma Subsolid Nodules from Non-thin Section 3D CT Scans}},
  url = {http://arxiv.org/abs/2110.08721},
  year = {2021}
}