ViST

Vision Transformer from Scratch using only the C++ Standard Library

ViST is a C++ implementation of the Vision Transformer (ViT) model for image classification, written using only the C++ standard library. It currently classifies images into 3 classes.

Requirements

C++17 or newer
Clang (or edit CMakeLists.txt to use a different compiler)
CMake (for building the project)
stb_image.h (for loading images)

Setup & Installation

1. Clone the repository

Clone the repository to your local machine:

git clone https://github.com/allanhanan/ViST.git
cd ViST

2. Build with CMake

Ensure you have CMake installed.

From the project root directory, create a build directory and compile:

mkdir build
cd build
cmake ..
make

This will generate the executable `ViT` in the `build` folder.

Usage

1. Training the model

To train the model, run the following command from the project's root directory:

./ViT

It will start training the model using the images located in the following directory structure:

program_root/
└── train/
    ├── apple/
    │   ├── image1.png
    │   ├── image2.png
    │   └── image3.png
    ├── orange/
    │   ├── image1.png
    │   ├── image2.png
    │   └── image3.png
    └── banana/
        ├── image1.png
        ├── image2.png
        └── image3.png

Note: The image directory path is currently hardcoded in the source code.
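The layout above maps each subfolder of `train/` to one class. As a rough illustration of how such a tree could be walked with C++17's `std::filesystem`, here is a minimal sketch. The names `LabeledImage`, `list_classes`, and `list_images` are hypothetical and are not taken from the ViST sources. The sketch also assumes that labels are assigned by sorted folder name and that only `.png` files count as images, which may differ from what the project actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical record: one training image and the index of its class.
struct LabeledImage {
    fs::path path;
    std::size_t label;
};

// Hypothetical helper: every subdirectory of `root` is treated as a class.
// Names are sorted so that label indices are stable between runs.
std::vector<std::string> list_classes(const fs::path& root) {
    assert(fs::is_directory(root));  // precondition: the train folder exists
    std::vector<std::string> classes;
    for (const auto& entry : fs::directory_iterator(root)) {
        if (entry.is_directory()) {
            classes.push_back(entry.path().filename().string());
        }
    }
    std::sort(classes.begin(), classes.end());
    return classes;
}

// Hypothetical helper: collect the .png files of each class folder,
// tagging every file with the index of its class in `classes`.
std::vector<LabeledImage> list_images(const fs::path& root,
                                      const std::vector<std::string>& classes) {
    std::vector<LabeledImage> images;
    for (std::size_t label = 0; label < classes.size(); ++label) {
        for (const auto& entry : fs::directory_iterator(root / classes[label])) {
            if (entry.is_regular_file() && entry.path().extension() == ".png") {
                images.push_back({entry.path(), label});
            }
        }
    }
    return images;
}
```

With the example tree, `list_classes("train")` would yield `apple`, `banana`, `orange` in that order, and `list_images` would return nine entries labeled 0, 1, or 2.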

2. Testing the model

After training the model, test it on an image by running:

./ViT /path/to/model_checkpoint.bin /path/to/test_image.png

Example:

./ViT /home/allan/project/viT/vit/build/model_checkpoint.bin /home/allan/project/viT/vit/test.png
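The two invocations described above differ only in their argument count: no arguments for training, and a checkpoint path plus an image path for testing. A minimal sketch of how that dispatch could look is shown below. `Mode`, `Invocation`, and `parse_invocation` are hypothetical names introduced for illustration, and the actual handling in `main.cpp` may be different.

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class Mode { Train, Test };

// Hypothetical description of one run of the executable.
struct Invocation {
    Mode mode;
    std::string checkpoint_path;  // empty in Train mode
    std::string image_path;       // empty in Train mode
};

// Hypothetical helper: maps the command line to one of the two modes.
// Returns std::nullopt for any other argument count.
std::optional<Invocation> parse_invocation(int argc, const char* const argv[]) {
    assert(argc >= 1);  // argv[0] is the program name
    if (argc == 1) {
        return Invocation{Mode::Train, "", ""};
    }
    if (argc == 3) {
        return Invocation{Mode::Test, argv[1], argv[2]};
    }
    return std::nullopt;
}
```

A `main(int argc, char** argv)` could call `parse_invocation(argc, argv)` and print a usage message when the result is empty, so that a call with a single argument fails clearly instead of silently starting a training run.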

Parameters are currently hardcoded, and only CPU training is supported.

Note: image loading uses stb_image.h, so the project is technically not standard-library-only.

