This project implements a TinyVGG model for image classification, based on the architecture from the CNN Explainer website. The code is structured for modularity, allowing for easy configuration and execution of training and prediction tasks.
.
├── config.py # All configurations (hyperparameters, paths, etc.)
├── data_setup.py # For creating PyTorch DataLoaders
├── model.py # TinyVGG model definition (ensure your TinyVGG class is here)
├── engine.py # Training loop (train_step, test_step, train_model functions)
├── utils.py # Utility functions (e.g., save_model)
├── train.py # Main script to run model training
├── predict.py # Script to make predictions on new images
├── data/ # Directory for your image datasets
│   └── pizza_steak_sushi/ (Example dataset name)
│       ├── train/ # Training images
│       │   ├── class1/ (e.g., pizza)
│       │   ├── class2/ (e.g., steak)
│       │   └── ...
│       └── test/ # Testing images
│           ├── class1/
│           ├── class2/
│           └── ...
├── models/ # Directory where trained models are saved
└── README.md # This file
- Modular Design: Code is separated into logical modules for data setup, model building, training engine, and utilities.
- Configurable: Most parameters, including hyperparameters, file paths, and model settings, are managed through `config.py`.
- TinyVGG Implementation: A PyTorch implementation of the TinyVGG convolutional neural network.
- Training Script: `train.py` handles the complete training pipeline, including data loading, model training, and saving the trained model.
- Prediction Script: `predict.py` loads a trained model and makes predictions on a specified image.
- Device Agnostic: Code attempts to use a CUDA-enabled GPU if available, otherwise defaults to CPU.
- Python 3.7+
- PyTorch
- TorchVision
- Pillow (PIL)
- tqdm (for progress bars)
You can install the necessary Python packages using pip:

    pip install torch torchvision pillow tqdm

- Clone the repository (if applicable) or create the project files: Ensure all the `.py` files (`config.py`, `data_setup.py`, `model.py`, `engine.py`, `utils.py`, `train.py`, `predict.py`) are in the root directory of your project.
- Create Data Directories:

      mkdir -p data/pizza_steak_sushi/train
      mkdir -p data/pizza_steak_sushi/test
      mkdir -p models

  Replace `pizza_steak_sushi` with your dataset's name if different.
- Prepare Your Dataset:
  - Place your training images in the `data/YOUR_DATASET_NAME/train/` directory, organized into subfolders named after each class (e.g., `data/pizza_steak_sushi/train/pizza/`, `data/pizza_steak_sushi/train/steak/`).
  - Place your testing images in the `data/YOUR_DATASET_NAME/test/` directory, with the same class-based subfolder structure.
- Configure `config.py`: Open `config.py` and review/adjust the settings:
  - Data Settings:
    - `TRAIN_DIR`: Path to your training data.
    - `TEST_DIR`: Path to your testing data.
    - `IMAGE_SIZE`: Target image size for resizing (default: `(64, 64)`).
    - `BATCH_SIZE`: Batch size for training and testing.
  - Model Parameters:
    - `INPUT_SHAPE`: Number of input channels (e.g., 3 for RGB).
    - `HIDDEN_UNITS`: Number of hidden units in convolutional layers.
  - Training Hyperparameters:
    - `NUM_EPOCHS`: Number of training epochs.
    - `LEARNING_RATE`: Learning rate for the optimizer.
  - Model Saving:
    - `MODEL_SAVE_DIR`: Directory to save trained models.
    - `MODEL_NAME_PREFIX`: Prefix for saved model filenames.
  - Prediction Settings (for `predict.py`):
    - `IMAGE_PATH_FOR_PREDICTION`: Path to the image you want to predict.
    - `MODEL_PATH_FOR_PREDICTION`: Path to the trained `.pth` model file to use for prediction.
    - `CLASS_NAMES_FOR_PREDICTION`: Crucially, list your class names here in the exact order the model was trained on (this order is usually determined alphabetically by `ImageFolder` or how you set it up).
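As a rough illustration, a `config.py` holding the settings listed above might look like the sketch below. The setting names come from the list; every value other than the `(64, 64)` image size is a placeholder assumption rather than the project's actual default, and `DATA_DIR` is a hypothetical helper name introduced only for this example.

```python
# config.py (illustrative sketch; values are placeholders, not the project's real defaults)
from pathlib import Path

# --- Data settings ---
DATA_DIR = Path("data") / "pizza_steak_sushi"  # hypothetical helper, not a documented setting
TRAIN_DIR = DATA_DIR / "train"
TEST_DIR = DATA_DIR / "test"
IMAGE_SIZE = (64, 64)   # default stated in this README
BATCH_SIZE = 32         # assumed value

# --- Model parameters ---
INPUT_SHAPE = 3         # number of input channels (RGB)
HIDDEN_UNITS = 10       # assumed value

# --- Training hyperparameters ---
NUM_EPOCHS = 5          # assumed value
LEARNING_RATE = 0.001   # assumed value

# --- Model saving ---
MODEL_SAVE_DIR = Path("models")
MODEL_NAME_PREFIX = "tinyvgg_model"

# --- Prediction settings (read by predict.py) ---
IMAGE_PATH_FOR_PREDICTION = Path("data") / "example.jpg"  # placeholder path
MODEL_PATH_FOR_PREDICTION = MODEL_SAVE_DIR / "tinyvgg_model_v1.pth"
# Must be in the order the model was trained on (alphabetical for ImageFolder).
CLASS_NAMES_FOR_PREDICTION = ["pizza", "steak", "sushi"]
```

Building paths with `pathlib.Path` keeps them portable across operating systems; plain strings would work equally well if the rest of the code expects them.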
- Ensure your dataset is prepared and `config.py` (especially `TRAIN_DIR`, `TEST_DIR`, and training hyperparameters) is correctly configured.
- Run the training script from the project's root directory:

      python train.py

- The script will:
  - Load and preprocess the data.
  - Initialize the TinyVGG model.
  - Train the model for the specified number of epochs, printing progress and metrics.
  - Save the trained model's `state_dict` to the directory specified by `MODEL_SAVE_DIR` in `config.py` (e.g., `models/tinyvgg_model_v1.pth`).
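Before starting a long training run, it can help to confirm that the data directories exist and contain images. The following is a minimal standalone sketch, not part of the project's scripts; the function names and the list of image extensions are assumptions made for illustration.

```python
from pathlib import Path


def find_missing_dirs(*dirs):
    """Return the directories that do not exist, preserving input order."""
    return [Path(d) for d in dirs if not Path(d).is_dir()]


def count_images_per_class(root, extensions=(".jpg", ".jpeg", ".png")):
    """Map each class subfolder of `root` to its number of image files."""
    root = Path(root)
    return {
        class_dir.name: sum(
            1 for f in class_dir.iterdir() if f.suffix.lower() in extensions
        )
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()
    }


if __name__ == "__main__":
    # Example paths from this README; adjust to match your config.py.
    train_dir = "data/pizza_steak_sushi/train"
    test_dir = "data/pizza_steak_sushi/test"
    missing = find_missing_dirs(train_dir, test_dir)
    if missing:
        print("Missing directories:", ", ".join(str(m) for m in missing))
    else:
        print("Train images per class:", count_images_per_class(train_dir))
```

A class folder that reports zero images is a common cause of confusing errors later in data loading.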
- Ensure you have a trained model: A `.pth` file should exist in your `models/` directory (or the path specified in `MODEL_PATH_FOR_PREDICTION`).
- Configure `predict.py` via `config.py`:
  - Open `config.py`.
  - Set `IMAGE_PATH_FOR_PREDICTION` to the full path of the new image you want to classify.
  - Set `MODEL_PATH_FOR_PREDICTION` to the path of your trained model file (e.g., `models/tinyvgg_model_v1.pth`).
  - Verify `CLASS_NAMES_FOR_PREDICTION`: This list must match the order of classes the model was trained on. `ImageFolder` sorts class subfolders alphabetically, so if your training data subfolders were `cat`, `dog`, and `bird`, then `CLASS_NAMES_FOR_PREDICTION` should be `["bird", "cat", "dog"]`.
- Run the prediction script:

      python predict.py

- The script will output the predicted class label and the confidence score for the specified image.
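For intuition about where the label and confidence score come from, the sketch below turns a list of raw model outputs (logits) into a predicted class and a probability. It assumes `predict.py` applies a softmax and takes the highest-probability class, which this README does not confirm; in PyTorch the equivalent calls would be `torch.softmax` and `torch.argmax`. The logit values are made up.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, class_names):
    """Return (label, confidence) for the highest-probability class."""
    if len(logits) != len(class_names):
        raise ValueError("number of logits must match number of class names")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Made-up logits for a three-class model.
label, confidence = top_prediction([2.0, 0.5, -1.0], ["pizza", "steak", "sushi"])
print(f"{label}: {confidence:.3f}")
```

The length check mirrors the requirement above: if `CLASS_NAMES_FOR_PREDICTION` has the wrong number of entries or the wrong order, the index-to-label lookup silently returns the wrong class.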
- Dataset: To use a different dataset, update `TRAIN_DIR` and `TEST_DIR` in `config.py` and ensure your data is structured correctly in the `data/` directory. Also, update `CLASS_NAMES_FOR_PREDICTION` for prediction.
- Model Architecture: Modify the `TinyVGG` class in `model.py` to change the network architecture. Remember to adjust `INPUT_SHAPE`, `HIDDEN_UNITS`, or how `OUTPUT_SHAPE` is determined if you make significant changes. The `in_features` for the final `nn.Linear` layer in `TinyVGG` depends on the output size of the convolutional blocks and the input `IMAGE_SIZE`. You may need to recalculate this if you change the architecture or `IMAGE_SIZE`.
- Hyperparameters: Adjust `BATCH_SIZE`, `NUM_EPOCHS`, `LEARNING_RATE`, etc., in `config.py` to experiment with different training settings.
- Image Transformations: Modify the `transforms.Compose([...])` sections in `data_setup.py` (for training) and `predict.py` (for prediction) if you need different image preprocessing steps (e.g., data augmentation, normalization). Ensure prediction transforms match training transforms.
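The `in_features` recalculation mentioned above can be done by hand with the standard output-size formula for convolution and pooling layers. The sketch below assumes a TinyVGG layout of two blocks, each with two 3x3 convolutions (stride 1) followed by a 2x2 max pool; the padding value and block layout are assumptions, so check them against your `model.py`.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max-pool along one dimension."""
    return (size - kernel) // stride + 1


def tinyvgg_in_features(image_size, hidden_units, padding=0,
                        num_blocks=2, convs_per_block=2):
    """Flattened feature count entering the final nn.Linear layer."""
    h, w = image_size
    for _ in range(num_blocks):
        for _ in range(convs_per_block):
            h = conv_out(h, padding=padding)
            w = conv_out(w, padding=padding)
        h, w = pool_out(h), pool_out(w)
    return hidden_units * h * w


# 64 -> 62 -> 60 -> 30 -> 28 -> 26 -> 13, so 10 * 13 * 13
print(tinyvgg_in_features((64, 64), hidden_units=10))             # 1690
# With padding=1 the convolutions preserve size: 64 -> 32 -> 16
print(tinyvgg_in_features((64, 64), hidden_units=10, padding=1))  # 2560
```

An alternative that avoids manual arithmetic is to pass a dummy tensor of shape `(1, INPUT_SHAPE, *IMAGE_SIZE)` through the convolutional blocks and read the flattened size from the result.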
- `FileNotFoundError`: Double-check all paths in `config.py` (`TRAIN_DIR`, `TEST_DIR`, `IMAGE_PATH_FOR_PREDICTION`, `MODEL_PATH_FOR_PREDICTION`). Ensure the files/directories exist.
- Incorrect Predictions/Low Accuracy:
  - Verify that `CLASS_NAMES_FOR_PREDICTION` in `config.py` exactly matches the order of classes the model was trained on.
  - Ensure image transformations in `predict.py` are identical to those used during training (especially `IMAGE_SIZE` and normalization if used).
  - The model might need more training (more epochs, larger dataset) or hyperparameter tuning.
  - The `in_features` for the classifier's `nn.Linear` layer in `model.py` might be incorrect if `IMAGE_SIZE` or the convolutional architecture has changed.
- CUDA Errors: Ensure PyTorch was installed with CUDA support if you have a compatible NVIDIA GPU. If not, the code should fall back to CPU.
- `RuntimeError: Mismatch in shape...`: This often happens if the `OUTPUT_SHAPE` of the model (derived from `len(CLASS_NAMES_FOR_PREDICTION)` or number of classes in training data) doesn't match what the loaded model expects, or if the `in_features` of the classifier layer is wrong.
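Several of the issues above trace back to class order. torchvision's `ImageFolder` assigns class indices by sorting the class subfolder names, so the expected order can be reproduced with the standard library alone. The sketch below builds a throwaway directory to demonstrate this; `find_class_names` is a hypothetical helper, not a function from this project.

```python
import os
import tempfile


def find_class_names(train_dir):
    """List class subfolders in sorted order, matching ImageFolder's indexing."""
    return sorted(entry.name for entry in os.scandir(train_dir) if entry.is_dir())


# Create class folders in a deliberately non-alphabetical order.
with tempfile.TemporaryDirectory() as train_dir:
    for name in ["steak", "pizza", "sushi"]:
        os.mkdir(os.path.join(train_dir, name))
    class_names = find_class_names(train_dir)

print(class_names)  # ['pizza', 'steak', 'sushi']
```

Running `find_class_names` on your real `TRAIN_DIR` and pasting the result into `CLASS_NAMES_FOR_PREDICTION` is one way to keep the two in sync.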
Feel free to fork this project and submit pull requests for improvements or bug fixes.
Remember to replace placeholders like YOUR_DATASET_NAME and ensure the paths and class names in config.py are accurate for your specific setup.
This document outlines the steps to build and run this PyTorch application using Docker and Docker Compose. This ensures a consistent and reproducible environment for development and deployment.
- Docker: Ensure Docker Desktop (for Mac/Windows) or Docker Engine (for Linux) is installed and running. You can download it from docker.com.
- Docker Compose: Docker Compose V2 is typically included with Docker Desktop. For Linux, you might need to install it separately.
- (Optional) NVIDIA GPU Support:
- If you intend to use NVIDIA GPUs, ensure you have the latest NVIDIA drivers installed on your host machine.
- Install the NVIDIA Container Toolkit on your host machine. This allows Docker containers to access NVIDIA GPUs.
- Project Files:
  - `Dockerfile`: Defines the Docker image for the application.
  - `docker-compose.yml`: Defines how to run the application services (including GPU support).
  - `Pipfile`: Specifies Python package dependencies.
  - `Pipfile.lock`: Locks package versions for reproducible builds.
  - Your application code (e.g., `inference.py`).
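A quick way to check the command-line prerequisites above is to look for the relevant executables on your `PATH`. The sketch below is a convenience check only, written for this README: it does not verify that the Docker daemon is running or that the NVIDIA Container Toolkit is installed, and treating `nvidia-smi` as a proxy for installed NVIDIA drivers is an assumption.

```python
import shutil


def check_tools(tools=("docker", "nvidia-smi")):
    """Map each tool name to True if an executable with that name is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


for tool, found in check_tools().items():
    print(f"{tool}: {'found' if found else 'not found'}")
```

A missing `nvidia-smi` is only a problem if you intend to use GPU support.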
We will use Docker Compose to manage the build and run process.
If you haven't already, clone the project repository to your local machine:

    git clone <your-repository-url>
    cd <your-project-directory>

The Dockerfile uses `pipenv install --deploy`, which requires `Pipfile.lock` to be up-to-date with `Pipfile`.
Troubleshooting Pipfile.lock out-of-date error:
If, during the Docker build process (Step 3), you encounter an error similar to:
Your Pipfile.lock (...) is out of date. Expected: (...).
ERROR:: Aborting deploy
This means your `Pipfile.lock` is not synchronized with your `Pipfile`. To fix this, run the following command in your project's root directory (where `Pipfile` is located) on your host machine:

    pipenv lock

This will update `Pipfile.lock`. After running this command, proceed to Step 3.
Open your terminal in the root directory of the project (where docker-compose.yml and Dockerfile are located).
To build the image and run the application (e.g., execute `inference.py`):

    docker-compose up --build

- `--build`: This flag tells Docker Compose to build the Docker image using the `Dockerfile`. You can omit this on subsequent runs if the `Dockerfile` or its dependencies haven't changed, and an image already exists.
- The application (defined by `CMD` in the `Dockerfile`, e.g., `python3 inference.py`) will start, and its output will be displayed in your terminal.
To run in detached mode (in the background):

    docker-compose up --build -d

- Viewing Logs (if running in detached mode):

      docker-compose logs -f app

  (Replace `app` with your service name if it's different in `docker-compose.yml`). Press `Ctrl+C` to stop following logs.
- Accessing a Shell Inside the Container (for debugging): If you need to explore the container's environment or run commands manually:
  - Ensure the container is running (e.g., using `docker-compose up -d`).
  - Open a shell:

        docker-compose exec app bash

    (Replace `app` with your service name if it's different).
  - Inside the container, you can navigate to `/app` (the working directory) and run Python scripts or other commands.
- Port Mapping (if applicable): If your application (`inference.py`) runs a web server (e.g., on port 8000) and you have configured port mapping in `docker-compose.yml` (e.g., `ports: - "8000:8000"`), you can access it via `http://localhost:8000` in your web browser.
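If the browser cannot reach the mapped port, a small check like the one below can tell you whether anything is accepting TCP connections there. It is a generic sketch using only the standard library; port 8000 is the example value from above and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("Port 8000 reachable:", port_is_open())
```

A `False` result while the container is running usually points to a missing `ports` entry in `docker-compose.yml` or a server bound to `127.0.0.1` inside the container instead of `0.0.0.0`.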
To stop and remove the containers, networks, and (optionally, depending on docker-compose down flags) volumes defined by Docker Compose:

    docker-compose down

If you want to remove the volumes as well:

    docker-compose down -v

- PyTorch Versions & CUDA: The `Pipfile` specifies PyTorch versions and a CUDA source (`pytorch-cu111`). Ensure these versions are valid and available from the specified PyTorch wheel index. If `pipenv install` fails during the Docker build due to version conflicts or "Could not find a version" errors, you will need to:
  - Consult PyTorch Previous Versions to find compatible `torch`, `torchvision`, and `torchaudio` versions for your desired CUDA version (e.g., CUDA 11.1).
  - Update the versions in your `Pipfile`.
  - Run `pipenv lock` locally to regenerate `Pipfile.lock`.
  - Re-run `docker-compose up --build`.
- GPU Usage: The `docker-compose.yml` is configured to attempt GPU access using NVIDIA. This requires the prerequisites mentioned above (NVIDIA drivers and NVIDIA Container Toolkit on the host). If GPUs are not available or not configured correctly, PyTorch will typically fall back to CPU mode.
- Development Mode Volume Mount: The `docker-compose.yml` includes `volumes: - .:/app`. This mounts your local project directory into the container. Code changes made locally will be reflected inside the container, which is useful for development. For production, you might remove this volume mount to rely solely on the code baked into the image.
- Cleaning up Docker Resources:
  - To remove dangling (untagged) Docker images: `docker image prune`
  - To remove unused Docker volumes: `docker volume prune`
  - To remove unused Docker networks: `docker network prune`
  - To remove all stopped containers, unused networks, and unused images: `docker system prune -a` (Use with caution! Volumes are only included if you add `--volumes`.)