A deep learning project that uses LSTM networks to generate images through next-image prediction. It works with the MNIST dataset, predicting the next digit image in a sequence.
This project implements an LSTM-based neural network model to predict and generate sequential images from the MNIST dataset. The code has been organized into modular components for improved maintainability and reusability.
The system works by:
- Data Preparation: MNIST images are arranged into sequences of 4 consecutive digits.
- Input Processing: Each image is flattened from 28x28 pixels to a 784-dimensional vector.
- Sequence Learning: The LSTM network learns patterns from these sequences.
- Prediction: Given a sequence of images, the model predicts what the next image should be.
- Generation: By feeding predictions back into the model, we can generate new images in a chain.
For example, if the model is shown images of digits [5,6,7,8], it should predict an image of digit 9. By iteratively using its own outputs as inputs, the model can generate sequences of related digits.
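The prediction and generation steps can be sketched as a sliding-window loop. This is a minimal sketch rather than the project's actual code: `generate_chain` and `predict_next` are hypothetical names, and the predictor is assumed to map a window of shape (4, 784) to a single 784-dimensional image. A toy predictor stands in for the trained LSTM so the loop can run on its own.

```python
import numpy as np


def generate_chain(predict_next, seed, steps):
    """Generate `steps` images by feeding each prediction back as input.

    predict_next: callable taking a (seq_len, n_pixels) array and returning
                  an (n_pixels,) array (hypothetical stand-in for the model).
    seed:         initial window of flattened images, shape (seq_len, n_pixels).
    """
    window = np.asarray(seed, dtype=float)
    generated = []
    for _ in range(steps):
        next_image = np.asarray(predict_next(window), dtype=float)
        generated.append(next_image)
        # Drop the oldest image and append the prediction as the newest one.
        window = np.concatenate([window[1:], next_image[None, :]], axis=0)
    if not generated:
        return np.empty((0, window.shape[1]))
    return np.stack(generated)


# Toy stand-in for the model: every pixel of the "next image" is the last
# image's value plus one, mimicking the 5, 6, 7, 8 -> 9 pattern.
def toy_predictor(window):
    return window[-1] + 1


seed = np.stack([np.full(784, d, dtype=float) for d in (5, 6, 7, 8)])
chain = generate_chain(toy_predictor, seed, steps=3)
print(chain.shape)
```

With a trained Keras model, the predictor could plausibly be something like `lambda w: model.predict(w[None])[0]`, assuming the model accepts batches of shape (batch, 4, 784).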
Input
- The input is a list of images; each image is a 2D array of shape (28, 28).
- The output is a single image.
- All images are stored as numeric arrays.
- Each image shows a digit one greater than the one before it and one less than the one after it.
Inputs | Output |
---|---|
0 - 1 - 2 - 3 | 4 |
1 - 2 - 3 - 4 | 5 |
2 - 3 - 4 - 5 | 6 |
3 - 4 - 5 - 6 | 7 |
4 - 5 - 6 - 7 | 8 |
5 - 6 - 7 - 8 | 9 |
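The windows in the table above can be built with a short helper. This is a sketch under stated assumptions, not the project's `preprocessing.py`: the name `build_sequences` is hypothetical, and picking one random image per digit is an assumed sampling strategy, since the source does not say how images are chosen. Every digit 0 through 9 must appear at least once in `labels`.

```python
import numpy as np


def build_sequences(images, labels, seq_len=4, rng=None):
    """Build (inputs, targets) for windows of consecutive digits.

    images: array of shape (n, 28, 28); labels: array of shape (n,).
    Returns inputs of shape (n_windows, seq_len, 784) and
    targets of shape (n_windows, 784).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Group the images by the digit they show.
    by_digit = {d: images[labels == d] for d in range(10)}

    inputs, targets = [], []
    # Windows start at digits 0..5, so the target digit runs from 4 to 9.
    for start in range(10 - seq_len):
        window = []
        for digit in range(start, start + seq_len + 1):
            pool = by_digit[digit]
            # Assumption: sample one random image of this digit.
            image = pool[rng.integers(len(pool))]
            window.append(image.reshape(-1))  # flatten 28x28 -> 784
        inputs.append(np.stack(window[:seq_len]))
        targets.append(window[seq_len])
    return np.stack(inputs), np.stack(targets)


# Synthetic stand-in for MNIST: image d is filled with the value d.
demo_images = np.stack([np.full((28, 28), d, dtype=np.float32) for d in range(10)])
demo_labels = np.arange(10)
X, y = build_sequences(demo_images, demo_labels, rng=np.random.default_rng(0))
print(X.shape, y.shape)
```

Pixel scaling (for example dividing by 255) is left out here; whether the project normalizes its inputs is not stated in the source.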
The project has been refactored into modular components:
- `data.py`: Data loading and preparation
- `preprocessing.py`: Data preprocessing and sequence generation
- `model.py`: LSTM model implementation
- `main.ipynb`: Main training and evaluation workflow
The NextImagePredictor class extends Keras' Sequential model to create a specialized LSTM network for image sequence prediction.
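A class of this kind might look roughly like the sketch below. Only the class name, the Sequential base class, and the use of an LSTM come from this README. The layer sizes, the sigmoid output (which assumes pixels scaled to [0, 1]), the mean squared error loss, and the constructor parameters are all assumptions made for illustration, and the import assumes a TensorFlow installation.

```python
from tensorflow import keras


class NextImagePredictor(keras.Sequential):
    """Hypothetical sketch: maps a sequence of flattened images to the next one."""

    def __init__(self, seq_len=4, n_pixels=784, lstm_units=128, **kwargs):
        super().__init__(**kwargs)
        # Input: seq_len flattened images of n_pixels values each.
        self.add(keras.Input(shape=(seq_len, n_pixels)))
        # The LSTM summarizes the sequence into one hidden vector.
        self.add(keras.layers.LSTM(lstm_units))
        # The dense layer expands that vector back to a flattened image.
        # Assumption: sigmoid output, i.e. pixel values scaled to [0, 1].
        self.add(keras.layers.Dense(n_pixels, activation="sigmoid"))


model = NextImagePredictor()
# Assumption: mean squared error between predicted and true pixels.
model.compile(optimizer="adam", loss="mse")
```

Swapping the `optimizer` argument for `"sgd"` or `"rmsprop"` would give the other two configurations compared in this README.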
The model was trained with different optimizers to compare performance. Below are the results obtained with each optimizer.
The following images were generated using the SGD optimizer:
The following images were generated using the RMSprop optimizer:
The following images were generated using the Adam optimizer: