This project demonstrates an end-to-end machine learning pipeline using the Iris dataset. It includes:
- A training script that builds an automated ML pipeline (with preprocessing, model training, evaluation, and saving).
- A Streamlit web application for model inference via a graphical interface.
- A FastAPI application to serve model predictions via REST API endpoints.
- A batch file that reads a .env file to initialize your virtual environment.
├── app.py # Streamlit app for model inference
├── api.py # FastAPI app for serving predictions via REST API
├── train_model.py # Script to build, train, and save the ML pipeline
├── iris_pipeline.pkl # Saved model pipeline (generated by train_model.py)
├── .env # Environment variable file containing VENV path (to be created)
├── init_venv.bat # Batch file to initialize the virtual environment
├── requirements.txt # Python dependencies
└── README.md # This file
- Python 3.6+
- pip (Python package installer)
- Windows OS (for the provided batch file)
Clone this repository and navigate into its directory:
git clone https://github.com/mohiteamit/simple_pipeline_using_iris.git
cd simple_pipeline_using_iris
Create a file named .env in the project root with the following content. Replace <path-to-your-virtual-environment> with the actual path to your virtual environment, so that VENV points to its activation script:
VENV=<path-to-your-virtual-environment>\Scripts\Activate
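Before running the batch file, it can help to confirm that the .env file parses as expected. The following is a minimal sketch in Python, not part of the repository: the helper name read_venv_path is hypothetical, and it assumes the simple KEY=value format shown above.

```python
from pathlib import Path


def read_venv_path(env_file=".env"):
    """Return the value of the VENV entry in env_file, or None if absent.

    Hypothetical helper for illustration; it only understands plain
    KEY=value lines, blank lines, and '#' comments.
    """
    path = Path(env_file)
    if not path.is_file():
        return None
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "VENV":
            # Drop optional surrounding quotes around the path.
            return value.strip().strip('"').strip("'")
    return None


if __name__ == "__main__":
    venv = read_venv_path()
    if venv is None:
        print("No VENV entry found in .env")
    else:
        print(f"VENV points to: {venv}")
```

This only reads the file. Activating the environment still has to happen in the shell, through init_venv.bat or the manual command.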
You have two options:
- With the .env file in place, run the batch file to initialize your virtual environment:
  init_venv.bat
- If you prefer manual activation, run:
  <path-to-your-virtual-environment>\Scripts\Activate
Once your virtual environment is active, install the required packages using requirements.txt:
pip install -r requirements.txt
Run the training script to load the Iris dataset, preprocess the data, train a RandomForest classifier (within a pipeline), evaluate its performance, and save the pipeline to disk:
python train_model.py
After the script finishes, a file named iris_pipeline.pkl is generated in the project directory.
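For orientation, the training steps described above might look roughly like the sketch below. It is not the actual contents of train_model.py: the StandardScaler preprocessing step, the 80/20 split, the hyperparameters, the function name train_and_save, and the use of pickle for saving are all assumptions made for illustration.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def train_and_save(output_path="iris_pipeline.pkl", seed=42):
    """Train a scaler + RandomForest pipeline on Iris and pickle it.

    Returns the accuracy on a held-out test split.
    """
    X, y = load_iris(return_X_y=True)
    # Hold out 20% of the samples for evaluation (assumed split size).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    # Preprocessing and model live in one object, so inference code
    # only needs to load and call a single pipeline.
    pipeline = Pipeline(
        [
            ("scaler", StandardScaler()),  # assumed preprocessing step
            ("classifier", RandomForestClassifier(n_estimators=100, random_state=seed)),
        ]
    )
    pipeline.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, pipeline.predict(X_test))
    with open(output_path, "wb") as f:
        pickle.dump(pipeline, f)
    return accuracy


if __name__ == "__main__":
    print(f"Test accuracy: {train_and_save():.3f}")
```

Because the scaler and classifier are saved together, the inference apps can pass raw feature values straight to the loaded pipeline's predict method.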
Launch the Streamlit app to interact with the model via a graphical web interface:
streamlit run app.py
In your web browser, you can:
- Input iris feature values (sepal length, sepal width, petal length, petal width)
- Click the Predict button to view the predicted iris species
To serve predictions via a REST API, launch the FastAPI server with:
uvicorn api:app --reload
By default, the server starts at http://127.0.0.1:8000. You can access the interactive API documentation at http://127.0.0.1:8000/docs to test the /predict endpoint.
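The endpoint can also be called from a script. The sketch below uses only the Python standard library and rests on assumptions: that /predict accepts a POST request with a JSON body, and that the fields are named sepal_length, sepal_width, petal_length, and petal_width. Check the schema shown at /docs and adjust the names to match api.py.

```python
import json
from urllib import request

# Default address from the uvicorn command above.
API_URL = "http://127.0.0.1:8000/predict"


def build_payload(sepal_length, sepal_width, petal_length, petal_width):
    """Build the request body. The field names are assumed, not confirmed."""
    return {
        "sepal_length": float(sepal_length),
        "sepal_width": float(sepal_width),
        "petal_length": float(petal_length),
        "petal_width": float(petal_width),
    }


def predict(features, url=API_URL, timeout=5.0):
    """POST the features as JSON and return the decoded JSON response."""
    body = json.dumps(features).encode("utf-8")
    req = request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed HTTP method for /predict
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, calling predict(build_payload(5.1, 3.5, 1.4, 0.2)) sends one measurement and returns whatever JSON the API responds with.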
- Automated Retraining: You can schedule the training script with tools such as cron, Windows Task Scheduler, or a workflow scheduler like Airflow if needed.
- Modularity: The training and inference stages are separated. You can update the model by re-running train_model.py without changing the Streamlit or FastAPI code.
- Customization: Feel free to extend the pipeline (e.g., add more preprocessing steps) or enhance the Streamlit app and FastAPI endpoints (e.g., add logging, error handling, or additional visualizations).
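If retraining runs on a schedule while the apps are serving, one possible precaution is to write the new model to a temporary file and then swap it into place, so a reader never sees a half-written iris_pipeline.pkl. This is a sketch of that idea rather than something the project currently does, and save_atomically is a hypothetical helper name.

```python
import os
import pickle
import tempfile


def save_atomically(obj, target_path):
    """Pickle obj to a temp file, then rename it over target_path."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f)
        # os.replace overwrites an existing target in a single step.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the partial temp file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

On Windows, os.replace can raise PermissionError if another process has the target file open at that moment, so a retry may be needed. An app that loaded the model at startup would typically keep serving the old one until it reloads the file.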
Contributions and improvements are welcome. Please open an issue or submit a pull request for any suggestions or enhancements.
This project is licensed under the MIT License.
For questions or further information, please reach out to [[email protected]].