Hansamalee0630/Data_Engineering_Project

Delivergate Customer Orders App

Customer Orders Dashboard

This is a Streamlit dashboard application that analyzes customer order data from a MySQL database.

Directory Structure

NEW_ASSIGNMENT/
├── .conda/
├── .streamlit/
├── config/
│   └── config.py
├── data/
│   ├── processed/
│   │   ├── customers_cleaned.csv
│   │   └── orders_cleaned.csv
│   └── raw/
│       ├── customers.csv
│       └── order.csv
├── notebooks/
│   └── model.ipynb
├── src/
│   └── app/
│       ├── database_utils.py
│       ├── import_data.py
│       ├── ml_utils.py
│       └── test_db_connection.py
├── .env
├── .gitignore
├── README.md
└── requirements.txt

Setup Instructions

  1. Clone this repository:
git clone <repository-url>
cd <repository-name>
  2. Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install the required packages:
pip install -r requirements.txt
  4. Create a MySQL database and tables:
CREATE DATABASE your_database;
USE your_database;

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name VARCHAR(255),
    email VARCHAR(255)
);

CREATE TABLE orders (
    id INT PRIMARY KEY,
    display_order_id VARCHAR(10),
    total_amount DECIMAL(10, 2),
    created_at DATETIME,
    customer_id INT
);
  5. Set up environment variables: create a .env file in the root directory with the following configuration:
DB_HOST=localhost
DB_USER=your_username
DB_PASSWORD=your_password
DB_NAME=your_database
  6. Run the config file:
python config/config.py
  7. Import the initial data:
python src/app/import_data.py

  8. Run the database_utils file:
python src/app/database_utils.py
  9. Check whether the database connection is established:
python src/app/test_db_connection.py
  10. Run the Streamlit app:
streamlit run src/app/streamlit_app.py
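Before running the database steps, it can help to confirm that the .env file defines all four DB_* variables. The following is a minimal sketch using only the standard library. The helper names `load_env_file` and `get_db_settings` are hypothetical and are not taken from config/config.py, which may load the file differently.

```python
from pathlib import Path

# The four variable names listed in the .env step above.
REQUIRED_KEYS = ("DB_HOST", "DB_USER", "DB_PASSWORD", "DB_NAME")


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def get_db_settings(path=".env"):
    """Return the four DB_* values, raising KeyError if any is missing or empty."""
    settings = load_env_file(path)
    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        raise KeyError(f"Missing settings in {path}: {', '.join(missing)}")
    return {key: settings[key] for key in REQUIRED_KEYS}
```

Calling `get_db_settings()` from the project root returns a dictionary with the four values, or raises a `KeyError` naming whichever entries are absent.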

Features

  • Date range filtering for orders
  • Minimum spend and order count filters
  • Top 10 customers visualization
  • Revenue over time analysis
  • Summary metrics
  • Detailed order data table
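The filters and the top-customers view listed above could be expressed as queries over the `customers` and `orders` tables from the setup steps. The sketch below is illustrative only: the README does not show the app's actual queries, the sample rows are made up, the function names `top_customers` and `revenue_by_month` are hypothetical, and an in-memory SQLite database stands in for MySQL so the example runs without a server.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the schema mirrors the setup steps.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INT PRIMARY KEY, name VARCHAR(255), email VARCHAR(255));
CREATE TABLE orders (id INT PRIMARY KEY, display_order_id VARCHAR(10),
                     total_amount DECIMAL(10, 2), created_at DATETIME, customer_id INT);
-- Made-up sample rows for illustration.
INSERT INTO customers VALUES (1, 'Ana', 'ana@example.com'), (2, 'Ben', 'ben@example.com');
INSERT INTO orders VALUES
  (1, 'A-1', 40.00, '2024-01-05 10:00:00', 1),
  (2, 'A-2', 60.00, '2024-01-20 12:30:00', 1),
  (3, 'A-3', 25.00, '2024-02-02 09:15:00', 2),
  (4, 'A-4', 90.00, '2024-03-01 18:45:00', 2);
""")


def top_customers(conn, start, end, min_spend=0, min_orders=1, limit=10):
    """Rank customers by spend within [start, end), applying both filters."""
    query = """
        SELECT c.customer_id, c.name,
               COUNT(o.id) AS order_count,
               SUM(o.total_amount) AS total_spent
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE o.created_at >= ? AND o.created_at < ?
        GROUP BY c.customer_id, c.name
        HAVING SUM(o.total_amount) >= ? AND COUNT(o.id) >= ?
        ORDER BY total_spent DESC
        LIMIT ?
    """
    return conn.execute(query, (start, end, min_spend, min_orders, limit)).fetchall()


def revenue_by_month(conn, start, end):
    """Sum order totals per calendar month within [start, end)."""
    # strftime is SQLite syntax; in MySQL, DATE_FORMAT(created_at, '%Y-%m') plays this role.
    query = """
        SELECT strftime('%Y-%m', created_at) AS month, SUM(total_amount) AS revenue
        FROM orders
        WHERE created_at >= ? AND created_at < ?
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query, (start, end)).fetchall()


top = top_customers(conn, "2024-01-01", "2024-03-01", min_spend=50)
monthly = revenue_by_month(conn, "2024-01-01", "2024-04-01")
```

Passing the filter values as query parameters rather than formatting them into the SQL string keeps user input from the dashboard widgets out of the query text.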

Machine Learning Model

The application includes a linear regression model that predicts whether a customer is likely to be a repeat purchaser based on their order history and spending patterns.
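As a rough illustration of the idea, a least-squares line can be fitted to a 0/1 "repeat purchaser" label and thresholded at 0.5. Everything in this sketch is an assumption: the single feature (total spend), the made-up training values, the threshold, and the function names are hypothetical, and the actual features and training code live in src/app/ml_utils.py and notebooks/model.ipynb, which are not shown here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Made-up training data: total spend per customer and whether they ordered again.
total_spend = [20.0, 35.0, 50.0, 120.0, 200.0, 260.0]
is_repeat = [0, 0, 0, 1, 1, 1]

slope, intercept = fit_line(total_spend, is_repeat)


def predict_repeat(spend, threshold=0.5):
    """Treat the fitted line's output as a score and compare it to the threshold."""
    return slope * spend + intercept >= threshold


low_spender = predict_repeat(30.0)
high_spender = predict_repeat(220.0)
```

A linear fit can produce scores below 0 or above 1, so the output here is best read as a ranking score rather than a probability.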

Project Components

  • src/app/streamlit_app.py: Main Streamlit application
  • src/app/database_utils.py: Database connection and query utilities
  • src/app/ml_utils.py: Machine learning model implementation
  • src/app/import_data.py: Data import utilities
  • src/app/test_db_connection.py: Database connection testing
  • config/config.py: Configuration settings
  • notebooks/model.ipynb: Notebook containing the data analysis and the LR model
  • requirements.txt: Required Python packages
  • .env: Environment variables (not tracked in git)

Development

  • Use notebooks/model.ipynb for data analysis and feature development
  • Run tests using: python -m pytest tests/
  • Update requirements using: pip freeze > requirements.txt

Note

  • Make sure to properly secure your database credentials using the .env file
  • Never commit sensitive information to the repository
  • For development, use the provided test database connection utility
  • Check the logs directory for any error messages during data import

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a new Pull Request
