Fraud Detection System

A containerized fraud detection system that leverages machine learning to identify fraudulent credit card transactions. This project demonstrates a complete data science pipeline—from data processing and model training to real-time API monitoring and BI visualization using Apache Superset.

Overview

This project is a comprehensive solution for fraud detection in credit card transactions. It includes:

Data Processing: Loading, cleaning, and preprocessing data.
Model Training: Using a RandomForest classifier for fraud detection.
Streaming Simulation: Generating synthetic transaction data and simulating a live data stream.
API Endpoints: Providing endpoints to monitor system status, metrics, statistics, and transactions.
Visualization: Leveraging Apache Superset for interactive dashboarding and data exploration.
Containerization: Packaging all services using Docker and orchestrating them with Docker Compose.

Features

Machine Learning Model: Trained with historical data to detect fraudulent transactions.
Real-time Data Processing: Simulates a continuous stream of transactions with on-the-fly model updates.
REST API: Exposes endpoints for tracking model metrics and transaction logs.
BI Dashboard: Integrated Apache Superset instance for visual exploration of transaction data.
Modular Codebase: Structured into multiple modules for ease of testing and scaling.
Containerized Deployment: Easily deployable with Docker and Docker Compose.

Repository Structure

fraud-detection-system/
├── app/
│   ├── main.py                  # Main application code (Flask server)
│   ├── utils/                   # Utility modules
│   │   ├── data_processing.py   # Data preprocessing functions
│   │   ├── model.py             # Machine learning model functions
│   │   └── metrics.py           # Model metric calculation and logging
├── data/
│   └── test_creditcard_2023.csv # Demo dataset for testing
├── docs/
│   ├── model_explanation.md     # Explanation of the ML model
│   ├── api_reference.md         # API documentation
│   └── system_architecture.md   # System architecture overview
├── notebooks/
│   ├── exploratory_analysis.ipynb # Exploratory data analysis of transactions
│   └── model_development.ipynb  # Model development and testing
├── superset/
│   ├── Dockerfile               # Dockerfile for Superset service
│   └── superset_config.py       # Superset configuration
├── tests/                       # Unit tests for the project
├── .env.example                 # Template for environment variables
├── .gitignore
├── docker-compose.yml           # Docker Compose configuration for all services
├── Dockerfile                   # Dockerfile for the Flask application
├── LICENSE                      # Project license (e.g., MIT)
├── README.md                    # Project README (this file)
└── requirements.txt             # Python dependencies

Installation

Clone the Repository:

git clone https://github.com/yourusername/fraud-detection-system.git
cd fraud-detection-system

Configure Environment Variables:
- Copy the environment variables template:
```
cp .env.example .env
```
- Modify the .env file if necessary.
Build and Run with Docker Compose:
```
docker-compose up --build
```
This command starts the following services:
- db: PostgreSQL database.
- web: Flask application running on port 8000.
- superset: Apache Superset dashboard running on port 8088.

Usage

Flask API:
- Check the app status at: http://localhost:8000/
- The app simulates real-time data ingestion and updates model metrics.
Apache Superset:
- Access the Superset dashboard at: http://localhost:8088/
- Use Superset to build interactive dashboards and explore transaction data.

API Endpoints

GET /
Returns the application status and current server time.

{
  "status": "running",
  "time": "2023-10-XXTXX:XX:XX"
}

GET /metrics
Retrieves the latest model performance metrics (precision, recall, F1 score, and accuracy).
GET /statistics
Provides detailed statistics (mean and standard deviation) for each feature.
GET /transactions
Lists the latest 100 transactions, including actual and predicted classes.

For more details, refer to docs/api_reference.md.

Architecture

The system is composed of several modular components:

Data Processing & Model Training:
Located in app/utils/, these modules handle data cleaning, feature scaling, model training, and predictions.
Flask Application:
Acts as the API gateway for handling data streams, processing transactions, and logging to the database.
Database:
A PostgreSQL database stores transaction logs, model metrics, and data statistics.
BI Dashboard:
Apache Superset (configured in the superset/ directory) provides interactive data visualization capabilities.
Containerization:
Docker and Docker Compose ensure environment consistency and ease of deployment.

For additional details, see docs/system_architecture.md.

Testing

Unit tests are provided in the tests/ directory. To run the tests:

# Install dependencies if not already installed
pip install -r requirements.txt

# Run all tests using unittest
python -m unittest discover -s tests

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any questions or feedback, please contact:
Uladzimir Manulenka
[email protected]

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Fraud Detection System

Table of Contents

Overview

Features

Repository Structure

Installation

Usage

API Endpoints

Architecture

Testing

License

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
app		app
docs		docs
notebooks		notebooks
superset		superset
tests		tests
.env.example		.env.example
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
requirements.txt		requirements.txt

License

Jim-by/fraud-detection-system

Folders and files

Latest commit

History

Repository files navigation

Fraud Detection System

Table of Contents

Overview

Features

Repository Structure

Installation

Usage

API Endpoints

Architecture

Testing

License

Contact

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages