
Note: Except for the ASL and Animal Classification models, all models run successfully. Thank you for reading!

UM-Projects

Unified Mentor Internship Projects

Project 1: Animal Image Classification

Project 2: Forest Cover Type Prediction


Project 1: Image Classification of Animals

Project Overview

This project involves building a deep learning model to classify images of animals into one of 15 distinct categories. The dataset consists of images for each class, and the model is designed to identify the correct animal based on the input image. Techniques such as Convolutional Neural Networks (CNN) and Transfer Learning are utilized to build an efficient and accurate classifier.


Dataset

The dataset consists of 15 folders, each containing images of a particular animal species. All images are 224x224 pixels with 3 color channels (RGB), making them suitable for image classification tasks.

Animal Classes:

  • Bear
  • Bird
  • Cat
  • Cow
  • Deer
  • Dog
  • Dolphin
  • Elephant
  • Giraffe
  • Horse
  • Kangaroo
  • Lion
  • Panda
  • Tiger
  • Zebra

Dataset Specifications:

  • Image Dimensions: 224x224 pixels
  • Color Channels: 3 (RGB)
  • Total Classes: 15

Project Structure

.
├── data/
│   ├── train/
│   │   ├── Bear/
│   │   ├── Bird/
│   │   ├── ... (13 other classes)
│   └── test/
│       ├── Bear/
│       ├── Bird/
│       ├── ... (13 other classes)
├── notebooks/
│   ├── Image_Classification.ipynb
├── models/
│   └── model.h5  # Trained model saved here
├── README.md
└── requirements.txt

Model Architecture

The project uses Convolutional Neural Networks (CNN) to extract features from the images and classify them into the correct animal category. Additionally, Transfer Learning is employed using pre-trained models like VGG16 or ResNet50 to improve performance.

CNN Model

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()

model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(15, activation='softmax'))  # 15 classes

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

Transfer Learning Model (Using VGG16)

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D

base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

base_model.trainable = False  # Freeze the base model

model = Sequential([
    base_model,
    GlobalAveragePooling2D(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(15, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

Training Process

The model is trained on the dataset using data augmentation techniques to increase model robustness and prevent overfitting.

Data Augmentation and Preprocessing

from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_height, img_width = 224, 224

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2  # 20% of data for validation
)

train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(img_height, img_width),
    batch_size=32,
    class_mode='categorical',
    subset='training'
)

validation_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(img_height, img_width),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)

Model Training

epochs = 10

history = model.fit(
    train_generator,
    validation_data=validation_generator,
    epochs=epochs,
    steps_per_epoch=train_generator.samples // train_generator.batch_size,
    validation_steps=validation_generator.samples // validation_generator.batch_size
)
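
The snippet above trains for a fixed 10 epochs but never saves the model, even though the project structure lists models/model.h5. Below is a minimal sketch of how checkpointing and early stopping could be added, assuming the model and generators defined above; the patience value is an illustrative choice:

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

# Save the best weights seen so far and stop once validation loss stalls.
callbacks = [
    ModelCheckpoint('models/model.h5', monitor='val_loss', save_best_only=True),
    EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
]

history = model.fit(
    train_generator,
    validation_data=validation_generator,
    epochs=epochs,
    callbacks=callbacks
)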

Evaluation and Results

After training the model, it was evaluated on the validation set:

loss, accuracy = model.evaluate(validation_generator)
print(f'Validation accuracy: {accuracy * 100:.2f}%')

The baseline CNN achieved an accuracy of 54.31% on the validation set, leaving room for improvement through techniques such as Transfer Learning and Hyperparameter Tuning (see Future Improvements below).


How to Run

Prerequisites:

  • Python 3.x
  • TensorFlow 2.x
  • Keras

Steps to Run:

  1. Clone the repository:

    git clone https://github.com/yourusername/animal-image-classification.git
  2. Install the required dependencies:

    pip install -r requirements.txt
  3. Organize the dataset into data/train and data/test folders.

  4. Train the model by running the Jupyter notebook:

    cd notebooks
    jupyter notebook Image_Classification.ipynb
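
After training, the saved model can be used for single-image inference. A hedged sketch, assuming the model was saved to models/model.h5 and applying the same 1/255 rescaling used during training (the image path is purely illustrative):

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

model = load_model('models/model.h5')

# Load one image and preprocess it the same way as the training data.
img = image.load_img('data/test/Bear/example.jpg', target_size=(224, 224))  # illustrative path
x = image.img_to_array(img) / 255.0
x = np.expand_dims(x, axis=0)  # the model expects a batch dimension

probs = model.predict(x)[0]
print('Predicted class index:', np.argmax(probs))
# Map the index back to a class name via train_generator.class_indices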

Future Improvements

  • Transfer Learning: Implementing pre-trained models such as ResNet50, InceptionV3, or EfficientNet to boost model performance.
  • Fine-Tuning: Unfreezing some layers of the pre-trained models for fine-tuning on the specific dataset (a brief sketch follows this list).
  • Hyperparameter Tuning: Experimenting with batch size, learning rate, and number of layers for better accuracy.
  • Data Augmentation: Further experimenting with data augmentation techniques like brightness and contrast shifts.
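
As referenced in the Fine-Tuning bullet, here is a minimal sketch of how the VGG16 model above could be fine-tuned after the initial frozen-base training; the number of unfrozen layers and the learning rate are illustrative choices:

from tensorflow.keras.optimizers import Adam

# Unfreeze only the last few layers of VGG16 (roughly the final conv block)
# and retrain at a low learning rate to gently adapt the pre-trained weights.
base_model.trainable = True
for layer in base_model.layers[:-4]:
    layer.trainable = False

model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history_ft = model.fit(
    train_generator,
    validation_data=validation_generator,
    epochs=5  # a few additional epochs are typically enough for fine-tuning
)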

Feel free to fork this repository and contribute by creating pull requests!



Project 2: Forest Cover Type Prediction

Objective

The objective of this project is to build a machine learning model that predicts the type of forest cover in a given area using various environmental features. The dataset used for this project is derived from the Roosevelt National Forest in northern Colorado, provided by the forest department.

Problem Statement

We aim to classify the type of forest cover in a 30m x 30m patch of land into one of the seven forest cover types:

  1. Spruce/Fir
  2. Lodgepole Pine
  3. Ponderosa Pine
  4. Cottonwood/Willow
  5. Aspen
  6. Douglas-fir
  7. Krummholz

The model will use 12 primary features, including elevation, slope, soil type, and other environmental variables, to make accurate predictions.

Dataset

The dataset consists of both categorical and numerical features. The key features include:

  • Elevation: Elevation in meters.
  • Aspect: Aspect in degrees azimuth.
  • Slope: Slope in degrees.
  • Horizontal and Vertical Distances: Horizontal distances to the nearest hydrology, roadways, and fire points, plus the vertical distance to hydrology.
  • Hillshade: Hillshade index at various times of the day.
  • Wilderness Area: Binary columns indicating different wilderness areas.
  • Soil Type: Binary columns for different soil types.
  • Cover_Type: The target variable representing the forest cover type.

Project Workflow

  1. Data Preprocessing: The dataset is preprocessed, including handling missing values (if any), feature scaling, and encoding categorical variables.
  2. Exploratory Data Analysis (EDA): Initial exploration of the dataset to understand feature distributions and relationships (a brief sketch follows this list).
  3. Model Building: Various machine learning models are tested, including:
    • Random Forest Classifier
    • Support Vector Machine (SVM)
    • XGBoost
  4. Model Evaluation: Models are evaluated using metrics like accuracy, precision, recall, and a confusion matrix.
  5. Model Saving: The trained model is saved using joblib for future use or deployment.
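
As referenced in step 2, a brief EDA sketch, assuming the dataset file forest_cover.csv from the Running the Project steps and a Cover_Type target column:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('forest_cover.csv')

# Shape and summary statistics of the numerical features.
print(df.shape)
print(df.describe())

# Class balance of the target; a strong imbalance would affect model choice.
print(df['Cover_Type'].value_counts())

# Correlations among the first ten (numerical) columns.
sns.heatmap(df.iloc[:, :10].corr(), annot=True, fmt='.2f', cmap='coolwarm')
plt.title('Correlation between numerical features')
plt.show()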

Installation

Requirements:

To run this project, you'll need to install the following libraries:

pip install numpy pandas scikit-learn matplotlib seaborn joblib xgboost

Running the Project:

  1. Clone this repository:
    git clone https://github.com/your_username/forest-cover-type-prediction.git
  2. Upload your dataset to the working directory (e.g., forest_cover.csv).
  3. Open the Jupyter notebook or Google Colab file and follow the steps provided in the notebook.

Project Structure

├── forest_cover_type_model.pkl     # Trained Random Forest model
├── forest_cover.csv                # Dataset used for training and testing
├── README.md                       # Project documentation
├── requirements.txt                # Required libraries
└── notebook.ipynb                  # Jupyter/Colab notebook for the project

Usage

1. Data Loading

First, load the dataset into a pandas DataFrame:

import pandas as pd
df = pd.read_csv('forest_cover.csv')

2. Data Preprocessing

Preprocess the dataset by scaling numerical features and splitting the data into training and test sets:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Separate features and target
X = df.drop('Cover_Type', axis=1)
y = df['Cover_Type']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale only the numerical columns (assumed to be the first 10),
# keeping the binary wilderness-area and soil-type columns unchanged
num_cols = X.columns[:10]
scaler = StandardScaler()
X_train_scaled = X_train.copy()
X_test_scaled = X_test.copy()
X_train_scaled[num_cols] = scaler.fit_transform(X_train[num_cols])
X_test_scaled[num_cols] = scaler.transform(X_test[num_cols])

3. Model Training

Train the Random Forest Classifier:

from sklearn.ensemble import RandomForestClassifier

# Initialize and train the model
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train_scaled, y_train)
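
The workflow also lists SVM and XGBoost; a hedged sketch of training them on the same split for comparison (XGBClassifier expects zero-based labels, so the 1-7 cover types are shifted to 0-6):

from sklearn.svm import SVC
from xgboost import XGBClassifier

# SVM benefits from the scaled features prepared above;
# note it can be slow on large datasets.
svm_model = SVC(kernel='rbf', random_state=42)
svm_model.fit(X_train_scaled, y_train)
print('SVM accuracy:', svm_model.score(X_test_scaled, y_test))

# XGBoost requires class labels starting at 0, so shift 1-7 down to 0-6.
xgb_model = XGBClassifier(n_estimators=100, random_state=42)
xgb_model.fit(X_train_scaled, y_train - 1)
print('XGBoost accuracy:', (xgb_model.predict(X_test_scaled) == y_test - 1).mean())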

4. Model Evaluation

Evaluate the model on test data:

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Predictions
y_pred = rf_model.predict(X_test_scaled)

# Accuracy and Classification Report
print(f'Accuracy: {accuracy_score(y_test, y_pred)}')
print(f'Classification Report:\n{classification_report(y_test, y_pred)}')

# Confusion Matrix
print(f'Confusion Matrix:\n{confusion_matrix(y_test, y_pred)}')
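
For seven classes the raw confusion matrix is hard to read; a heatmap via seaborn (already in the requirements) makes misclassification patterns easier to spot. A minimal sketch, assuming the class order 1-7 matches the list in the Problem Statement:

import matplotlib.pyplot as plt
import seaborn as sns

class_names = ['Spruce/Fir', 'Lodgepole Pine', 'Ponderosa Pine',
               'Cottonwood/Willow', 'Aspen', 'Douglas-fir', 'Krummholz']

# Rows are actual cover types, columns are predicted cover types.
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d',
            xticklabels=class_names, yticklabels=class_names)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()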

5. Save and Load the Model

Save the trained model for future use:

import joblib
joblib.dump(rf_model, 'forest_cover_type_model.pkl')

# Load the model for future predictions
loaded_model = joblib.load('forest_cover_type_model.pkl')
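
The reloaded model can then score new samples, provided they have the same columns and scaling as the training data; for example, using a few rows from the test split:

# Predict cover types (1-7) for the first five preprocessed test samples.
print(loaded_model.predict(X_test_scaled[:5]))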

Model Evaluation

Accuracy: XX.XX%

The Random Forest model achieved an accuracy of XX.XX% on the test set.

Confusion Matrix:

A 7x7 matrix with one row per actual cover type and one column per predicted
cover type; diagonal entries count correct predictions.

Classification Report:

Class               Precision   Recall   F1-score
Spruce/Fir          x.xx        x.xx     x.xx
Lodgepole Pine      x.xx        x.xx     x.xx
...                 ...         ...      ...
