This project applies convolutional neural networks (CNNs) to classify facial images into six emotion categories — Happy, Sad, Fear, Pain, Anger, and Disgust.
It focuses on developing a compact, interpretable, and generalisable model for sentiment understanding using deep learning, with emphasis on augmentation strategies, capacity control, and domain adaptation.
The dataset was compiled from multiple online sources and is available on Kaggle as "New Domain 6 Emotions".
The goal is to build a deep learning pipeline that can:
- Accurately predict human emotions from facial imagery.
- Handle small, imbalanced datasets through augmentation and class weighting.
- Improve generalisation with targeted data transformations and adaptive fine-tuning.
- Benchmark performance under domain shift scenarios.
| Emotion | Sample Count | Share (%) |
|---|---|---|
| Happy | 230 | 20.0 |
| Sad | 224 | 19.5 |
| Anger | 214 | 18.6 |
| Pain | 162 | 14.1 |
| Disgust | 159 | 13.8 |
| Fear | 159 | 13.8 |
Dataset size: 1,148 labeled images.
Class imbalance mitigated using WeightedRandomSampler and weighted loss.
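The sampler-plus-weighted-loss combination can be sketched as follows. The per-class counts come from the table above; everything else (inverse-frequency weighting, replacement sampling) is a standard recipe, not necessarily the exact configuration used here.

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Per-image labels reconstructed from the class counts in the table above
# (0=Happy, 1=Sad, 2=Anger, 3=Pain, 4=Disgust, 5=Fear).
labels = torch.tensor([0] * 230 + [1] * 224 + [2] * 214
                      + [3] * 162 + [4] * 159 + [5] * 159)

# Inverse-frequency weight per class, then one weight per sample.
class_counts = torch.bincount(labels).float()
class_weights = 1.0 / class_counts
sample_weights = class_weights[labels]

# Draws classes roughly uniformly when passed to a DataLoader as sampler=...
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(labels),
                                replacement=True)

# The same class weights feed the weighted cross-entropy with label smoothing.
criterion = torch.nn.CrossEntropyLoss(weight=class_weights, label_smoothing=0.05)
```

In practice either the sampler or the loss weighting alone is often enough; using both, as the README states, doubly discounts the majority classes.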
- Parameters: 389,958
- Architecture: 4 Conv–ReLU–MaxPool blocks + GlobalAvgPool + Dropout + Linear
- Input size: 160×160×3
- Optimizer: AdamW (LR=1e-3, WD=1e-4)
- Loss: Weighted Cross-Entropy + Label Smoothing (0.05)
- Train/Val/Test Split: 70/15/15
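A minimal sketch of the architecture and optimizer described above. The 3×3 kernels and channel widths (32, 64, 128, 256) are assumptions, chosen because they reproduce the stated 389,958 parameter count exactly; the dropout rate is also an assumption.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Four Conv-ReLU-MaxPool blocks + global average pooling + dropout + linear."""
    def __init__(self, num_classes: int = 6):
        super().__init__()
        widths = [3, 32, 64, 128, 256]  # assumed channel progression
        blocks = []
        for c_in, c_out in zip(widths[:-1], widths[1:]):
            blocks += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)   # global average pooling
        self.dropout = nn.Dropout(0.5)        # rate is an assumption
        self.fc = nn.Linear(widths[-1], num_classes)

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)
        return self.fc(self.dropout(x))

model = SmallCNN()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
```

With a 160×160×3 input, each MaxPool halves the spatial size, so global average pooling makes the head independent of the exact input resolution.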
| Model | Accuracy | Macro Precision | Macro Recall | Macro F1 |
|---|---|---|---|---|
| SmallCNN (Baseline) | 0.1734 | 0.1320 | 0.2042 | 0.1272 |
The baseline performance barely exceeds random chance (≈16.7%), suggesting insufficient capacity and limited invariance to facial variations.
Augmentations: random resized crop, flip, small rotation (±10°), mild brightness/contrast jitter.
All else constant between augmented and non-augmented runs.
| Variant | Accuracy | Macro Precision | Macro Recall | Macro F1 |
|---|---|---|---|---|
| No Augmentation | 0.2197 | 0.1868 | 0.2366 | 0.1946 |
| With Augmentation | 0.2023 | 0.0745 | 0.2431 | 0.1126 |
Augmentation improved macro recall slightly (0.2366 → 0.2431) but sharply reduced precision and F1, suggesting the policy was too aggressive for a dataset this small and needed tuning.
| Model | Parameters | Accuracy | Macro Precision | Macro Recall | Macro F1 |
|---|---|---|---|---|---|
| SmallCNN | 390K | 0.2081 | 0.1556 | 0.2083 | 0.1334 |
| WiderCNN | 764K | 0.2197 | 0.2023 | 0.2161 | 0.1585 |
Observation: WiderCNN outperformed SmallCNN in macro metrics (+0.025 F1), validating moderate capacity scaling for small datasets.
After error analysis (e.g., Sad ↔ Fear, Pain ↔ Anger confusions), grayscale jitter and translation were added to simulate lighting and pose variability.
| Metric | Before | After |
|---|---|---|
| Accuracy | 0.1734 | 0.2081 |
| Macro Precision | 0.1320 | 0.1885 |
| Macro Recall | 0.2042 | 0.2478 |
| Macro F1 | 0.1272 | 0.1483 |
Targeted augmentation increased macro-F1 by 16.6%, reducing bias across emotional classes.
(Figures: results before refinement vs. after refinement.)
Evaluated model performance on a new dataset with different lighting and demographics.
| Variant | Accuracy | Macro Precision | Macro Recall | Macro F1 |
|---|---|---|---|---|
| Baseline | 0.1255 | 0.0417 | 0.1294 | 0.0586 |
| TTA (flip averaging) | 0.1255 | 0.0417 | 0.1294 | 0.0586 |
| Few-Shot Adapt (10 imgs/class) | 0.1365 | 0.0476 | 0.1386 | 0.0645 |
Flip-averaging TTA left every metric unchanged, while few-shot fine-tuning improved macro F1 by ~10% (0.0586 → 0.0645), indicating that even limited adaptation can partially recover performance under severe domain shift.
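The flip-averaging TTA variant can be sketched as follows: average the logits of the original and horizontally mirrored views before taking the argmax. This is a standard formulation and is assumed to match the project's variant.

```python
import torch

def predict_tta(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Average logits over identity and horizontal-flip views of a batch."""
    model.eval()
    with torch.no_grad():
        logits = model(x) + model(torch.flip(x, dims=[3]))  # dim 3 = width
    return (logits / 2).argmax(dim=1)
```

Since a horizontal flip preserves facial emotion labels, this costs one extra forward pass per batch and no retraining.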
Note: the reported results are approximate and may vary slightly between runs due to:
- Random weight initialisation
- Stochastic data loading and augmentation
- Hardware and runtime variability (especially on GPU environments)
Each experiment was executed under the same configuration, but metrics may still deviate by ±2–3% across environments.
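Run-to-run variance from the sources listed above can be reduced (though not eliminated across hardware) by pinning every seed. A common sketch, not necessarily the project's exact setup:

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Pin the main sources of randomness; the seed value itself is arbitrary."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trades some GPU speed for deterministic cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```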
| Category | Tools / Libraries |
|---|---|
| Language | Python 3.12 |
| Frameworks | PyTorch, Torchvision |
| Data Processing | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Metrics | scikit-learn |
| Environment | Google Colab / Local GPU |
- WiderCNN achieved the best overall balance with a 0.1585 Macro-F1, outperforming the smaller baseline.
- Error-driven augmentation refined the augmentation policy and improved F1 by ~16%.
- Few-shot adaptation boosted new-domain F1 from 0.0586 → 0.0645, confirming adaptability potential.
- Despite improvements, the task remains challenging due to high inter-class similarity and limited data diversity.
- Incorporate transfer learning from models pretrained on face datasets (e.g., a ResNet-18 backbone pretrained on VGGFace2).
- Experiment with Vision Transformers (ViT) for global emotion context.
- Integrate domain-adversarial training for better robustness under distribution shift.
- Expand dataset diversity and use synthetic data for underrepresented classes.
Author: Frank Dinh
Email: dinh.qnhat@gmail.com
Year: 2025
© 2025 All Rights Reserved. Unauthorized reuse or redistribution is prohibited.