This project combines computer vision, speech recognition, and real-time alerting to detect aggressive physical interactions between people and send notifications as soon as one is detected.
It was first presented at the Shilin Commercial High School Data Processing Department Project Competition, where it won First Place.
- Real-Time Human Pose Estimation: Uses YOLOv8n to run pose estimation continuously on live camera feeds.
- Aggression Scoring & Contact Tracking: Measures short-term physical contact between people and computes an aggression score, which helps reduce false positives.
- Integrated Voice Command Detection: Captures speech from a microphone using SpeechRecognition; predefined key phrases trigger alerts.
- Immediate Local TTS Notifications: Uses pyttsx3 to play voice alerts on-site.
- Discord Webhook Integration: Sends snapshots, aggression scores, and metadata to a dedicated Discord channel for remote monitoring.
- Evidence Archiving: Stores screenshots and event logs for auditing, verification, and later analysis.
- Extensible Flask API: Provides endpoints for health checks and log retrieval, and leaves room for future integrations.
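The aggression scoring and contact tracking feature can be illustrated with a minimal sketch. This README does not document the actual formula, so everything here is an assumption: the class name `AggressionScorer`, the 30-frame window, the 40-pixel contact distance, and the choice of wrists and torso center as the tracked points are all hypothetical. The sketch only shows why a score averaged over a short window of frames stays low for a single brief contact, which is one way to reduce false positives.

```python
from collections import deque
from math import hypot


class AggressionScorer:
    """Hypothetical sliding-window scorer (not the project's actual formula)."""

    def __init__(self, window=30, contact_dist=40.0):
        self.contact_dist = contact_dist      # assumed pixel distance that counts as contact
        self.history = deque(maxlen=window)   # 1 = contact in that frame, 0 = no contact

    def update(self, wrists_a, torso_b):
        """Record one frame; return the fraction of the window with contact (0.0 to 1.0)."""
        contact = any(
            hypot(wx - torso_b[0], wy - torso_b[1]) <= self.contact_dist
            for wx, wy in wrists_a
        )
        self.history.append(1 if contact else 0)
        # Divide by the full window size so a short burst of contact scores low.
        return sum(self.history) / self.history.maxlen
```

In a real pipeline the wrist and torso coordinates would come from the pose model's keypoints for each pair of detected people, and `update` would be called once per frame.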
- Core Model: YOLOv8n (Ultralytics)
- Computer Vision: OpenCV
- Speech-to-Text: SpeechRecognition
- Text-to-Speech: pyttsx3
- Backend & API: Flask (optional)
- Notifications: Discord Webhook
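As an illustration of the notification layer, here is a minimal sketch of building and posting an alert to a Discord webhook using only the standard library. The function names `build_alert_payload` and `send_alert`, the message text, and the embed field layout are assumptions made for this example, not the project's actual code; the webhook URL is supplied by the caller. Snapshot upload is left out, since attaching a file requires a multipart request rather than plain JSON.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_alert_payload(score, source, when=None):
    """Build a Discord webhook JSON body; the field layout is an illustrative assumption."""
    when = when or datetime.now(timezone.utc)
    return {
        "content": "Aggression alert",
        "embeds": [{
            "title": "Possible aggressive interaction",
            "timestamp": when.isoformat(),
            "fields": [
                {"name": "Score", "value": f"{score:.2f}", "inline": True},
                {"name": "Trigger", "value": source, "inline": True},
            ],
        }],
    }


def send_alert(webhook_url, payload, timeout=5):
    """POST the payload as JSON to the given webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "aggression-alert-sketch",  # explicit agent instead of urllib's default
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping payload construction separate from the network call makes the message format easy to check without contacting Discord.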
flowchart TD
A[Camera Input] --> B[OpenCV Preprocessing]
B --> C[YOLOv8 Pose Estimation]
C --> D[Contact Detection & Tracking]
D --> E[Aggression Scoring]
E -->|Score > Threshold| F[Alert Trigger]
A2[Microphone Input] --> B2[SpeechRecognition]
B2 -->|Keyword Detected| F
F --> G[Save Evidence]
F --> H[Discord Webhook Notification]
F --> I[Local TTS Warning]
H --> J[Remote Monitoring Channel]
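The alert trigger stage of the flowchart can be sketched as follows: an alert fires when the aggression score exceeds a threshold or when a keyword is heard, and the alert then fans out to evidence saving, the Discord notification, and the local TTS warning. The function names, the 0.6 threshold, the example keywords, and the handler dictionary are hypothetical; the project's real values and wiring are not documented here.

```python
def should_alert(score, transcript, threshold=0.6, keywords=("help", "stop")):
    """Return the trigger source ("pose" or "voice"), or None if no alert is needed."""
    if score > threshold:
        return "pose"
    text = transcript.lower()
    if any(keyword in text for keyword in keywords):
        return "voice"
    return None


def dispatch(source, score, handlers):
    """Send one alert to every handler; a failing handler does not block the others."""
    event = {"source": source, "score": score}
    results = {}
    for name, handler in handlers.items():
        try:
            handler(event)
            results[name] = "ok"
        except Exception as exc:  # e.g. the webhook is unreachable
            results[name] = f"failed: {exc}"
    return results
```

Isolating each handler in its own `try` block is a design choice for this sketch: a network failure on the Discord side should not prevent the local warning or the evidence from being recorded.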
- 1st Place: Shilin Commercial High School Data Processing Department Project Competition
- Successfully demonstrated real-time aggression detection and alerting in a live competition setting.
- Recognized for its practical application, multi-modal design, and technical robustness.
- A human pose estimation network based on YOLOv8 framework with efficient multi-scale receptive field and expanded feature pyramid network (Scientific Reports)
- Automated violence monitoring system for real-time fist fight detection using deep learning-based temporal action localization (Scientific Reports)
- Pose Estimation (Ultralytics YOLO Docs)
- How to receive webhooks in Python with Flask or Django (LogRocket Blog)
- How to Set Up Python Webhooks: 3 Simple Steps (Hevo Data)
- STEAM Education Learning Network (STEAM 教育學習網): OpenCV Tutorial Index
- You Only Look Once: Unified, Real-Time Object Detection (CVPR 2016)