A real-time streaming application that detects edit wars on Wikipedia using Apache Kafka, Spring Boot, React, and Docker.
It monitors the Wikimedia EventStreams API in real time and detects patterns that indicate edit wars: situations where multiple users repeatedly revert each other's changes on the same article.
Real Detection: Successfully detected edit wars on pages like Frederick Trump, Hans van Manen, and more!
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Wikimedia API  │────▶│ Kafka Producer  │────▶│  Apache Kafka   │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                                                         ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ React Frontend  │◀────│    REST API     │◀────│ Kafka Consumer  │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                                                         ▼
                                                ┌─────────────────┐
                                                │   PostgreSQL    │
                                                └─────────────────┘
| Component | Description |
|---|---|
| kafka-producer-api | Streams real-time Wikipedia edits to Kafka |
| kafka-consumer-api | Consumes events, detects edit wars, exposes REST API |
| React Frontend | Dashboard displaying real-time alerts (separate repo) |
- Spring Boot 3.5.6 - Application framework
- Apache Kafka (KRaft) - Event streaming (no ZooKeeper required)
- Spring WebFlux - Reactive programming & Server-Sent Events
- PostgreSQL 15 - Database persistence
- Spring Data JPA - ORM with Hibernate
- React 18 + TypeScript - Frontend dashboard
- Docker & Docker Compose - Containerization
- JUnit 5 & Mockito - Testing with TDD approach
The fastest way to run the entire stack:
- Docker and Docker Compose installed
- Git
# Clone repository
git clone https://github.com/YOUR_USERNAME/springboot-kafka-realtime.git
cd springboot-kafka-realtime
# Create environment file
cp .env.example .env
# (Optional) Edit .env to change database credentials
# Build and start all services
docker-compose up --build
# Or run in background
docker-compose up --build -d
This starts:
- PostgreSQL - Database with schema auto-initialized
- Apache Kafka - Message broker (KRaft mode)
- Producer - Streams Wikipedia events to Kafka
- Consumer - Detects edit wars, serves REST API on port 8081
# Check all containers are running
docker-compose ps
# View logs
docker-compose logs -f
# Test the API
curl http://localhost:8081/api/health | jq
curl http://localhost:8081/api/stats | jq
curl http://localhost:8081/api/alerts | jq
# Stop all services
docker-compose down
# To also remove the database volume (fresh start)
docker-compose down -v
If you prefer running services locally:
- Java 21+
- Apache Kafka 3.8+ (KRaft mode)
- PostgreSQL 15+
- Maven 3.8+
# Create database and user
psql -U postgres
CREATE DATABASE editwars_detection;
CREATE USER editwar_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE editwars_detection TO editwar_user;
\c editwars_detection
-- Run the schema migration (start psql from the repository root so the relative path resolves)
\i kafka-consumer-api/src/main/resources/db/migration/V1__init_schema.sql
\q
# Download and extract Kafka
# If this URL returns 404, older releases are kept at https://archive.apache.org/dist/kafka/3.8.0/
wget https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz
tar -xzf kafka_2.13-3.8.0.tgz
cd kafka_2.13-3.8.0
# Generate cluster ID and format storage (first time only)
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server.properties
# Start Kafka
bin/kafka-server-start.sh config/kraft/server.properties
# Build project (from the repository root)
./mvnw clean install
# Start Consumer (in one terminal)
cd kafka-consumer-api
../mvnw spring-boot:run
# Start Producer (in another terminal)
cd kafka-producer-api
../mvnw spring-boot:run
Base URL: http://localhost:8081/api
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /stats | System statistics |
| GET | /alerts | Get all alerts (paginated) |
| GET | /alerts/{id} | Get specific alert |
| GET | /alerts/search?q={keyword} | Search by page title |
| GET | /alerts/status/{status} | Filter by status |
| GET | /alerts/severity/{level} | Filter by severity |
| GET | /alerts/recent | Recent active alerts |
| POST | /test/simulate-edit-war | Simulate test data |
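The endpoints above can also be called from Java. The following is a minimal sketch using the JDK's built-in `java.net.http.HttpClient`; the class name `AlertApiClient` and its helper methods are hypothetical and not part of this repository, and the response is returned as a raw JSON string because the exact response schema is not documented here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for illustration only; not part of the project.
public class AlertApiClient {
    // Base URL taken from the README; adjust if the consumer runs elsewhere.
    static final String BASE_URL = "http://localhost:8081/api";

    // Builds the search URI, URL-encoding the keyword (spaces become '+').
    public static URI searchUri(String keyword) {
        String q = URLEncoder.encode(keyword, StandardCharsets.UTF_8);
        return URI.create(BASE_URL + "/alerts/search?q=" + q);
    }

    // Sends a GET request and returns the raw response body.
    public static String get(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            System.out.println(get(searchUri("Frederick Trump")));
        } catch (IOException e) {
            // Typically means the consumer service is not running.
            System.err.println("API not reachable: " + e.getMessage());
        }
    }
}
```

URL-encoding the keyword matters for multi-word page titles, which would otherwise produce an invalid query string.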
# Get statistics
curl http://localhost:8081/api/stats | jq
{
"totalAlerts": 12,
"activeAlerts": 12,
"resolvedAlerts": 0
}
# Search for alerts
curl "http://localhost:8081/api/alerts/search?q=trump" | jq
# Get high severity alerts
curl http://localhost:8081/api/alerts/severity/HIGH | jq
An edit war is detected when:
- 5+ edits on the same article within 1 hour
- 2-3 distinct human editors (bots excluded)
- Main namespace only (articles, not talk pages)
- 50%+ conflict ratio (reverts or opposing changes)
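The four criteria above can be sketched as a single predicate. This is an illustration under stated assumptions, not the project's actual `EditWarDetectionService`: the `Edit` record and its fields are hypothetical, the `conflicting` flag is assumed to be computed elsewhere, bot edits are assumed to be excluded from the edit count as well as the editor count, and namespace `0` is MediaWiki's main (article) namespace.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of the detection criteria; names are illustrative.
public class EditWarCriteria {
    // One edit to a single article. 'conflicting' marks a revert or opposing change.
    public record Edit(String user, boolean bot, int namespace,
                       Instant timestamp, boolean conflicting) {}

    public static boolean isEditWar(List<Edit> edits, Instant now) {
        Instant cutoff = now.minus(Duration.ofHours(1));

        // Keep human edits to main-namespace pages within the last hour.
        List<Edit> recent = edits.stream()
                .filter(e -> e.namespace() == 0)
                .filter(e -> !e.bot())
                .filter(e -> !e.timestamp().isBefore(cutoff))
                .collect(Collectors.toList());

        // Criterion 1: at least 5 edits in the window.
        if (recent.size() < 5) {
            return false;
        }

        // Criterion 2: 2-3 distinct human editors.
        Set<String> editors = recent.stream()
                .map(Edit::user)
                .collect(Collectors.toSet());
        if (editors.size() < 2 || editors.size() > 3) {
            return false;
        }

        // Criterion 4: at least 50% of the edits are conflicting.
        long conflicts = recent.stream().filter(Edit::conflicting).count();
        return conflicts * 2 >= recent.size();
    }
}
```

Comparing `conflicts * 2` against the edit count keeps the 50% check in integer arithmetic and avoids floating-point rounding at the boundary.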
| Type | Description |
|---|---|
| Pure Reverts | Edit returns article to a previous length |
| Opposing Edits | One user adds content, another removes it |
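One possible reading of the two conflict types in the table above, expressed in code. This is a simplified sketch, not the project's algorithm: the `Edit` record, the `ConflictType` enum, and the rule of comparing only consecutive edits for opposing changes are assumptions made for illustration. It relies only on article lengths before and after each edit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical classifier for illustration; names are not from the project.
public class ConflictClassifier {
    public enum ConflictType { PURE_REVERT, OPPOSING_EDIT, NONE }

    // Article length in bytes before and after a single edit.
    public record Edit(String user, int oldLength, int newLength) {}

    // Classifies each edit in chronological order.
    public static List<ConflictType> classify(List<Edit> edits) {
        List<ConflictType> result = new ArrayList<>();
        Set<Integer> seenLengths = new HashSet<>();
        Edit previous = null;

        for (Edit e : edits) {
            int delta = e.newLength() - e.oldLength();
            ConflictType type = ConflictType.NONE;

            if (delta != 0 && seenLengths.contains(e.newLength())) {
                // The article is back at a length it had earlier.
                type = ConflictType.PURE_REVERT;
            } else if (previous != null && !previous.user().equals(e.user())) {
                int previousDelta = previous.newLength() - previous.oldLength();
                // A different user added where the last one removed, or vice versa.
                if (Integer.signum(previousDelta) * Integer.signum(delta) < 0) {
                    type = ConflictType.OPPOSING_EDIT;
                }
            }

            result.add(type);
            seenLengths.add(e.oldLength());
            seenLengths.add(e.newLength());
            previous = e;
        }
        return result;
    }
}
```

Length-based matching is a heuristic: two unrelated edits can produce the same article length, so a real implementation would likely combine it with other signals.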
| Level | Score | Description |
|---|---|---|
| CRITICAL | ≥0.8 | Intense, rapid conflict |
| HIGH | ≥0.6 | Significant edit war |
| MEDIUM | ≥0.4 | Moderate conflict |
| LOW | <0.4 | Minor disagreement |
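The thresholds in the table map directly onto a small enum. The sketch below assumes the score is a `double` between 0 and 1; the `fromScore` method name is hypothetical, and how the score itself is computed is not covered here.

```java
// Hypothetical mapping from a conflict score to a severity level.
public enum Severity {
    CRITICAL, HIGH, MEDIUM, LOW;

    // Thresholds are checked from highest to lowest, so each bound is inclusive.
    public static Severity fromScore(double score) {
        if (score >= 0.8) {
            return CRITICAL;
        }
        if (score >= 0.6) {
            return HIGH;
        }
        if (score >= 0.4) {
            return MEDIUM;
        }
        return LOW;
    }

    public static void main(String[] args) {
        System.out.println(fromScore(0.72)); // prints HIGH
    }
}
```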
Test-Driven Development (TDD) approach with comprehensive coverage:
# Run all tests
./mvnw test
# Run specific test suites
./mvnw test -Dtest=AlertServiceTest
./mvnw test -Dtest=AlertControllerTest
./mvnw test -Dtest=EditWarDetectionServiceTest
./mvnw test -Dtest=PageEditWindowTest
- Unit tests for services, repositories, mappers
- Integration tests with H2 in-memory database
- REST API tests with WebTestClient
- Edit war detection algorithm tests
springboot-kafka-realtime/
├── docker-compose.yml               # Container orchestration
├── .env.example                     # Environment template
├── kafka-producer-api/              # Wikimedia → Kafka
│   ├── Dockerfile
│   ├── src/main/java/.../
│   │   ├── ApiRealTimeChangesProducer.java
│   │   ├── ApiRealTimeChangesHandler.java
│   │   └── KafkaTopicConfig.java
│   └── src/main/resources/
│       ├── application.properties
│       └── application-docker.properties
├── kafka-consumer-api/              # Kafka → Detection → API
│   ├── Dockerfile
│   ├── src/main/java/.../
│   │   ├── controller/              # REST endpoints
│   │   ├── service/                 # Business logic
│   │   ├── entity/                  # Domain models
│   │   └── persistence/             # Database layer
│   └── src/main/resources/
│       ├── application.properties
│       ├── application-docker.properties
│       └── db/migration/            # SQL schemas
└── README.md
| Service | Image | Port | Description |
|---|---|---|---|
| postgres | postgres:15-alpine | 5433:5432 | Database |
| kafka | apache/kafka:latest | 9092:9092 | Message broker |
| producer | Custom build | - | Wikimedia streamer |
| consumer | Custom build | 8081:8081 | API server |
Create a .env file (see .env.example):
POSTGRES_DB=editwars_detection
POSTGRES_USER=editwar_user
POSTGRES_PASSWORD=your_secure_password
# View logs for specific service
docker-compose logs -f consumer
# Rebuild single service
docker-compose up --build consumer
# Access PostgreSQL
docker exec -it editwars-postgres psql -U editwar_user -d editwars_detection
# Check Kafka topics
docker exec -it editwars-kafka /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
- Real-time processing - Processes Wikipedia edits as they happen
- Pattern recognition - Sophisticated conflict detection algorithm
- Reactive architecture - Non-blocking I/O with Spring WebFlux
- Database persistence - PostgreSQL with JPA/Hibernate
- RESTful API - Comprehensive endpoints with pagination
- Containerized - One-command deployment with Docker Compose
- Test-driven - Extensive test coverage
- Production-ready - Error handling, logging, health checks
# Port conflict with a PostgreSQL instance running locally:
# map a free host port in docker-compose.yml (e.g. "5433:5432" instead of "5432:5432")
# Or stop local PostgreSQL
sudo systemctl stop postgresql
# Check producer logs
docker-compose logs -f producer
# Verify Kafka is receiving messages
docker exec -it editwars-kafka /opt/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic wikimedia-stream-api \
--from-beginning
# Reset database (removes all data)
docker-compose down -v
docker-compose up --build
Seeing no alerts is normal: real edit wars are rare (~0.01% of edits). Use the test endpoint to generate sample data:
curl -X POST http://localhost:8081/api/test/simulate-edit-war | jq
- Repository Pattern (data access)
- Mapper Pattern (DTO conversion)
- Observer Pattern (event-driven)
- Builder Pattern (object construction)
- Clean Architecture / Layered Architecture
- Separation of Concerns
- Dependency Inversion
- Single Responsibility
- Test-Driven Development (TDD)
- Spring Profiles for environment configuration
- Docker multi-stage builds
- Health checks for container orchestration
MIT License - See LICENSE file for details
Eugene Paitoo
Star this repo if you find it useful!
This project demonstrates real-time stream processing, event-driven architecture, containerization, and production-grade Java development practices.