Labels: middleware, monitoring, devops, medium-priority
Description:
Implement health check endpoints and middleware to monitor application status, dependencies, and readiness for production traffic.
Requirements:
Create multiple health check endpoints:
- /health (basic liveness check)
- /health/ready (readiness check - can serve traffic)
- /health/live (liveness check - application running)
- /health/detailed (comprehensive status - admin only)
Check status of critical dependencies:
- Database connectivity
- Redis cache connection
- External APIs (if critical)
- File system access
- Memory usage
Return appropriate HTTP status codes:
- 200 OK: All systems healthy
- 503 Service Unavailable: Critical failure
- 207 Multi-Status: Partial degradation
Include version information and uptime
Support caching of dependency health results (so that frequent probes do not overwhelm the dependencies)
Provide detailed error messages in degraded state
Integrate with orchestration tools (Kubernetes, Docker)
Log health check failures
Support graceful shutdown signaling
Acceptance Criteria:
- Load balancers can determine instance health
- Kubernetes uses health checks for pod management
- Health checks complete in under 1 second
- Critical dependency failures return 503
- Non-critical failures return 200 with warnings
- Detailed health check shows all dependency statuses
- No excessive database/cache queries from health checks
- Version and build information included
- Uptime tracked and reported
- Graceful shutdown support (return 503 when shutting down)
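
The status rules above could be sketched roughly as follows. This is only an illustration, assuming Python (the issue does not name a language); `CheckResult`, `CRITICAL`, and `overall_status` are hypothetical names, and treating the database and Redis as the critical set is an assumption. The sketch follows the acceptance criteria (200 with warnings for non-critical failures) rather than the 207 Multi-Status code listed under Requirements, so that choice would need to be confirmed.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumption: which dependencies count as "critical" is a project decision.
CRITICAL = {"database", "redis"}


@dataclass
class CheckResult:
    healthy: bool
    detail: str = ""  # error message to surface when the check fails


def overall_status(
    checks: Dict[str, CheckResult], shutting_down: bool = False
) -> Tuple[int, str]:
    """Return (http_status, status_label) for a set of dependency checks."""
    if shutting_down:
        # Graceful shutdown: tell load balancers to stop sending traffic.
        return 503, "unhealthy"
    failed = [name for name, result in checks.items() if not result.healthy]
    if any(name in CRITICAL for name in failed):
        return 503, "unhealthy"
    if failed:
        # Non-critical failure: still 200, but flagged as degraded.
        return 200, "degraded"
    return 200, "healthy"
```

For example, a failing `filesystem` check alongside a healthy `database` would yield `(200, "degraded")`, while a failing `database` check would yield `(503, "unhealthy")`.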
Endpoint Specifications:
GET /health
- Simple liveness check
- Returns 200 if application is running
- No dependency checks
- Used by load balancers
GET /health/ready
- Readiness check
- Verifies database and cache connectivity
- Returns 200 if ready to serve traffic
- Returns 503 if not ready
GET /health/live
- Liveness check
- Application process running check
- Returns 200 if process alive
- Used by Kubernetes liveness probe
GET /health/detailed (Admin only)
- Comprehensive health status
- All dependency statuses
- Memory and CPU metrics
- Version and build info
- Uptime and request counts
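
As a rough illustration of the first three endpoints, here is a minimal sketch using only the Python standard library. The names `route`, `dependencies_ready`, `HealthHandler`, and `serve`, the port, and the `VERSION` value are all placeholders, not part of this issue. `/health/detailed` is left out because the admin authentication mechanism is not specified here.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable, Dict, Tuple

START_TIME = time.monotonic()
VERSION = "1.0.0"  # placeholder; a real build would inject this


def dependencies_ready() -> bool:
    """Placeholder: a real check would ping the database and cache."""
    return True


def route(
    path: str, ready: Callable[[], bool] = dependencies_ready
) -> Tuple[int, Dict[str, object]]:
    """Map a request path to (http_status, json_body)."""
    base = {"version": VERSION, "uptime": int(time.monotonic() - START_TIME)}
    if path in ("/health", "/health/live"):
        # Liveness: no dependency checks, only "the process can answer".
        return 200, {"status": "healthy", **base}
    if path == "/health/ready":
        ok = ready()
        status = "healthy" if ok else "unhealthy"
        return (200 if ok else 503), {"status": status, **base}
    return 404, {"error": "not found"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = route(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Blocking helper; the port is an arbitrary placeholder."""
    ThreadingHTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Keeping the routing logic in a plain function makes it easy to unit test without opening a socket; in a real application the same logic would more likely live in the web framework's own middleware layer.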
Response Format:
{
  status: "healthy" | "degraded" | "unhealthy",
  version: "1.0.0",
  uptime: 3600,  // seconds
  timestamp: "<ISO 8601 timestamp>",
  checks: {
    database: { status: "healthy", responseTime: 5 },
    redis: { status: "healthy", responseTime: 2 },
    memory: { status: "healthy", usage: "45%" }
  }
}
Integration:
- Docker HEALTHCHECK directive
- Kubernetes liveness/readiness probes
- Load balancer health checks
- Monitoring systems (Datadog, New Relic)
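
For the Docker integration, one possible approach is a small probe script that the `HEALTHCHECK` instruction runs inside the container. The sketch below assumes Python is available in the image; the `probe` and `main` names, the URL, and the port are hypothetical. Docker treats exit code 0 as healthy and 1 as unhealthy.

```python
import http.client
import sys
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 1.0) -> int:
    """Return 0 if the endpoint answers with a 2xx status, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers 4xx/5xx responses (HTTPError), refused connections,
        # timeouts, and malformed URLs.
        return 1


def main() -> int:
    # Assumption: the app listens on localhost:8080 inside the container.
    return probe("http://127.0.0.1:8080/health/ready")
```

To use it as a script, end the file with `sys.exit(main())` and reference it from the Dockerfile, for example `HEALTHCHECK --interval=30s --timeout=3s CMD python /app/healthcheck.py` (the path is a placeholder). Kubernetes would not need this script, since its `httpGet` probes can call `/health/live` and `/health/ready` directly.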
Performance:
- Cache health check results (30 seconds)
- Async dependency checks
- Timeout individual checks (5 seconds max)
- Fail fast on critical failures
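
The caching, async, and timeout points could be combined along these lines. This is a sketch under assumptions: `run_check`, `run_all`, and the module-level `_cache` are hypothetical names, and each dependency check is assumed to be an async callable returning `True` or `False`. Fail-fast behaviour is not shown.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Tuple

CACHE_TTL_SECONDS = 30.0     # from "Cache health check results (30 seconds)"
CHECK_TIMEOUT_SECONDS = 5.0  # from "Timeout individual checks (5 seconds max)"

Check = Callable[[], Awaitable[bool]]

# name -> (time the result was stored, result)
_cache: Dict[str, Tuple[float, bool]] = {}


async def run_check(
    name: str,
    check: Check,
    ttl: float = CACHE_TTL_SECONDS,
    timeout: float = CHECK_TIMEOUT_SECONDS,
) -> bool:
    """Run one check, reusing a cached result while it is still fresh."""
    cached = _cache.get(name)
    if cached is not None and time.monotonic() - cached[0] < ttl:
        return cached[1]
    try:
        healthy = await asyncio.wait_for(check(), timeout=timeout)
    except Exception:
        # A timeout or any error raised by the check counts as unhealthy.
        healthy = False
    _cache[name] = (time.monotonic(), healthy)
    return healthy


async def run_all(checks: Dict[str, Check]) -> Dict[str, bool]:
    """Run all checks concurrently and return name -> healthy."""
    names = list(checks)
    results = await asyncio.gather(*(run_check(n, checks[n]) for n in names))
    return dict(zip(names, results))
```

One design question this leaves open is whether failures should be cached for the full 30 seconds; a shorter TTL for unhealthy results would let an instance recover sooner, at the cost of more frequent checks against a struggling dependency.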