44 changes: 35 additions & 9 deletions README.md
@@ -32,28 +32,54 @@ npm start

## Scripts

| Script | Description |
|----------|--------------------------------|
| `npm run build` | Compile TypeScript to `dist/` |
| `npm run start` | Run production server |
| `npm run dev` | Run with ts-node-dev |
| `npm test` | Run Jest tests |
| `npm run lint` | Run ESLint |
| Script | Description |
|---|---|
| `npm run build` | Compile TypeScript to `dist/` |
| `npm run start` | Run production server |
| `npm run dev` | Run with ts-node-dev |
| `npm test` | Run Jest tests |
| `npm run lint` | Run ESLint |

## API and Observability Endpoints

- `GET /api/v1/contracts`: Sample contracts endpoint.
- `GET /health/live`: Liveness signal for process-level checks.
- `GET /health/ready`: Readiness signal with runtime and dependency health details.
- `GET /health`: Full service-level health report (same payload as readiness).
- `GET /metrics`: Prometheus metrics in text exposition format.

## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `PORT` | `3001` | HTTP port used by the service |
| `SERVICE_NAME` | `talenttrust-backend` | Service label used in health and metrics |
| `METRICS_ENABLED` | `true` | Enables/disables `/metrics` route |
| `METRICS_AUTH_TOKEN` | _unset_ | If set, `/metrics` requires `Authorization: Bearer <token>` |

## Security Notes

- Keep `METRICS_AUTH_TOKEN` set in production unless `/metrics` is only reachable on trusted internal networks.
- `/metrics` intentionally excludes request bodies and unbounded labels to reduce accidental data leakage and high-cardinality abuse.
- Health payloads include operational signals only (event loop lag, memory pressure, dependency states), not secrets.

See `docs/backend/observability.md` for endpoint samples, threat scenarios, and scrape recommendations.

## Contributing

1. Fork the repo and create a branch from `main`.
2. Install deps, run tests and build: `npm install && npm test && npm run build`.
3. Open a pull request. CI runs build (and tests when present) on push/PR to `main`.
3. Open a pull request. CI runs build and tests on push/PR to `main`.

## CI/CD

GitHub Actions runs on push and pull requests to `main`:

- Install dependencies
- Build the project (`npm run build`)
- Run tests (`npm test`)

Keep the build passing before merging.
Keep CI passing before merging.

## License

107 changes: 107 additions & 0 deletions docs/backend/observability.md
@@ -0,0 +1,107 @@
# Backend Observability

This document explains service-level health signaling and Prometheus metrics exposure in `Talenttrust-Backend`.

## Endpoints

### `GET /health/live`

Returns process liveness only.

```json
{
  "status": "up",
  "service": "talenttrust-backend"
}
```

### `GET /health` and `GET /health/ready`

Both endpoints return the same service-level health payload, with runtime and dependency signals.

```json
{
  "service": "talenttrust-backend",
  "status": "up",
  "timestamp": "2026-03-24T00:00:00.000Z",
  "uptimeSeconds": 102.34,
  "signals": {
    "eventLoopLagMs": 12,
    "heapUsedBytes": 23893648,
    "heapTotalBytes": 30523392,
    "heapUsedRatio": 0.78
  },
  "dependencies": []
}
```

Status behavior:

- `up`: all local and dependency checks are healthy.
- `degraded`: one or more checks are elevated, but the service is still handling requests.
- `down`: one or more checks are critical. HTTP status is `503`.
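
The status rules above amount to "the worst individual check wins". The following is a minimal sketch of that aggregation, not the service's actual implementation: the type and function names (`CheckResult`, `aggregateStatus`, `httpStatusFor`) are hypothetical, and returning `200` for `degraded` is an assumption, since only the `503` for `down` is documented here.

```typescript
// Hypothetical types: the names below are illustrative, not taken from the codebase.
type HealthStatus = "up" | "degraded" | "down";

interface CheckResult {
  name: string;
  status: HealthStatus;
}

// Severity order: the worst individual status determines the overall status.
const SEVERITY: Record<HealthStatus, number> = { up: 0, degraded: 1, down: 2 };

function aggregateStatus(checks: CheckResult[]): HealthStatus {
  let worst: HealthStatus = "up";
  for (const check of checks) {
    if (SEVERITY[check.status] > SEVERITY[worst]) {
      worst = check.status;
    }
  }
  return worst;
}

// `down` maps to 503 as documented; 200 for `up` and `degraded` is an assumption.
function httpStatusFor(status: HealthStatus): number {
  return status === "down" ? 503 : 200;
}
```

With no checks registered (as in the sample payload, where `dependencies` is empty), this sketch reports `up`.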

### `GET /metrics`

Exposes metrics in Prometheus text format.

If `METRICS_AUTH_TOKEN` is set, requests must include:

```text
Authorization: Bearer <token>
```

If the header is missing or the token is invalid, the route returns `401`.
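
As a rough illustration of the bearer check, here is a sketch of a pure helper that a route handler could call before serving metrics. The function name and signature are hypothetical. The character-by-character comparison is only a best-effort attempt to avoid exiting at the first mismatch; in a Node service, `crypto.timingSafeEqual` is likely the better choice.

```typescript
// Hypothetical helper; the name and signature are illustrative, not from the codebase.
function isMetricsRequestAuthorized(
  authorizationHeader: string | undefined,
  expectedToken: string | undefined
): boolean {
  // No token configured: the route is not token-gated.
  if (!expectedToken) return true;
  if (!authorizationHeader) return false;

  const prefix = "Bearer ";
  if (authorizationHeader.slice(0, prefix.length) !== prefix) return false;
  const presented = authorizationHeader.slice(prefix.length);

  // Compare every character so the loop does not stop at the first mismatch.
  if (presented.length !== expectedToken.length) return false;
  let diff = 0;
  for (let i = 0; i < presented.length; i++) {
    diff |= presented.charCodeAt(i) ^ expectedToken.charCodeAt(i);
  }
  return diff === 0;
}
```

A handler would respond with `401` whenever this returns `false`.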

## Configuration

| Variable | Default | Notes |
|---|---|---|
| `PORT` | `3001` | API listener port |
| `SERVICE_NAME` | `talenttrust-backend` | Name used in health payload and metrics labels |
| `METRICS_ENABLED` | `true` | Set to `false` to return `404` on `/metrics` |
| `METRICS_AUTH_TOKEN` | _unset_ | Enables bearer-token protection for `/metrics` |
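
One way to read these variables with the documented defaults is sketched below. The `ObservabilityConfig` shape and the `loadObservabilityConfig` name are hypothetical, and two behaviors are assumptions rather than documented facts: only the literal string `false` disables metrics, and an empty `METRICS_AUTH_TOKEN` is treated as unset.

```typescript
// Hypothetical config shape; field names are illustrative.
interface ObservabilityConfig {
  port: number;
  serviceName: string;
  metricsEnabled: boolean;
  metricsAuthToken?: string;
}

type Env = Record<string, string | undefined>;

function loadObservabilityConfig(env: Env): ObservabilityConfig {
  const rawPort = env["PORT"] ?? "3001";
  const port = Number(rawPort);
  // Reject non-numeric, fractional, and out-of-range ports early.
  if (isNaN(port) || Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return {
    port,
    serviceName: env["SERVICE_NAME"] ?? "talenttrust-backend",
    // Assumption: only the literal string "false" disables the /metrics route.
    metricsEnabled: env["METRICS_ENABLED"] !== "false",
    // Assumption: an empty string is treated the same as unset.
    metricsAuthToken: env["METRICS_AUTH_TOKEN"] || undefined,
  };
}
```

In a Node process this would typically be called once at startup as `loadObservabilityConfig(process.env)`.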

## Exported Prometheus Metrics

- `http_requests_total{method,route,status_code}`
- `http_request_duration_seconds{method,route,status_code}`
- `service_health_status{service}` (`up=2`, `degraded=1`, `down=0`)
- Node/process default metrics from `prom-client` (prefixed by `<service>_`)
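
To make the `service_health_status` encoding concrete, the sketch below renders that one metric by hand in Prometheus text exposition format. In the service itself `prom-client` produces this output, so the helper names here are hypothetical and exist only to show what a scraper would see. Treating the metric as a gauge is an assumption.

```typescript
type HealthStatus = "up" | "degraded" | "down";

// Encoding documented above: up=2, degraded=1, down=0.
const HEALTH_GAUGE_VALUE: Record<HealthStatus, number> = { up: 2, degraded: 1, down: 0 };

// Label values must escape backslash, double quote, and newline in the text format.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Hypothetical renderer for a single sample of the health metric.
function renderHealthGauge(service: string, status: HealthStatus): string {
  const lines = [
    "# TYPE service_health_status gauge",
    `service_health_status{service="${escapeLabelValue(service)}"} ${HEALTH_GAUGE_VALUE[status]}`,
  ];
  return lines.join("\n") + "\n";
}
```

Because higher values mean healthier, an alert rule can fire on `service_health_status < 2` to catch both `degraded` and `down`.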

## Security and Threat Notes

### Threat: unauthorized scraping of operational details

Mitigation:

- Token-gate `/metrics` with `METRICS_AUTH_TOKEN`.
- Restrict the route at the network boundary (ingress, WAF, service mesh) so that only trusted scrapers can reach it.

### Threat: high cardinality metrics causing memory growth

Mitigation:

- Route labels use bounded path values from Express route templates.
- No request payload, IDs, or user-provided fields are added as labels.
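
The bounded-label idea can be sketched as follows. This is an illustration under assumptions, not the project's middleware: `routeLabel` and `recordRequest` are hypothetical names, `matchedTemplate` stands for the route template of the matched handler (in Express this is typically available as `req.route?.path`), and the `/api/v1/contracts/:id` route is made up for the example.

```typescript
// Hypothetical helper: unmatched requests collapse into a single fixed label,
// so probing random URLs cannot create new time series.
function routeLabel(matchedTemplate: string | undefined): string {
  return matchedTemplate ?? "unmatched";
}

// In-memory stand-in for a counter keyed by (method, route, status_code).
const counts: Record<string, number> = {};

function recordRequest(
  method: string,
  matchedTemplate: string | undefined,
  statusCode: number
): void {
  const key = `${method.toUpperCase()} ${routeLabel(matchedTemplate)} ${statusCode}`;
  counts[key] = (counts[key] ?? 0) + 1;
}

// Requests for /api/v1/contracts/1, /2, and /3 all share one template, hence one series.
recordRequest("get", "/api/v1/contracts/:id", 200);
recordRequest("get", "/api/v1/contracts/:id", 200);
recordRequest("get", "/api/v1/contracts/:id", 200);
// Two different unknown URLs still map to the same "unmatched" series.
recordRequest("get", undefined, 404);
recordRequest("get", undefined, 404);
```

The number of distinct keys is bounded by the number of registered routes, methods, and status codes, regardless of how many distinct URLs clients request.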

### Threat: health endpoint leaking secrets

Mitigation:

- Health responses contain only runtime capacity indicators and dependency status.
- Avoid including credentials or raw stack traces in dependency details.

## Prometheus scrape example

```yaml
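scrape_configs:
  - job_name: talenttrust-backend
    metrics_path: /metrics
    static_configs:
      - targets: ['talenttrust-backend:3001']
    authorization:
      type: Bearer
      # Prometheus does not expand environment variables in its config file.
      # Substitute the token at deploy time, or use `credentials_file` instead.
      credentials: ${METRICS_AUTH_TOKEN}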
```
