2 changes: 1 addition & 1 deletion AGENTS.md
@@ -231,7 +231,7 @@ For distributed development:
3. Build Store: `mvn clean package -pl hugegraph-store -am -DskipTests`
4. Build Server with HStore backend: `mvn clean package -pl hugegraph-server -am -DskipTests`

See Docker Compose example: `hugegraph-server/hugegraph-dist/docker/example/`
See Docker Compose examples: `docker/` directory. Single-node quickstart (pre-built images): `docker/docker-compose.yml`. Single-node dev build (from source): `docker/docker-compose.dev.yml`. 3-node cluster: `docker/docker-compose-3pd-3store-3server.yml`. See `docker/README.md` for full setup guide.

### Debugging Tips

15 changes: 10 additions & 5 deletions README.md
@@ -173,7 +173,7 @@ flowchart TB
### 5 Minutes Quick Start

```bash
# Start HugeGraph with Docker
# Start HugeGraph (standalone mode)
docker run -itd --name=hugegraph -p 8080:8080 hugegraph/hugegraph:1.7.0

# Verify server is running
@@ -208,13 +208,18 @@ docker run -itd --name=hugegraph -e PASSWORD=your_password -p 8080:8080 hugegrap
```

For advanced Docker configurations, see:
- [Docker Documentation](https://hugegraph.apache.org/docs/quickstart/hugegraph-server/#3-deploy)
- [Docker Compose Example](./hugegraph-server/hugegraph-dist/docker/example)
- [Docker README](hugegraph-server/hugegraph-dist/docker/README.md)

* [Docker Documentation](https://hugegraph.apache.org/docs/quickstart/hugegraph-server/#3-deploy)
* [Docker Compose Examples](./docker/)
* [Docker README](./docker/README.md)
* [Server Docker README](hugegraph-server/hugegraph-dist/docker/README.md)

> **Docker Desktop (Mac/Windows)**: The 3-node distributed cluster (`docker/docker-compose-3pd-3store-3server.yml`) uses Docker bridge networking and works on all platforms including Docker Desktop. Allocate at least 12 GB memory to Docker Desktop.

> **Note**: Docker images are convenience releases, not **official ASF distribution artifacts**. See [ASF Release Distribution Policy](https://infra.apache.org/release-distribution.html#dockerhub) for details.
>
> **Version Tags**: Use release tags (`1.7.0`, `1.x.0`) for stable versions. Use `latest` for development features.
> **Version Tags**: Use release tags (e.g., `1.7.0`) for stable deployments. The `latest` tag should only be used for testing or development.


<details>
<summary><b>Option 2: Download Binary Package</b></summary>
265 changes: 265 additions & 0 deletions docker/README.md
@@ -0,0 +1,265 @@
# HugeGraph Docker Deployment

This directory contains Docker Compose files for running HugeGraph:

| File | Description |
|------|-------------|
| `docker-compose.yml` | Single-node cluster using pre-built images from Docker Hub |
| `docker-compose.dev.yml` | Single-node cluster built from source (for developers) |
| `docker-compose-3pd-3store-3server.yml` | 3-node distributed cluster (PD + Store + Server) |

## Prerequisites

- **Docker Engine** 20.10+ (or Docker Desktop 4.x+)
- **Docker Compose** v2 (included in Docker Desktop)
- **Memory**: Allocate at least **12 GB** to Docker Desktop (Settings → Resources → Memory). The 3-node cluster runs 9 JVM processes (3 PD + 3 Store + 3 Server) which are memory-intensive. Insufficient memory causes OOM kills that appear as silent Raft failures.

> [!IMPORTANT]
> The 12 GB minimum is for Docker Desktop (Mac/Windows). On Linux with native Docker, ensure the host has at least 12 GB of free memory.

## Why Bridge Networking (Not Host Mode)

Previous versions used `network_mode: host`, which only works on Linux and is incompatible with Docker Desktop on Mac/Windows. The cluster now uses a proper Docker bridge network (`hg-net`) where services communicate via container hostnames (`pd0`, `pd1`, `store0`, etc.) instead of `127.0.0.1`. This makes the cluster portable across all platforms.

---

## Single-Node Setup

Two compose files are available for running a single-node cluster (1 PD + 1 Store + 1 Server):

### Option A: Quick Start (pre-built images)

Uses pre-built images from Docker Hub. Best for **end users** who want to run HugeGraph quickly.

```bash
cd docker
docker compose up -d
```

Review comment (Member):

This quick-start command still resolves to `latest` unless the user exports `HUGEGRAPH_VERSION`, but the bullets above claim the compose file is using 1.7.0 images. That mismatch makes the example non-reproducible as written. If the intent is to run the release tag, please set the version in the command.

Suggested change: replace `docker compose up -d` with `HUGEGRAPH_VERSION=1.7.0 docker compose up -d`.

Comment on lines +34 to +38

Copilot AI (Mar 21, 2026):

The quickstart instructions don’t mention that the repo’s compose files currently mount `docker-entrypoint.sh` scripts from the source tree as a temporary workaround (see the TODO in `docker-compose.yml`). Without calling this out, users may assume the Docker Hub images alone are sufficient to support the `HG_*` env var injection. Please add an explicit IMPORTANT note describing the temporary entrypoint-mount requirement and when it can be removed once images are updated.

Reply from @bitflicker64 (Contributor Author, Mar 21, 2026):

These comments are no longer applicable: the Docker Hub images were updated on March 21, 2026 with the new entrypoints baked in, and the temporary volume-mount workaround was removed in follow-up cleanup PR #2976. The `HG_*` env var injection works with the current published images without any additional steps.

- Images: `hugegraph/pd:1.7.0`, `hugegraph/store:1.7.0`, `hugegraph/server:1.7.0`
- `pull_policy: always` — always pulls the specified image tag

> **Note**: Use release tags (e.g., `1.7.0`) for stable deployments. The `latest` tag is intended for testing or development only.
- PD healthcheck endpoint: `/v1/health`
- Single PD, single Store (`HG_PD_INITIAL_STORE_LIST: store:8500`), single Server
- Server healthcheck endpoint: `/versions`
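
A minimal sketch of pinning the image tag at launch. It assumes, as the review comment in this thread states, that `docker-compose.yml` takes the tag from `HUGEGRAPH_VERSION` and resolves to `latest` when that variable is unset. The helper name `resolve_hg_version` is hypothetical and not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the image tag to use.
# Falls back to the 1.7.0 release tag when HUGEGRAPH_VERSION is unset or empty.
resolve_hg_version() {
  printf '%s\n' "${HUGEGRAPH_VERSION:-1.7.0}"
}

# Usage (run from the docker/ directory):
#   HUGEGRAPH_VERSION="$(resolve_hg_version)" docker compose up -d
```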

Comment on lines +30 to +46

Copilot AI (Mar 22, 2026):

In “Option A: Quick Start (pre-built images)”, the docs imply the compose file works purely from Docker Hub images, but `docker/docker-compose.yml` currently relies on volume-mounting the new `docker-entrypoint.sh` scripts from the repository (see the `../hugegraph-*/.../docker-entrypoint.sh` mounts). Please document this temporary workaround and that users must run compose from a checked-out repo until the images ship the updated entrypoints (or update the compose/docs to not require host mounts).

### Option B: Development Build (build from source)

Builds images locally from source Dockerfiles. Best for **developers** who want to test local changes.

```bash
cd docker
docker compose -f docker-compose.dev.yml up -d
```

- Images: built from source via `build: context: ..` with Dockerfiles
- No `pull_policy` — builds locally, doesn't pull
- Entrypoint scripts are baked into the built image (no volume mounts)
- PD healthcheck endpoint: `/v1/health`
- Otherwise identical env vars and structure to the quickstart file

### Key Differences

| | `docker-compose.yml` (quickstart) | `docker-compose.dev.yml` (dev build) |
|---|---|---|
| **Images** | Pull from Docker Hub | Build from source |
| **Who it's for** | End users | Developers |
| **pull_policy** | `always` | not set (build) |

**Verify** (both options):
```bash
curl http://localhost:8080/versions
```

---

## 3-Node Cluster Quickstart

```bash
cd docker
docker compose -f docker-compose-3pd-3store-3server.yml up -d

# To stop and remove all data volumes (clean restart)
docker compose -f docker-compose-3pd-3store-3server.yml down -v
```

**Startup ordering** is enforced via `depends_on` with `condition: service_healthy`:

1. **PD nodes** start first and must pass healthchecks (`/v1/health`)
2. **Store nodes** start after all PD nodes are healthy
3. **Server nodes** start after all Store nodes are healthy

This ensures PD and Store are healthy before the server starts. The server entrypoint still performs a best-effort partition wait after launch, so partition assignment may take a little longer.
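
Because partition assignment can finish after the containers report healthy, a host-side script may want to poll before issuing graph requests. The following is a sketch only: `wait_until` is a hypothetical helper, and the attempt count and delay in the usage comment are illustrative values, not settings taken from the compose files.

```bash
#!/usr/bin/env bash
# Retry a probe command until it succeeds or the attempt budget runs out.
# Arguments: max attempts, delay in seconds, then the probe command and its args.
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage: wait up to about 3 minutes for server0 on its published port.
#   wait_until 36 5 curl -fsS http://localhost:8080/versions
```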

**Verify the cluster is healthy**:

```bash
# Check PD health
curl http://localhost:8620/v1/health

# Check Store health
curl http://localhost:8520/v1/health

# Check Server (Graph API)
curl http://localhost:8080/versions

# List registered stores via PD
curl http://localhost:8620/v1/stores

# List partitions
curl http://localhost:8620/v1/partitions
```
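
The checks above only touch the first node of each tier. As a sketch, the helper below lists one health URL per node, assuming the host ports published by `docker-compose-3pd-3store-3server.yml` as given in the Port Reference table of this README. The function name `cluster_health_urls` is hypothetical.

```bash
#!/usr/bin/env bash
# Print one health URL per node (9 in total), using the published host ports.
cluster_health_urls() {
  local i
  for i in 0 1 2; do
    echo "http://localhost:$((8620 + i))/v1/health"  # pd$i REST API
    echo "http://localhost:$((8520 + i))/v1/health"  # store$i REST API
    echo "http://localhost:$((8080 + i))/versions"   # server$i Graph API
  done
}

# Usage: probe every node and report the ones that do not answer.
#   cluster_health_urls | while read -r url; do
#     curl -fsS -o /dev/null "$url" || echo "unhealthy: $url"
#   done
```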

---

## Environment Variable Reference

Configuration is injected via environment variables. The old `docker/configs/application-pd*.yml` and `docker/configs/application-store*.yml` files are no longer used.

### PD Environment Variables

| Variable | Required | Default | Maps To (`application.yml`) | Description |
|----------|----------|---------|-----------------------------|-------------|
| `HG_PD_GRPC_HOST` | Yes | — | `grpc.host` | This node's hostname/IP for gRPC |
| `HG_PD_RAFT_ADDRESS` | Yes | — | `raft.address` | This node's Raft address (e.g. `pd0:8610`) |
| `HG_PD_RAFT_PEERS_LIST` | Yes | — | `raft.peers-list` | All PD peers (e.g. `pd0:8610,pd1:8610,pd2:8610`) |
| `HG_PD_INITIAL_STORE_LIST` | Yes | — | `pd.initial-store-list` | Expected stores (e.g. `store0:8500,store1:8500,store2:8500`) |
| `HG_PD_GRPC_PORT` | No | `8686` | `grpc.port` | gRPC server port |
| `HG_PD_REST_PORT` | No | `8620` | `server.port` | REST API port |
| `HG_PD_DATA_PATH` | No | `/hugegraph-pd/pd_data` | `pd.data-path` | Metadata storage path |
| `HG_PD_INITIAL_STORE_COUNT` | No | `1` | `pd.initial-store-count` | Min stores for cluster availability |

**Deprecated aliases** (still work but log a warning):

| Deprecated | Use Instead |
|------------|-------------|
| `GRPC_HOST` | `HG_PD_GRPC_HOST` |
| `RAFT_ADDRESS` | `HG_PD_RAFT_ADDRESS` |
| `RAFT_PEERS` | `HG_PD_RAFT_PEERS_LIST` |
| `PD_INITIAL_STORE_LIST` | `HG_PD_INITIAL_STORE_LIST` |
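
One way such an alias fallback could work is sketched below. This is an illustration of the documented behavior (old name still accepted, warning logged), not the actual `docker-entrypoint.sh` code. The function name `migrate_env` and the warning text are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: if NEW is unset and the deprecated OLD is set,
# copy OLD into NEW and warn on stderr. An explicitly set NEW always wins.
migrate_env() {
  local old="$1" new="$2"
  if [ -z "${!new:-}" ] && [ -n "${!old:-}" ]; then
    echo "WARN: $old is deprecated, use $new instead" >&2
    export "$new=${!old}"
  fi
}

# Usage, with the PD aliases from the table above:
#   migrate_env GRPC_HOST             HG_PD_GRPC_HOST
#   migrate_env RAFT_ADDRESS          HG_PD_RAFT_ADDRESS
#   migrate_env RAFT_PEERS            HG_PD_RAFT_PEERS_LIST
#   migrate_env PD_INITIAL_STORE_LIST HG_PD_INITIAL_STORE_LIST
```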

### Store Environment Variables

| Variable | Required | Default | Maps To (`application.yml`) | Description |
|----------|----------|---------|-----------------------------|-------------|
| `HG_STORE_PD_ADDRESS` | Yes | — | `pdserver.address` | PD gRPC addresses (e.g. `pd0:8686,pd1:8686,pd2:8686`) |
| `HG_STORE_GRPC_HOST` | Yes | — | `grpc.host` | This node's hostname (e.g. `store0`) |
| `HG_STORE_RAFT_ADDRESS` | Yes | — | `raft.address` | This node's Raft address (e.g. `store0:8510`) |
| `HG_STORE_GRPC_PORT` | No | `8500` | `grpc.port` | gRPC server port |
| `HG_STORE_REST_PORT` | No | `8520` | `server.port` | REST API port |
| `HG_STORE_DATA_PATH` | No | `/hugegraph-store/storage` | `app.data-path` | Data storage path |

**Deprecated aliases** (still work but log a warning):

| Deprecated | Use Instead |
|------------|-------------|
| `PD_ADDRESS` | `HG_STORE_PD_ADDRESS` |
| `GRPC_HOST` | `HG_STORE_GRPC_HOST` |
| `RAFT_ADDRESS` | `HG_STORE_RAFT_ADDRESS` |

### Server Environment Variables

| Variable | Required | Default | Maps To | Description |
|----------|----------|---------|-----------------------------|-------------|
| `HG_SERVER_BACKEND` | Yes | — | `backend` in `hugegraph.properties` | Storage backend (e.g. `hstore`) |
| `HG_SERVER_PD_PEERS` | Yes | — | `pd.peers` | PD cluster addresses (e.g. `pd0:8686,pd1:8686,pd2:8686`) |
| `STORE_REST` | No | — | Used by `wait-partition.sh` | Store REST endpoint for partition verification (e.g. `store0:8520`) |
| `PASSWORD` | No | — | Enables auth mode | Optional authentication password |

**Deprecated aliases** (still work but log a warning):

| Deprecated | Use Instead |
|------------|-------------|
| `BACKEND` | `HG_SERVER_BACKEND` |
| `PD_PEERS` | `HG_SERVER_PD_PEERS` |

---

## Port Reference

The table below reflects the published host ports in `docker-compose-3pd-3store-3server.yml`.
The single-node compose file (`docker-compose.yml`) only publishes the REST/API ports (`8620`, `8520`, `8080`) by default.

| Service | Container Port | Host Port | Protocol | Purpose |
|---------|---------------|-----------|----------|---------|
| pd0 | 8620 | 8620 | HTTP | REST API |
| pd0 | 8686 | 8686 | gRPC | PD gRPC |
| pd0 | 8610 | — | TCP | Raft (internal only) |
| pd1 | 8620 | 8621 | HTTP | REST API |
| pd1 | 8686 | 8687 | gRPC | PD gRPC |
| pd2 | 8620 | 8622 | HTTP | REST API |
| pd2 | 8686 | 8688 | gRPC | PD gRPC |
| store0 | 8500 | 8500 | gRPC | Store gRPC |
| store0 | 8510 | 8510 | TCP | Raft |
| store0 | 8520 | 8520 | HTTP | REST API |
| store1 | 8500 | 8501 | gRPC | Store gRPC |
| store1 | 8510 | 8511 | TCP | Raft |
| store1 | 8520 | 8521 | HTTP | REST API |
| store2 | 8500 | 8502 | gRPC | Store gRPC |
| store2 | 8510 | 8512 | TCP | Raft |
| store2 | 8520 | 8522 | HTTP | REST API |
| server0 | 8080 | 8080 | HTTP | Graph API |
| server1 | 8080 | 8081 | HTTP | Graph API |
| server2 | 8080 | 8082 | HTTP | Graph API |
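
Every published port in the table follows the same pattern: host port = container port + node index. The sketch below encodes that pattern. `host_port` is a hypothetical helper, and it does not apply to the PD Raft port (8610), which the table lists as internal only.

```bash
#!/usr/bin/env bash
# Host port for a node = container port + node index (pattern read off the table above).
host_port() {
  local container_port="$1" index="$2"
  echo $((container_port + index))
}

# Usage: REST health endpoint of store2.
#   curl "http://localhost:$(host_port 8520 2)/v1/health"
```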

---

## Healthcheck Endpoints

| Service | Endpoint | Expected |
|---------|----------|----------|
| PD | `GET /v1/health` | `200 OK` |
| Store | `GET /v1/health` | `200 OK` |
| Server | `GET /versions` | `200 OK` with version JSON |
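
A sketch of checking for the `200 OK` status from a script rather than by eye. `is_healthy` is a hypothetical helper, and the 5-second timeout is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Return 0 only when the endpoint answers with HTTP 200.
is_healthy() {
  local url="$1" code
  # -w '%{http_code}' prints the status code; the body is discarded.
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")" || return 1
  [ "$code" = "200" ]
}

# Usage:
#   is_healthy http://localhost:8620/v1/health && echo "PD is up"
```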

---

## Troubleshooting

### Containers Exiting or Restarting (OOM Kills)

**Symptom**: Containers exit with code 137, or restart loops. Raft logs show election timeouts.

**Cause**: Docker Desktop does not have enough memory. The 9 JVM processes require at least 12 GB.

**Fix**: Docker Desktop → Settings → Resources → Memory → set to **12 GB** or higher. Restart Docker Desktop.

```bash
# Check if containers were OOM killed
docker inspect hg-pd0 | grep -i oom
docker stats --no-stream
```
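
To check several containers at once, the `OOMKilled` flag can be read directly from the container state. This is a sketch: `oom_killed` is a hypothetical helper, and only the container names `hg-pd0` and `hg-server0` appear in this README. Any other names are assumed to follow the same pattern.

```bash
#!/usr/bin/env bash
# Print the names of the given containers that Docker marks as OOM-killed.
oom_killed() {
  local name
  for name in "$@"; do
    if [ "$(docker inspect --format '{{.State.OOMKilled}}' "$name" 2>/dev/null)" = "true" ]; then
      echo "$name"
    fi
  done
}

# Usage:
#   oom_killed hg-pd0 hg-server0
```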

### Raft Leader Election Failure

**Symptom**: PD logs show repeated `Leader election timeout`. Store nodes cannot register.

**Cause**: PD nodes cannot reach each other on the Raft port (8610), or `HG_PD_RAFT_PEERS_LIST` is misconfigured.

**Fix**:
1. Verify all PD containers are running: `docker compose -f docker-compose-3pd-3store-3server.yml ps`
2. Check PD logs: `docker logs hg-pd0`
3. Verify network connectivity: `docker exec hg-pd0 ping -c 3 pd1`
4. Ensure `HG_PD_RAFT_PEERS_LIST` is identical on all PD nodes
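
Step 4 can be scripted. The sketch below compares the variable across containers. It assumes `printenv` is available inside the PD image and that the other PD containers are named `hg-pd1` and `hg-pd2` (only `hg-pd0` is named in this README). `peers_list_consistent` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Succeed only if every listed container reports the same HG_PD_RAFT_PEERS_LIST.
# Returns 1 on a mismatch and 2 if a container cannot be queried.
peers_list_consistent() {
  local first="" value name
  for name in "$@"; do
    value="$(docker exec "$name" printenv HG_PD_RAFT_PEERS_LIST)" || return 2
    if [ -z "$first" ]; then
      first="$value"
    elif [ "$value" != "$first" ]; then
      echo "mismatch on $name: $value (expected $first)" >&2
      return 1
    fi
  done
  return 0
}

# Usage:
#   peers_list_consistent hg-pd0 hg-pd1 hg-pd2 && echo "peers list is consistent"
```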

### Partition Assignment Not Completing

**Symptom**: Server starts but graph operations fail. Store logs show `partition not found`.

**Cause**: PD has not finished assigning partitions to stores, or stores did not register successfully.

**Fix**:
1. Check registered stores: `curl http://localhost:8620/v1/stores`
2. Check partition status: `curl http://localhost:8620/v1/partitions`
3. Wait for partition assignment (can take 1–3 minutes after all stores register)
4. Check server logs for the `wait-partition.sh` script output: `docker logs hg-server0`

### Connection Refused Errors

**Symptom**: Stores cannot connect to PD, or Server cannot connect to Store.

**Cause**: Services are using `127.0.0.1` instead of container hostnames, or the `hg-net` bridge network is misconfigured.

**Fix**: Ensure all `HG_*` env vars use container hostnames (`pd0`, `store0`, etc.), not `127.0.0.1` or `localhost`.
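
A quick way to spot the misconfiguration is to filter a container's environment for loopback values. This is a sketch: `find_loopback_vars` is a hypothetical helper that reads `NAME=value` lines on stdin, and the container name `hg-store0` in the usage comment is assumed from the naming pattern used elsewhere in this README.

```bash
#!/usr/bin/env bash
# Read NAME=value lines on stdin and print the HG_* entries that point at loopback.
find_loopback_vars() {
  grep -E '^HG_[A-Z0-9_]+=.*(127\.0\.0\.1|localhost)' || true
}

# Usage: no output means no HG_* variable uses 127.0.0.1 or localhost.
#   docker exec hg-store0 env | find_loopback_vars
```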

4 changes: 2 additions & 2 deletions hugegraph-pd/AGENTS.md
@@ -247,7 +247,7 @@ store:
### Common Configuration Errors

1. **Raft peer discovery failure**: `raft.peers-list` must include all PD nodes' `raft.address` values
2. **Store connection issues**: `grpc.host` must be a reachable IP (not `127.0.0.1`) for distributed deployments
2. **Store connection issues**: `grpc.host` must be a reachable IP (not `127.0.0.1`) for distributed deployments. In Docker bridge networking, use the container hostname (e.g., `pd0`) set via `HG_PD_GRPC_HOST` env var.
3. **Split-brain scenarios**: Always run 3 or 5 PD nodes in production for Raft quorum
4. **Partition imbalance**: Adjust `patrol-interval` for faster/slower rebalancing

@@ -331,7 +331,7 @@ docker run -d -p 8620:8620 -p 8686:8686 -p 8610:8610 \
hugegraph-pd:latest

# For production clusters, use Docker Compose or Kubernetes
# See: hugegraph-server/hugegraph-dist/docker/example/
# See: ../docker/docker-compose-3pd-3store-3server.yml and ../docker/README.md
```

Exposed ports: 8620 (REST), 8686 (gRPC), 8610 (Raft)
41 changes: 37 additions & 4 deletions hugegraph-pd/README.md
@@ -154,6 +154,36 @@ raft:

For detailed configuration options and production tuning, see [Configuration Guide](docs/configuration.md).

#### Docker Bridge Network Example

When running PD in Docker with bridge networking (e.g., `docker/docker-compose-3pd-3store-3server.yml`), configuration is injected via environment variables instead of editing `application.yml` directly. Container hostnames are used instead of IP addresses:

**pd0** container:
```bash
HG_PD_GRPC_HOST=pd0
HG_PD_RAFT_ADDRESS=pd0:8610
HG_PD_RAFT_PEERS_LIST=pd0:8610,pd1:8610,pd2:8610
HG_PD_INITIAL_STORE_LIST=store0:8500,store1:8500,store2:8500
```

**pd1** container:
```bash
HG_PD_GRPC_HOST=pd1
HG_PD_RAFT_ADDRESS=pd1:8610
HG_PD_RAFT_PEERS_LIST=pd0:8610,pd1:8610,pd2:8610
HG_PD_INITIAL_STORE_LIST=store0:8500,store1:8500,store2:8500
```

**pd2** container:
```bash
HG_PD_GRPC_HOST=pd2
HG_PD_RAFT_ADDRESS=pd2:8610
HG_PD_RAFT_PEERS_LIST=pd0:8610,pd1:8610,pd2:8610
HG_PD_INITIAL_STORE_LIST=store0:8500,store1:8500,store2:8500
```

See [docker/README.md](../docker/README.md) for the full environment variable reference.
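
The three blocks above differ only in the node index, so they can be generated. The sketch below assumes the `pdN` / `storeN` hostname scheme and the ports shown above (8610 for PD Raft, 8500 for Store gRPC), and that the cluster has as many Store nodes as PD nodes. `pd_env` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Print the four required PD variables for node <index> in a cluster of <count> nodes.
pd_env() {
  local index="$1" count="$2" peers="" stores="" i
  for ((i = 0; i < count; i++)); do
    peers+="${peers:+,}pd${i}:8610"        # comma-separated Raft peers
    stores+="${stores:+,}store${i}:8500"   # comma-separated Store gRPC addresses
  done
  echo "HG_PD_GRPC_HOST=pd${index}"
  echo "HG_PD_RAFT_ADDRESS=pd${index}:8610"
  echo "HG_PD_RAFT_PEERS_LIST=${peers}"
  echo "HG_PD_INITIAL_STORE_LIST=${stores}"
}

# Usage: print the environment for pd1 in a 3-node cluster.
#   pd_env 1 3
```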

### Verify Deployment

Check if PD is running:
@@ -203,22 +233,25 @@ Build PD Docker image:

```bash
# From project root
docker build -f hugegraph-pd/Dockerfile -t hugegraph-pd:latest .
docker build -f hugegraph-pd/Dockerfile -t hugegraph-pd:1.7.0 .

# Run container
docker run -d \
-p 8620:8620 \
-p 8686:8686 \
-p 8610:8610 \
-v /path/to/conf:/hugegraph-pd/conf \
-e HG_PD_GRPC_HOST=<your-ip> \
-e HG_PD_RAFT_ADDRESS=<your-ip>:8610 \
-e HG_PD_RAFT_PEERS_LIST=<your-ip>:8610 \
-e HG_PD_INITIAL_STORE_LIST=<store-ip>:8500 \
-v /path/to/data:/hugegraph-pd/pd_data \
--name hugegraph-pd \
hugegraph-pd:latest
hugegraph/pd:1.7.0
```

Review comment (Member):

The build and run examples are using two different image names (`hugegraph-pd:1.7.0` vs `hugegraph/pd:1.7.0`). As written, a local `docker build` cannot be run by the next command without an extra `docker tag` step. Please make the tag consistent with the runtime image name used by the Docker Compose files.

Suggested change, as recorded in the review: `hugegraph/pd:1.7.0` becomes `docker build -f hugegraph-pd/Dockerfile -t hugegraph/pd:1.7.0 .`

For Docker Compose examples with HugeGraph Store and Server, see:
```
hugegraph-server/hugegraph-dist/docker/example/
docker/docker-compose-3pd-3store-3server.yml
```

## Documentation