Conversation

@misiugodfrey (Contributor) commented Nov 24, 2025:

Adds support for running multiple workers in Docker containers on a single machine, each pinned to a separate GPU. You can now run multiple workers (up to 4), controlled through the NUM_WORKERS env variable.

I'm not aware of a way to clean up the duplication in the docker services (AFAIK you can't programmatically change the GPU parameters), so this setup declares 4 GPU services and then picks them based on the number of workers you want to run.
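Under that constraint, the duplicated services presumably look something like the sketch below. The `presto-native-worker-gpu-N` service names match the pattern in the diff; the image name and the exact environment keys are illustrative, not the PR's actual definitions:

```yaml
# Hypothetical sketch: one near-identical service per GPU, selected at
# `docker compose up` time based on NUM_WORKERS.
services:
  presto-native-worker-gpu-0:
    image: presto-native-worker:latest   # illustrative image name
    environment:
      - NVIDIA_VISIBLE_DEVICES=0         # pin this worker to GPU 0
  presto-native-worker-gpu-1:
    image: presto-native-worker:latest
    environment:
      - NVIDIA_VISIBLE_DEVICES=1         # pin this worker to GPU 1
  # ...gpu-2 and gpu-3 repeat the same pattern with their device index
```

Since the device index is the only thing that differs, the duplication is mechanical but, as noted, hard to eliminate in plain compose YAML.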

coordinator=false
# Worker REST/HTTP port for internal and admin endpoints.
-http-server.http.port=8080
+http-server.http.port=8081
misiugodfrey (author):

This port should be distinct from the port used to connect to the presto-coordinator if they are on the same machine.
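For illustration, a sketch of the two configs when coordinator and worker share a host (port values taken from the diff above; file paths are illustrative):

```properties
# etc_coordinator/config.properties (illustrative path)
http-server.http.port=8080

# etc_worker/config.properties -- must differ when co-located
http-server.http.port=8081
```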

Reviewer (Contributor):

Can you please expand on why this is required now? This should be running as a separate service on the docker network.

misiugodfrey (author):

I found it to be necessary when working in Slurm environments, but that is because there is no docker network. I'll remove this as it isn't a change that is required for the scope of this PR.

- ./config/generated/gpu/etc_worker/node.properties:/opt/presto-server/etc/node.properties
- ./config/generated/gpu/etc_worker/config_native.properties:/opt/presto-server/etc/config.properties

# These workers are available to run on one node with a single GPU pinned to each.
misiugodfrey (author):

It would have been nice to de-duplicate this code, but AFAIK we can't meta-program services with different NVIDIA_VISIBLE_DEVICES.

if [[ "$VARIANT_TYPE" == "java" ]]; then
DOCKER_COMPOSE_FILE="java"
conditionally_add_build_target $JAVA_WORKER_IMAGE $JAVA_WORKER_SERVICE "worker|w"
WORKERS="$JAVA_WORKER_SERVICE"
misiugodfrey (author):

Since we no longer necessarily want every service in a docker-compose file to run, we need to specify which default worker service should run.
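A minimal sketch of how that selection could work, assuming a helper that maps the variant and NUM_WORKERS to explicit service names (the function name is hypothetical; `presto-native-worker-gpu-N` follows the diff's naming, while `java-worker` stands in for `$JAVA_WORKER_SERVICE`):

```shell
# Hypothetical sketch: pick which worker services to start instead of
# bringing up every service defined in the compose file.
select_workers() {
  local variant="$1" num_workers="${2:-1}"
  if [ "$variant" = "java" ]; then
    echo "java-worker"   # illustrative stand-in for $JAVA_WORKER_SERVICE
    return
  fi
  # One pinned-GPU service name per requested worker.
  local services="" i
  for ((i = 0; i < num_workers; i++)); do
    services="$services presto-native-worker-gpu-$i"
  done
  echo "${services# }"   # trim the leading space
}

# docker compose only starts the named services (plus their dependencies):
# docker compose -f "$DOCKER_COMPOSE_FILE_PATH" up -d $(select_workers gpu "$NUM_WORKERS")
```

Passing explicit service names to `docker compose up` is what lets the compose file declare more services than actually run.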

fi

docker compose -f $DOCKER_COMPOSE_FILE_PATH up -d
function duplicate_worker_configs() {
misiugodfrey (author):

This is based on the way we do multiple configs for the Slurm clusters right now. This will probably be changed in the future.
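A rough sketch of what such a duplication helper could look like, assuming a template config directory is copied once per worker and each copy gets a unique HTTP port (the signature, paths, and base port are illustrative, not the PR's actual implementation):

```shell
# Hypothetical sketch: stamp out one etc_worker_<i> directory per worker
# from a shared template, bumping http-server.http.port so the workers
# can co-exist on one host.
duplicate_worker_configs() {
  local template_dir="$1" out_root="$2" num_workers="${3:-1}"
  local base_port=8081 i
  mkdir -p "$out_root"
  for ((i = 0; i < num_workers; i++)); do
    local dir="$out_root/etc_worker_$i"
    rm -rf "$dir"
    cp -r "$template_dir" "$dir"
    # Give each worker its own port so they can share a host.
    sed -i "s/^http-server\.http\.port=.*/http-server.http.port=$((base_port + i))/" \
      "$dir/config.properties"
  done
}
```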

- ./config/generated/gpu/etc_worker/config_native.properties:/opt/presto-server/etc/config.properties

# These workers are available to run on one node with a single GPU pinned to each.
presto-native-worker-gpu-0:
Reviewer (Contributor):

We seem to be hardcoding and deploying a fixed number of workers?


# Adds a cluster tag for gpu variant
-WORKER_CONFIG="${CONFIG_DIR}/etc_coordinator/config_native.properties"
+WORKER_CONFIG="${CONFIG_DIR}/etc_worker/config_native.properties"
misiugodfrey (author):

I don't think this was correct before, as we were referring to the coordinator's config as WORKER_CONFIG.


docker compose -f ../docker/docker-compose.java.yml -f ../docker/docker-compose.native-cpu.yml -f ../docker/docker-compose.native-gpu.yml down
OVERRIDE=""
[ -f ../docker/docker-compose.workers.override.yml ] && OVERRIDE="-f ../docker/docker-compose.workers.override.yml"
misiugodfrey (author):

If we generated a multi-worker override file, then we need to use it when stopping containers too, or we may leave some of them dangling.

YAML
}

function generate_worker_compose() {
misiugodfrey (author):

I've changed the setup so that the docker-compose.native-gpu file is always generated/overwritten based on the number of workers. If NUM_WORKERS is not specified, or is "1", then it should generate exactly what was there before. If NUM_WORKERS > 1 then it will generate separate services for each worker.
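A simplified sketch of that generation step, assuming a bash loop that emits one service per worker into the compose file (service names follow the `presto-native-worker-gpu-N` pattern from the diff; the image name, mounts, and function signature are illustrative):

```shell
# Hypothetical sketch: regenerate the GPU compose file from NUM_WORKERS.
# With num_workers=1 this emits a single service, matching the previous
# single-worker layout; with N > 1 it emits one pinned-GPU service each.
generate_worker_compose() {
  local num_workers="${1:-1}" out="$2"
  local i
  {
    echo "services:"
    for ((i = 0; i < num_workers; i++)); do
      cat <<YAML
  presto-native-worker-gpu-$i:
    image: presto-native-worker:latest
    environment:
      - NVIDIA_VISIBLE_DEVICES=$i
    volumes:
      - ./config/generated/gpu/etc_worker_$i:/opt/presto-server/etc
YAML
    done
  } > "$out"
}
```

Always overwriting the file, as the comment describes, keeps the generated compose file consistent with whatever NUM_WORKERS was last requested.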
