NCCL only multi-gpu multi-node training without MPI #923
ci.yml
on: pull_request
build-cuda-windows: 2m 16s
build-cuda-fp32: 1m 32s
build-cuda-bf16: 1m 17s
build-cuda-fp16: 1m 22s
build-cuda-kernels: 1m 35s
Matrix: build-and-test-cpu