gpu-parallelism

Here are 9 public repositories matching this topic...

SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem

gpu ode dde differential-equations differentialequations sde dae neural-ode scientific-machine-learning neural-differential-equations gpu-parallelism sciml

Updated Aug 4, 2025
Julia

rbga / CUDA-Merge-and-Bitonic-Sort

Efficient implementations of Merge Sort and Bitonic Sort algorithms using CUDA for GPU parallel processing, resulting in accelerated sorting of large arrays. Includes both CPU and GPU versions, along with a performance comparison.

Updated Jul 27, 2023
Cuda

babak2 / OptimizedSum

Optimized Parallel Sum program demonstrating CPU vs GPU performance

visual-studio cuda gpu-acceleration gpu-computing cuda-programming gpu-parallelism

Updated Nov 9, 2023
Cuda

oekosheri / tensorflow_unet_scaling

Scaling U-Net in TensorFlow

tensorflow data-parallelism horovod unet-image-segmentation gpu-parallelism mirrored-strategy multi-worker-strategy

Updated Nov 13, 2023
Jupyter Notebook

LuizaValezim / movie-marathon-problem

algorithm cpu openmp np-hard insper gpu-parallelism supercomp

Updated Jun 8, 2023
Jupyter Notebook

daun-io / gpu-experiment-parallelization

Introduction to the concept of automatic experiment parallelization

python tensorflow gpu parallelization pytorch deeplearning gpu-parallelism

Updated Jul 27, 2020

kiankyars / Ultra-Scale-Playbook-Series

data gpu gpu-acceleration data-parallelism gpu-parallelism

Updated Jun 29, 2025
Jupyter Notebook

nezamtrm / GPU-parllelism-for-finetuning-huggingface-NLP-models-on-bittensor-dataset

nlp parallelism nlp-machine-learning fine-tuning huggingface gpu-parallelism huggingface-transformers gpt-j bittensor eleutherai

Updated May 7, 2023
Jupyter Notebook

cmontemuino / amd-mi300x-ml-benchmarks

Comprehensive machine learning benchmarking framework for AMD MI300X GPUs on Dell PowerEdge XE9680 hardware. Supports both inference (vLLM) and training workloads with containerized test suites, hardware monitoring, and analysis tools for performance, power efficiency, and scalability research across the complete ML pipeline.

Updated Aug 9, 2025
