A GPipe implementation in PyTorch
An I/O benchmark for deep learning applications
Very-Low Overhead Checkpointing System
Extending DOLFINx with checkpointing functionality
Keras wrapper that autosaves what ModelCheckpoint cannot.
A Python package for performing memory-intensive computations in parallel using chunks and checkpointing.
A Python package for checkpointing, saving, and loading objects.
A Flink project that consumes streams from an Azure Event Hub and produces to a different Event Hub, with config files for deploying it on Kubernetes.
Code and tutorial on integrating wandb sweeps with Slurm pre-emption
A lightweight checkpointing program written in C.
DMTCP scripts to get Python scripts working with SLURM.
A shared library to help test your code with failure-injection
A standalone Flink producer used for testing the contents of the flink-consume-produce-ek repo.
Robust distributed checkpointing and job management system for multi-GPU SLURM workloads
Hangman Game Word Predictor (Character-level attention)
A face recognition manager for digital albums that isolates images of a specified person.