This repo demonstrates an ML training workflow on Amazon SageMaker using PyTorch Lightning. It covers the following features and tools:
- SageMaker Data Parallel
- Customized PyTorch Lightning callbacks
- Streaming large data directly from S3
- SageMaker Debugger and Profiler
- PyTorch Profiler
- TensorBoard support in SageMaker Studio

This tutorial is designed to be completely self-contained, including obtaining and formatting the data.