nyu-systems/mlsys-seminar
NYU Systems Reading Group: MLSys Seminar

Schedule

Each reading group presenter should:

  • Send a reminder to the reading group Slack channel with the paper details, and update the URL in the repository at least two days before the group meeting.

Fall 2024

| Date | Discussion Lead | Paper Title and Link | Conference |
| --- | --- | --- | --- |
| 2024/09/24 | Haitian | nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training | OSDI'24 |
| 2024/10/01 | Hexu | On Optimizing the Communication of Model Parallelism | MLSys'23 |
| 2024/10/15 | David | Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs | ASPLOS'23 |
| 2024/10/22 | Daniel | dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving | OSDI'24 |
| 2024/10/29 | Tao | Accelerating Collective Communication in Data Parallel Training across Deep Learning Frameworks | NSDI'22 |
| 2024/11/05 | Curtis | ServerlessLLM: Low-Latency Serverless Inference for Large Language Models | OSDI'24 |
| 2024/11/12 | Arasu | Decentralized Training of Foundation Models in Heterogeneous Environments; SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient; CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks | NeurIPS'22; NeurIPS'23; ICML'23 |
| 2024/11/19 | Haitian | PyTorch Tensor Implementation | |
| 2024/11/26 | Xiwen | Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates | SOSP'23 |
| 2024/12/03 | Zhanghan | Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections; Enabling Parallelism Hot Switching for Efficient Training of Large Language Models | SOSP'24 |
| 2024/12/17 | Zihao | FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | MLSys'25 |

Spring 2025

| Date | Discussion Lead | Paper Title and Link | Conference |
| --- | --- | --- | --- |
| 2025/02/06 | Haitian | DeepSeek-V3 Technical Report | arXiv |
| 2025/02/13 | Renjie | Retrieval-Augmented Generation: algorithms and systems | |
| 2025/02/20 | Daniel | SpotServe: Serving Generative Large Language Models on Preemptible Instances | ASPLOS'24 |
| 2025/02/27 | Jinkun | Pipeline Parallelism with Controllable Memory | NeurIPS'24 |
| 2025/03/06 | Yilun | Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention | ACL'25 |
| 2025/03/13 | Ankit | Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations | EuroPar'10 |
| 2025/03/20 | Jinkun | Stateful Large Language Model Serving with Pensieve | EuroSys'25 |
| 2025/03/27 | Yunwei | Atom of Thoughts for Markov LLM Test-Time Scaling | arXiv |
| 2025/04/03 | Curtis | Ray architecture and source code | |
| 2025/04/10 | David | CuTe / CUTLASS: CUDA Templates for Linear Algebra Subroutines | |
| 2025/04/24 | Hexu | On Scaling Up 3D Gaussian Splatting Training | ICLR'25 |
| 2025/05/01 | Xiwen | Sequoia: Scalable and Robust Speculative Decoding | NeurIPS'24 |
| 2025/05/08 | Haseeb | Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving | SOSP'24 |
