- Caffe
  - 2015 Large Scale Distributed Deep Learning on Hadoop Clusters
  - 2014 MM Caffe: Convolutional Architecture for Fast Feature Embedding
- CNTK
  - 2014 MSR-TR An Introduction to Computational Networks and the Computational Network Toolkit
  - 2014 OSDI Project Adam: Building an Efficient and Scalable Deep Learning Training System
- MXNet
  - 2016 arXiv Training Deep Nets with Sublinear Memory Cost
  - 2015 NIPSW MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems [GTC'16 Tutorial]
  - 2014 NIPSW Minerva: A Scalable and Highly Efficient Training Platform for Deep Learning
  - 2014 ICLR Purine: A Bi-Graph Based Deep Learning Framework
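The "Training Deep Nets with Sublinear Memory Cost" entry above is the gradient-checkpointing idea: keep only every k-th activation in the forward pass and recompute the rest segment-by-segment during backprop, trading compute for memory. A minimal sketch for a chain of scalar functions (all names here are illustrative, not MXNet's actual API):

```python
# Hedged sketch of gradient checkpointing (the "sublinear memory" idea):
# store only every k-th activation in the forward pass, then recompute
# each k-layer segment from its checkpoint during the backward pass.
# Activation memory drops from O(n) to roughly O(n/k + k).

def checkpointed_grad(x0, fs, dfs, k):
    """Gradient of fs[n-1](...fs[0](x0)) w.r.t. x0, storing ~n/k activations.

    fs:  list of scalar functions (the "layers")
    dfs: their derivatives, dfs[i](x) = fs[i]'(x)
    k:   checkpoint interval; len(fs) must be a multiple of k
    """
    n = len(fs)
    assert n % k == 0
    # Forward: keep a checkpoint at the start of every segment.
    ckpts = {0: x0}
    x = x0
    for i, f in enumerate(fs):
        x = f(x)
        if (i + 1) % k == 0 and (i + 1) < n:
            ckpts[i + 1] = x
    # Backward: walk segments in reverse, recomputing inner activations.
    grad = 1.0
    for s in range(n - k, -1, -k):
        xs = [ckpts[s]]
        for f in fs[s:s + k]:            # recompute this segment only
            xs.append(f(xs[-1]))
        for j in range(k - 1, -1, -1):   # chain rule within the segment
            grad *= dfs[s + j](xs[j])
    return grad

# Four squarings: y = x**16, so dy/dx = 16 * x**15.
square, dsquare = (lambda x: x * x), (lambda x: 2 * x)
g = checkpointed_grad(2.0, [square] * 4, [dsquare] * 4, k=2)
assert abs(g - 16 * 2 ** 15) < 1e-6
```

The paper's O(sqrt(n)) result corresponds to choosing k ≈ sqrt(n), which balances the stored-checkpoint and in-segment memory terms.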
- Neon
  - 2015 arXiv Fast Algorithms for Convolutional Neural Networks (Winograd) [Blog]
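The Winograd paper above reduces the multiplication count of small convolutions. A minimal NumPy sketch of the 1-D base case F(2,3) — two outputs of a 3-tap filter from 4 multiplies instead of 6 — using the standard transform matrices B^T, G, A^T from the paper (the function name is hypothetical):

```python
import numpy as np

# Winograd F(2,3): compute 2 outputs of a 3-tap FIR (valid correlation)
# over 4 input samples with 4 elementwise multiplies instead of 6.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 output samples."""
    return AT @ ((G @ g) * (BT @ d))            # 4 multiplies in the middle

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -1.0, 2.0])
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)
```

The 2-D F(2x2, 3x3) case used for conv layers nests this transform over tiles; the filter transform G @ g can be precomputed once per filter, which is where most of the practical speedup comes from.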
- Spark
  - 2016 arXiv SparkNet: Training Deep Networks in Spark
  - 2016 arXiv DeepSpark: A Spark-Based Distributed Deep Learning Framework for Commodity Clusters
- TensorFlow
  - 2016 OSDI TensorFlow: A System for Large-Scale Machine Learning
  - 2015 TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems [slides 1] [slides 2]
  - 2012 NIPS Large Scale Distributed Deep Networks (DistBelief)
  - 2014 NIPSW Techniques and Systems for Training Large Neural Networks Quickly
- Theano
  - 2016 arXiv Theano: A Python Framework for Fast Computation of Mathematical Expressions
- Torch
  - 2016 NIPSW Torchnet: An Open-Source Platform for (Deep) Learning Research
  - 2011 NIPSW Torch7: A Matlab-like Environment for Machine Learning
- 2016 ICML Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
- 2015 ICMLW Massively Parallel Methods for Deep Reinforcement Learning
- 2015 arXiv Deep Image: Scaling up Image Recognition
- 2013 ICML Deep learning with COTS HPC systems
- 2015 Intel Single Node Caffe Scoring and Training on Intel® Xeon E5-Series Processors
- 2015 arXiv Caffe con Troll: Shallow Ideas to Speed Up Deep Learning
- 2015 arXiv Convolutional Neural Networks at Constrained Time Cost
- 2014 arXiv One weird trick for parallelizing convolutional neural networks
- 2014 NIPS On the Computational Efficiency of Training Neural Networks
- 2011 NIPSW Improving the speed of neural networks on CPUs