Type-safe, high-performance, distributed neural networks in Scala (not Python, finally...).

Low-level (linear algebra) operations are powered by the low-level TensorFlow API (C/C++ bindings via JNI). Scala is used to build computation graphs and compile them into native tensor graphs. Compiled graphs are fully executed in native code (on CPU, GPU or TPU), and only the result is returned back via a `DirectBuffer` which points into native memory. The `DirectBuffer` is wrapped with a read-only `Tensor` object which allows slicing and reading the data in a convenient way (just like `Breeze` or `NumPy` do).
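To make the memory model concrete, here is a minimal JDK-only sketch (plain `java.nio`, not this library's API) of what "a read-only view over a `DirectBuffer`" means: native code fills off-heap memory, and the JVM side only reads and slices it.

```scala
import java.nio.{ByteBuffer, ByteOrder, FloatBuffer}

// Off-heap (native) memory, as it would be filled by the TensorFlow runtime.
val direct: ByteBuffer = ByteBuffer
  .allocateDirect(4 * 6) // 6 floats, 4 bytes each
  .order(ByteOrder.nativeOrder())

val floats: FloatBuffer = direct.asFloatBuffer()
(0 until 6).foreach(i => floats.put(i, i.toFloat))

// A read-only view is what a Tensor conceptually wraps: no copy, no writes.
val view: FloatBuffer = floats.asReadOnlyBuffer()

// "Slicing" row 1 of a 2x3 matrix is just index arithmetic over the same memory.
val row1: Array[Float] = Array.tabulate(3)(col => view.get(1 * 3 + col))
```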
The optimizer is built on top of Spark and can optimize the model in a distributed/parallel way. The chosen algorithm is data parallelism with synchronous model averaging: the dataset is split between the workers, each epoch is run independently on each data split, and at the end of each epoch the parameters are averaged and broadcast back to each worker.
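As a rough illustration of the averaging scheme (plain Scala on a toy one-parameter model, not the library's or Spark's API; `fitEpoch`, `average` and `train` are hypothetical names): every epoch each worker trains on its own split starting from the shared parameters, then the results are averaged and become the next shared parameters.

```scala
// Toy model: a single parameter that should converge to the global mean of the data.
type Params = Array[Float]

// One local epoch on a worker: nudge the parameter towards the mean of its split.
def fitEpoch(params: Params, split: Seq[Float]): Params = {
  val grad = params(0) - split.sum / split.size
  Array(params(0) - 0.5f * grad)
}

// Synchronous model averaging: average the per-worker parameters element-wise.
def average(all: Seq[Params]): Params =
  all.transpose.map(values => values.sum / all.size).toArray

def train(initial: Params, splits: Seq[Seq[Float]], epochs: Int): Params =
  (1 to epochs).foldLeft(initial) { (shared, _) =>
    val local = splits.map(split => fitEpoch(shared, split)) // done in parallel by the workers
    average(local)                                           // averaged and broadcast back
  }

// Example: 3 "workers", the parameter converges towards the overall mean.
val result = train(Array(0f), Seq(Seq(1f, 2f), Seq(3f, 4f), Seq(5f, 6f)), epochs = 20)
```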
The input data is expected to be a `Dataset[Array[TensorType]]` which carries the shape of the tensors in its metadata. Usually `TensorType` is chosen to be `Float` since it performs best on GPU, although `Double` can also be used.
Example of a simple MNIST classifier built with a fully connected neural network:
```scala
val (trainingDs, testDs) = MNIST.load(sc, trainingSize = 30000)
val model = Dense(50, Sigmoid) >> Dense(10, Softmax)
val trained = trainingDs.train(model)
  .loss(CategoricalCrossentropy)
  .using(Adam(0.01f))
  .batch(1000)
  .each(1.epochs, RecordLoss(tensorboard = true))
  .each(10.epochs, RecordAccuracy(testDs, tensorboard = true))
  .stopAfter(200.epochs)
  .run()
accuracy(trained, testDs) should be >= 0.95f
```
Here, the `loss` and `accuracy` will be logged and added to TensorBoard as live trends. To run TensorBoard, execute:

```
tensorboard --logdir board
```
The same task solved with a CNN (Convolutional Neural Network):
```scala
val (trainingDs, testDs) = MNIST()
val model =
  Conv2D(32, activation = ReLU()) >> Pool2D() >>
  Conv2D(64, activation = ReLU()) >> Pool2D() >>
  Flatten >> Dense(10, Softmax)
val trained = trainingDs
  .train(model)
  .loss(CategoricalCrossentropy)
  .using(Adam(0.001f))
  .batch(100)
  .initWith(shape => Tensor.rand(shape, range = Some(-0.1f, 0.1f)))
  .each(1.epochs, RecordLoss())
  .each(1.epochs, RecordAccuracy(testDs))
  .stopAfter(3.epochs)
  .run()
accuracy(trained, testDs) should be >= 0.98f
```
- Tensor
- DSL for computation DAG
- TF Session
- Core ops
- Math ops
- Logical ops
- String ops
- TF Functions, Placeholders, Session caching
- TensorBoard basic support
- Spark
- Hyperparameter tuning
- Model Import/Export
- SGD
- AdaGrad
- AdaDelta
- RMSProp
- Adam
- Nadam
- Adamax
- AMSGrad
- Variance/STD
- Covariance/Correlation Matrix
- Lots of other useful algorithms to analyze the data set
- Linear Regression
- Simple math models for benchmarks
- Binary Logistic Regression
- ANN (Multilayer Perceptron NN)
- kernel regularization
- Layers Dropout, Batch Normalization
- Convolutional NN
- Recurrent NN
- others
- Sigmoid
- Tanh
- ReLU
- Softmax
- Exp
- SELU
- ELU
- Softplus
- RMSE (Root Mean Squared Error)
- Binary Crossentropy
- Categorical Crossentropy
- MNIST
- Feature scalers
- Feature embedding
- Hashed features
- Crossed features
- r2 score
- accuracy estimator
- confusion matrix, precision, recall, f1 score
- runtime estimating and new stop condition based on that
- Create a computation-intensive operation, like `matmul` applied multiple times to large tensors, and compare with Scala `breeze`, Python `tensorflow` and Python `numpy`
- Compare with existing implementations using local CPU
- Compare with existing implementations using one GPU
- Compare with existing implementations using distributed mode on GCP DataProc
- While training, analyze the weight histograms to make sure the deep NN does not saturate
- Grid/Random hyperparameter search
- Different weight initializers (Xavier)
- Decay learning rate over time (step, exponential, 1/t decay)
- Try using it in an interactive notebook
- Add a graph library so we could plot some charts and publish them in `tensorboard` or a notebook (maybe fork and upgrade `vegas` to Scala 2.12, or try `evil-plot`)
- Refactor the type class hierarchy so `TensorType` is on top and `Numeric` and the rest extend it
- Refactor tensor functions so the materialized type of the args is only inferred during compilation; we would also need to try simplifying tensor functions and add methods to compose functions (`compose`, `andThen`, etc.)
- Add a DSL to build tensor requirements, like `tensor require rank(4)` or `tensor require shape squareMatrix` (see the sketch below)
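For the requirements DSL, one possible shape (a sketch under assumed names; `Shape`, `Requirement` and `RequireOps` are illustrative only, not the final design) could be a small `Requirement` type plus an implicit `require` extension:

```scala
object ShapeRequirements {

  final case class Shape(dims: List[Int]) { def rank: Int = dims.size }

  trait Requirement {
    def check(shape: Shape): Boolean
    def describe: String
  }

  // Requirement on the number of dimensions.
  def rank(r: Int): Requirement = new Requirement {
    def check(shape: Shape): Boolean = shape.rank == r
    def describe: String = s"rank $r"
  }

  // Requirement that the shape is a square matrix (rank 2, equal dimensions).
  val squareMatrix: Requirement = new Requirement {
    def check(shape: Shape): Boolean = shape.rank == 2 && shape.dims.distinct.size == 1
    def describe: String = "square matrix"
  }

  implicit class RequireOps(private val shape: Shape) extends AnyVal {
    def require(req: Requirement): Shape = {
      assert(req.check(shape), s"expected ${req.describe} but got $shape")
      shape
    }
  }

  // usage: Shape(List(3, 3)) require squareMatrix
  //        Shape(List(100, 28, 28, 1)) require rank(4)
}
```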
If you want to become a contributor, you are welcome! You can pick anything from the Road Map or propose your own idea.
Please contact: