unidist is a framework that provides a unified API for distributed execution by supporting various performant execution backends. At the moment the following backends are supported under the hood: Ray, MPI, Dask, and Python Multiprocessing.
unidist is designed to work in a task-based parallel model.
Also, the framework provides a Python Sequential backend (pyseq) that can be used for debugging.
unidist can be installed with pip on Linux, Windows and macOS:
pip install unidist # Install unidist with dependencies for Python Multiprocessing and Python Sequential backends
unidist can also be used with the Dask, MPI or Ray execution backend. If you don't have Dask, MPI or Ray installed, install unidist with one of the following targets:
pip install unidist[all] # Install unidist with dependencies for all the backends
pip install unidist[dask] # Install unidist with dependencies for Dask backend
pip install unidist[mpi] # Install unidist with dependencies for MPI backend
pip install unidist[ray] # Install unidist with dependencies for Ray backend
unidist automatically detects which execution backends are installed and uses one of them for scheduling computation.
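To see which optional backends are present in the current environment, a quick check like the sketch below can help. It is not unidist's own detection logic; it only tests whether the Python packages commonly associated with each backend are importable. The module names (`ray`, `distributed` for Dask, `mpi4py` for MPI) and the helper name `installed_backends` are assumptions made for illustration.

```python
from importlib.util import find_spec

# Assumed mapping from backend name to the top-level module it relies on.
BACKEND_MODULES = {
    "ray": "ray",
    "dask": "distributed",
    "mpi": "mpi4py",
}


def installed_backends():
    """Return {backend name: True/False} based on importability only."""
    return {
        name: find_spec(module) is not None
        for name, module in BACKEND_MODULES.items()
    }


if __name__ == "__main__":
    for name, available in installed_backends().items():
        print(f"{name}: {'installed' if available else 'not installed'}")
```

A backend reported as not installed here can be added with the matching pip target listed above.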
To install unidist with dependencies for the Dask and MPI execution backends into a conda environment, use the following command:
conda install unidist-dask unidist-mpi -c conda-forge
The full set of backends can be installed into a conda environment by specifying:
conda install unidist-all -c conda-forge
or explicitly:
conda install unidist-dask unidist-mpi unidist-ray -c conda-forge
For more information refer to the Installation section.
If you want to choose a specific execution backend to run on, you can set the environment variable UNIDIST_BACKEND and unidist will do computation with that backend:
export UNIDIST_BACKEND=ray # unidist will use Ray
export UNIDIST_BACKEND=mpi # unidist will use MPI
export UNIDIST_BACKEND=dask # unidist will use Dask
This can also be done within a notebook/interpreter before you initialize unidist:
from unidist.config import Backend
Backend.put("ray") # unidist will use Ray
Backend.put("mpi") # unidist will use MPI
Backend.put("dask") # unidist will use Dask
If you have installed all the execution backends and haven't specified one, Ray is used by default.
Since some of the execution backends, MPI in particular, have specifics regarding how Python programs are run, please refer to the Using Unidist section for more information on setting the execution backend to run on.
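The environment variable can also be set from Python itself, as long as that happens before unidist is initialized. The sketch below wraps this in a small helper; `select_backend` is a hypothetical name, and the accepted values are limited to the ones that appear in this README (`pyseq` is assumed to be the value for the Python Sequential backend).

```python
import os

# Backend names as they appear in this README; "pyseq" is an assumption
# based on the short name given for the Python Sequential backend.
KNOWN_BACKENDS = ("ray", "mpi", "dask", "pyseq")


def select_backend(name):
    """Hypothetical helper: validate the name and export UNIDIST_BACKEND."""
    if name not in KNOWN_BACKENDS:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {KNOWN_BACKENDS}"
        )
    os.environ["UNIDIST_BACKEND"] = name
    return name


# Pick the sequential backend, e.g. to step through tasks in a debugger.
select_backend("pyseq")

# Only after the variable is set:
# import unidist
# unidist.init()
```

Validating the name up front turns a typo into an immediate error in your own code instead of a failure later during initialization.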
# script.py
import unidist
unidist.init() # Ray backend is used by default
@unidist.remote
def foo(x):
    return x * x
# This will run `foo` on a pool of workers in parallel;
# `refs` will contain object references to actual data
refs = [foo.remote(i) for i in range(5)]
# To get the data call `unidist.get(...)`
print(unidist.get(refs))
Run script.py with:
$ python script.py
[0, 1, 4, 9, 16] # output
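For readers new to the task-based model, the submit-then-collect pattern in script.py can be compared with the standard library's `concurrent.futures`. The sketch below is an analogy only, not how unidist is implemented: `executor.submit` plays the role of `foo.remote`, the returned futures stand in for object references, and `future.result()` corresponds to `unidist.get`.

```python
from concurrent.futures import ThreadPoolExecutor


def foo(x):
    return x * x


# Submitting returns immediately with a future, much like `foo.remote(i)`
# returns an object reference rather than the value itself.
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(foo, i) for i in range(5)]
    # Blocking on each future retrieves the actual data, in submission order.
    results = [future.result() for future in futures]

print(results)  # [0, 1, 4, 9, 16]
```

The difference is that unidist hands the tasks to whichever execution backend is selected, so the same script can run on Ray, MPI or Dask without changes.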
For more examples refer to the Getting Started section in our documentation.
unidist is meant to be used not only directly by users to get better performance in their workloads, but also as a core component of other libraries, powering them with performant execution backends.
Refer to the Libraries powered by unidist section of the Using Unidist page for more information on which libraries already use unidist.
Visit the complete documentation on readthedocs: https://unidist.readthedocs.io.