Commit ca701f6

Document extra ML requirements for MLflow (mlflow#3495)
1 parent 9d1437e commit ca701f6

12 files changed, +93 -54 lines changed

.github/workflows/master.yml

+2 -1
@@ -90,7 +90,7 @@ jobs:
       - uses: actions/checkout@v2
       - name: Install dependencies
         env:
-          INSTALL_LARGE_PYTHON_DEPS: true
+          INSTALL_SMALL_PYTHON_DEPS: true
         run: |
           source ./dev/install-common-deps.sh
       - name: Run tests
@@ -106,6 +106,7 @@ jobs:
       - name: Install dependencies
         env:
           INSTALL_LARGE_PYTHON_DEPS: true
+          INSTALL_SMALL_PYTHON_DEPS: true
         run: |
           source ./dev/install-common-deps.sh
       - name: Run tests

EXTRA_DEPENDENCIES.rst

+19
@@ -0,0 +1,19 @@
+=========================
+Extra MLflow Dependencies
+=========================
+
+When you `install the MLflow Python package <https://mlflow.org/docs/latest/quickstart.html#installing-mlflow>`_,
+a set of core dependencies needed to use most MLflow functionality (tracking, projects, models APIs)
+is also installed.
+
+However, in order to use certain framework-specific MLflow APIs or configuration options,
+you need to install additional, "extra" dependencies. For example, the model persistence APIs under
+the ``mlflow.sklearn`` module require scikit-learn to be installed. Some of the most common MLflow
+extra dependencies can be installed via ``pip install mlflow[extras]``.
+
+The full set of extra dependencies are documented, along with the modules that depend on them,
+in the following files:
+
+* extra-ml-requirements.txt: ML libraries needed to use model persistence and inference APIs
+* small-requirements.txt, large-requirements.txt: Libraries required to use non-default
+  artifact-logging and tracking server configurations
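
As a rough illustration of the point made in EXTRA_DEPENDENCIES.rst, the sketch below checks whether the library behind a framework-specific module can be imported before that module is used. It is not part of this commit. The helper `missing_extras` and the mapping `ASSUMED_EXTRAS` are hypothetical names, and the mapping covers only two of the pairs documented in `dev/extra-ml-requirements.txt`.

```python
import importlib.util

# Hypothetical mapping from an MLflow module to the import name of the library
# it needs. This is an assumption for illustration, based on two entries in
# dev/extra-ml-requirements.txt (scikit-learn is imported as "sklearn").
ASSUMED_EXTRAS = {
    "mlflow.sklearn": "sklearn",
    "mlflow.xgboost": "xgboost",
}


def missing_extras(extras=ASSUMED_EXTRAS):
    """Return the MLflow modules whose extra dependency cannot be imported."""
    return sorted(
        module
        for module, import_name in extras.items()
        if importlib.util.find_spec(import_name) is None
    )


if __name__ == "__main__":
    for module in missing_extras():
        print(f"{module}: extra dependency is not installed")
```

A check like this only reports whether a package is present; it does not verify that the installed version matches the pins in the requirements files.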

conftest.py

+3
@@ -85,6 +85,9 @@ def pytest_ignore_collect(path, config):
             "tests/spacy",
             "tests/spark_autologging",
             "tests/fastai",
+            "tests/models",
+            "tests/shap",
+            "tests/utils/test_model_utils.py",
         ]

         relpath = os.path.relpath(str(path))
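
The hunk above only extends a list of paths; the rest of `pytest_ignore_collect` is not shown in this diff. The sketch below is one plausible way such a list could be matched against a collected path, assuming matching on whole path components. The names `FLAVOR_PATHS` and `is_flavor_test` are hypothetical and do not come from the MLflow source.

```python
import os

# The three entries added by this hunk; the rest of the real list is omitted.
FLAVOR_PATHS = [
    "tests/models",
    "tests/shap",
    "tests/utils/test_model_utils.py",
]


def is_flavor_test(path, flavor_paths=FLAVOR_PATHS):
    """Return True if `path` is one of the listed entries or lies under one.

    Assumption: matching is done on whole path components, so
    "tests/models_extra" does not match "tests/models".
    """
    # Normalize to a forward-slash relative path so entries compare cleanly.
    relpath = os.path.relpath(str(path)).replace(os.sep, "/")
    return any(
        relpath == entry or relpath.startswith(entry + "/")
        for entry in flavor_paths
    )
```

Under this assumption, `is_flavor_test("tests/models/test_cli.py")` is true, while a path such as `tests/tracking/test_client.py` is not matched.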

dev/extra-ml-requirements.txt

+38
@@ -0,0 +1,38 @@
+## This file describes extra ML library dependencies that you, as an end user,
+## must install in order to use various MLflow Python modules.
+##
+# Required by mlflow.azureml
+azureml-sdk==1.2.0; python_version >= "3.0"
+# Required by mlflow.pyfunc
+cloudpickle==0.8.0
+# Required by mlflow.keras
+keras==2.3.1
+# Required by mlflow.sklearn
+scikit-learn==0.23.2
+# Required by mlflow.gluon
+mxnet==1.5.0
+# Required by mlflow.fastai
+fastai==1.0.60
+# Required by mlflow.spacy
+spacy==2.2.3
+# Required by mlflow.tensorflow
+tensorflow==1.15.2
+# Required by mlflow.pytorch
+torch==1.4.0
+torchvision==0.5.0
+# Required by mlflow.xgboost
+xgboost>=0.82
+# Required by mlflow.lightgbm
+lightgbm==2.3.0
+# Required by mlflow.h2o
+h2o==3.22.1.4
+# Required by mlflow.onnx
+onnx==1.4.1;
+onnxmltools==1.4.0;
+onnxruntime==0.3.0;
+# Required by mlflow.mleap, and in order to save SparkML models in
+# mleap format via ``mlflow.spark.log_model``, ``mlflow.spark.save_model``
+mleap==0.16.0
+# Required by mlflow.spark
+pyspark==2.4.0
+
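
Because each requirement in this file sits under a `# Required by <module>` comment, the module-to-library mapping can be recovered mechanically. The following is a minimal sketch of that idea, not tooling from this commit. The function name `parse_required_by` is hypothetical, and the parser assumes the comment convention shown above holds for every entry.

```python
def parse_required_by(text):
    """Map each "# Required by <module>" comment to the requirement lines after it."""
    prefix = "# Required by "
    mapping = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(prefix):
            # Keep only the module name, dropping any trailing prose or comma.
            current = line[len(prefix):].split()[0].rstrip(",")
            mapping[current] = []
        elif line and not line.startswith("#") and current is not None:
            # A requirement line belongs to the most recent "Required by" comment.
            mapping[current].append(line)
    return mapping


SAMPLE = """\
## This file describes extra ML library dependencies
# Required by mlflow.sklearn
scikit-learn==0.23.2
# Required by mlflow.pytorch
torch==1.4.0
torchvision==0.5.0
"""

if __name__ == "__main__":
    for module, requirements in parse_required_by(SAMPLE).items():
        print(module, requirements)
```

Requirement lines are kept verbatim, so environment markers and trailing semicolons pass through unchanged.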

dev/install-common-deps.sh

+1 -1
@@ -22,7 +22,6 @@ CONDA_DIR=/usr/share/miniconda
 export PATH="$CONDA_DIR/bin:$PATH"
 hash -r
 conda config --set always_yes yes --set changeps1 no
-
 # Useful for debugging any issues with conda
 conda info -a
 conda create -q -n test-environment python=3.6
@@ -40,6 +39,7 @@ if [[ "$INSTALL_SMALL_PYTHON_DEPS" == "true" ]]; then
 fi
 if [[ "$INSTALL_LARGE_PYTHON_DEPS" == "true" ]]; then
   retry-with-backoff pip install --quiet -r ./dev/large-requirements.txt
+  retry-with-backoff pip install --quiet -r ./dev/extra-ml-requirements.txt
   # Hack: make sure all spark-* scripts are executable.
   # Conda installs 2 version spark-* scripts and makes the ones spark
   # uses not executable. This is a temporary fix to unblock the tests.

dev/large-requirements.txt

+4 -38
@@ -1,39 +1,5 @@
-# Large test reqs
-azure-storage-blob==12.3.0
-google-cloud-storage==1.14.0
-botocore==1.12.84
-boto3==1.9.84
-moto==1.3.7
-h2o==3.22.1.4
-onnx==1.4.1;
-onnxmltools==1.4.0;
-onnxruntime==0.3.0;
-mleap==0.16.0
-mxnet==1.5.0
-pyarrow==0.17.0
-pyspark==2.4.0
-pytest==3.2.1
-pytest-cov==2.6.0
-scikit-learn==0.23.2
-scipy==1.2.1
-spacy==2.2.3
-tensorflow==1.15.2
-tf2onnx==1.5.4;
-torch==1.4.0
-torchvision==0.5.0
-xgboost>=0.82
-lightgbm==2.3.0
-# Install typing to fix torch on Python 2.7 (see https://github.com/pytorch/pytorch/issues/16775)
-typing==3.6.6
-pysftp==0.2.9
-keras==2.3.1
-attrdict==2.0.0
-azureml-sdk==1.2.0; python_version >= "3.0"
-cloudpickle==0.8.0
-pytest-localserver==0.5.0
-sqlalchemy==1.3.0
-kubernetes==9.0.0
-fastai==1.0.60
+## Large test reqs
+## Test-only dependencies
 matplotlib==3.2.1
-# test plugin
-tests/resources/mlflow-test-plugin/
+tf2onnx==1.5.4
+pyarrow==0.17.0

dev/run-large-python-tests.sh

+1 -1
@@ -7,6 +7,6 @@ trap 'err=1' ERR
 export MLFLOW_HOME=$(pwd)

 # NB: Also add --ignore'd tests to run-small-python-tests.sh
-pytest tests --large --ignore-flavors --ignore=tests/examples --ignore=tests/models
+pytest tests --large --ignore-flavors --ignore=tests/examples

 test $err = 0

dev/run-python-flavor-tests.sh

+2
@@ -21,5 +21,7 @@ pytest --verbose tests/gluon --large
 pytest --verbose tests/gluon_autolog --large
 pytest --verbose tests/spacy --large
 pytest --verbose tests/fastai --large
+pytest --verbose tests/utils/test_model_utils.py --large
+

 test $err = 0

dev/small-requirements.txt

+15 -10
@@ -1,21 +1,26 @@
-# Small test reqs
+## Small test reqs
+# Required
 azure-storage-blob==12.3.0
+# Required to log artifacts and models to GCS artifact locations
 google-cloud-storage==1.14.0
+# Required to log artifacts and models to AWS S3 artifact locations
 botocore==1.12.84 # pinned for moto
 boto3==1.9.84 # pinned for moto
-moto==1.3.7
-mxnet==1.5.0
-scikit-learn==0.23.2
-matplotlib==3.2.1
-scipy==1.2.1
+# Required to log artifacts and models to HDFS artifact locations
 pyarrow==0.17.0
+# Required to log artifacts to SFTP artifact locations
 pysftp==0.2.9
+# Required to run the MLflow server against SQL-backed storage
+sqlalchemy==1.3.0
+# Required by the mlflow.projects module, when running projects against
+# a remote Kubernetes cluster
+kubernetes==9.0.0
+## Test-only dependencies
 attrdict==2.0.0
-cloudpickle==0.8.0
 pytest==3.2.1
 pytest-cov==2.6.0
 pytest-localserver==0.5.0
-sqlalchemy==1.3.0
-kubernetes==9.0.0
-# test plugin
+scipy==1.2.1
+moto==1.3.7
+# Test plugin, used to verify correctness of MLflow plugin APIs
 tests/resources/mlflow-test-plugin/

docs/source/quickstart.rst

+4 -3
@@ -25,9 +25,10 @@ You install MLflow by running:
 installing Python 3 through the `Homebrew <https://brew.sh/>`_ package manager using
 ``brew install python``. (In this case, installing MLflow is now ``pip3 install mlflow``).

-In order to use some of MLflow features (ML librairies, storage options, ...), you may need to install extra libraries.
-For example, the ``mlflow.tensorflow`` module requires TensorFlow to be installed.
-Please refer to this `requirements.txt <https://github.com/mlflow/mlflow/blob/master/dev/large-requirements.txt>`_ for a complete list of dependencies.
+To use certain MLflow modules and functionality (ML model persistence/inference, artifact storage options, etc),
+you may need to install extra libraries. For example, the ``mlflow.tensorflow`` module requires TensorFlow to be installed.
+See https://github.com/mlflow/mlflow/blob/master/EXTRA_DEPENDENCIES.rst for more details
+
 At this point we recommend you follow the :doc:`tutorial<tutorials-and-examples/tutorial>` for a walk-through on how you
 can leverage MLflow in your daily workflow.


test-requirements.txt

+1
@@ -1,3 +1,4 @@
 -r dev/small-requirements.txt
 -r dev/large-requirements.txt
+-r dev/extra-ml-requirements.txt
 -r dev/lint-requirements.txt

tests/models/test_cli.py

+3
@@ -64,6 +64,7 @@ def sk_model(iris_data):
     return knn_model


+@pytest.mark.large
 def test_predict_with_old_mlflow_in_conda_and_with_orient_records(iris_data):
     if no_conda:
         pytest.skip("This test needs conda.")
@@ -114,6 +115,7 @@ def test_predict_with_old_mlflow_in_conda_and_with_orient_records(iris_data):
     assert all(expected == actual)


+@pytest.mark.large
 def test_mlflow_is_not_installed_unless_specified():
     if no_conda:
         pytest.skip("This test requires conda.")
@@ -138,6 +140,7 @@ def test_mlflow_is_not_installed_unless_specified():
     assert "ImportError: No module named mlflow.pyfunc.scoring_server" in stderr


+@pytest.mark.large
 def test_model_with_no_deployable_flavors_fails_pollitely():
     from mlflow.models import Model

