Commit b83e614

Update config file to v2.0
1 parent 9251d7b commit b83e614

3 files changed: 21 additions and 111 deletions

boston_housing/README.md

Lines changed: 14 additions & 16 deletions
@@ -2,12 +2,11 @@

 In this tutorial we're going to use the [Boston Housing Dataset](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html). We'll take an existing implementation, create the needed files to pack it into MLCube and execute all tasks.

-
 ## Original project code

 At fist we have only 4 files, one for package dependencies and 3 scripts for each task: download data, preprocess data and train.

-```
+```bash
 ├── project
 ├── 01_download_dataset.py
 ├── 02_preprocess_dataset.py
@@ -30,12 +29,11 @@ The most important thing that we need to remember about these scripts are the in
 **--dataset_file_path** : Processed dataset file path. Note: this is the full path to the csv file.
 **--n_estimators** : Number of boosting stages to perform. In this case we're using a gradient boosting regressor.

-
 ## MLCube scructure

 We'll need a couple of files for MLCube, first we'll need to create a folder called **mlcube** in the same path from as project folder. We'll need to create the following structure (for this tutorial the files are already in place)

-```
+```bash
 ├── mlcube
 │   ├── mlcube.yaml
 │   ├── mlcube_cli.py
@@ -103,7 +101,6 @@ process.wait()

 In this tutorial we already have a shell script containing the steps to run the train task, the file is: **project/run_and_time.sh**, please take a look and study its content.

-
 ### MLCube Python CLI file

 The **mlcube/mlcube_cli.py** file simulates MLCube CLI. It is temporary stored here, and is part of MLCube library. The only command avaibale to execute is `run`, and the possible arguments are:
@@ -115,11 +112,10 @@ The **mlcube/mlcube_cli.py** file simulates MLCube CLI. It is temporary stored h

 Example:

-```
+```bash
 python mlcube_cli.py run --mlcube ./ --task train --platform docker
 ```

-
 ### MLCube Python entrypoint file

 At this point we know how to execute the tasks sripts from Python code, now we can create a file that contains the definition on how to run each task.
@@ -142,7 +138,7 @@ Keep in mind the tag that we just described.

 At this point our solution folder structure should look like this:

-```
+```bash
 ├── mlcube
 │   ├── mlcube.yaml
 │   ├── mlcube_cli.py
@@ -158,7 +154,6 @@ At this point our solution folder structure should look like this:
 └── run_and_time.sh
 ```

-
 ### Define MLCube files

 Inside the mlcube folder we'll need to define the following files.
@@ -181,15 +176,17 @@ This file is already provided, please take a look and study its content.

 With this file we have finished the packing of the project into MLCube! Now we can setup the project and run all the tasks.

-
 ### Project setup
-```Python
+
+```bash
 # Create Python environment
 virtualenv -p python3 ./env && source ./env/bin/activate
-# Install MLCube and MLCube docker runner from GitHub repository (normally, users will just run `pip install mlcube mlcube_docker`)
-git clone https://github.com/sergey-serebryakov/mlbox.git && cd mlbox && git checkout feature/configV2
-cd ./runners/mlcube_docker && export PYTHONPATH=$(pwd)
-cd ../../ && pip install -r mlcube/requirements.txt && pip install omegaconf && cd ../
+
+# Install MLCube and MLCube docker runner from GitHub repository
+# (normally, users will just run `pip install mlcube mlcube_docker`)
+git clone https://github.com/mlcommons/mlcube && cd mlcube/mlcube
+python setup.py bdist_wheel && pip install --force-reinstall ./dist/mlcube-* && cd ..
+cd ./runners/mlcube_docker && python setup.py bdist_wheel && pip install --force-reinstall --no-deps ./dist/mlcube_docker-* && cd ../../..

 # Fetch the boston housing example from GitHub
 git clone https://github.com/mlcommons/mlcube_examples && cd ./mlcube_examples
@@ -208,7 +205,8 @@ The [Boston Housing Dataset](https://www.cs.toronto.edu/~delve/data/boston/bosto
 | Total | (After all tasks) | All | ~92 KB |

 ### Tasks execution
-```
+
+```bash
 # Download Boston housing dataset. Default path = /workspace/data
 # To override it, use --data_dir=DATA_DIR
 python mlcube_cli.py run --task download_data

boston_housing/mlcube/mlcube.yaml

Lines changed: 7 additions & 9 deletions
@@ -6,7 +6,7 @@ authors:
 platform:
   accelerator_count: 0

-container:
+docker:
   # Image name.
   image: mlcommons/boston_housing:0.0.1
   # Docker build context relative to $MLCUBE_ROOT. Default is `build`.
@@ -17,18 +17,16 @@ container:
 tasks:
   download_data:
     # Download boston housing dataset
-    io:
+    parameters:
       # Directory where dataset will be saved.
-      - {name: data_dir, type: directory, io: output, default: $WORKSPACE/data}
+      outputs: {data_dir: data/}
   preprocess_data:
     # Preprocess dataset
-    io:
+    parameters:
       # Same directory location where dataset was downloaded
-      - {name: data_dir, type: directory, io: output, default: $WORKSPACE/data}
+      inputs: {data_dir: data/}
   train:
     # Train gradient boosting regressor model
-    io:
+    parameters:
       # Processed dataset file
-      - {name: dataset_file_path, type: file, io: input, default: $WORKSPACE/data/processed_dataset.csv}
-      # Yaml file with training parameters.
-      - {name: parameters_file, type: file, io: input, default: $WORKSPACE/parameters.yaml}
+      inputs: {dataset_file_path: data/processed_dataset.csv, parameters_file: parameters.yaml}
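
Note on the schema change in this file: the v1 `io` list becomes a v2 `parameters` mapping keyed by `inputs` and `outputs`. The following is a minimal sketch of that mapping, assuming it is mechanical (paths lose the `$WORKSPACE/` prefix and directories gain a trailing slash, as the added lines suggest). The helper name `convert_io_to_parameters` is hypothetical and is not part of MLCube.

```python
def convert_io_to_parameters(io_entries, workspace_prefix="$WORKSPACE/"):
    """Convert a v1-style `io` list into a v2-style `parameters` mapping.

    Hypothetical helper for illustration only; the conversion rules are
    inferred from the diff of mlcube.yaml, not taken from the MLCube library.
    """
    parameters = {}
    for entry in io_entries:
        path = entry["default"]
        # v2 paths appear to be relative to the workspace directory.
        if path.startswith(workspace_prefix):
            path = path[len(workspace_prefix):]
        # Directories are written with a trailing slash in the v2 config.
        if entry["type"] == "directory" and not path.endswith("/"):
            path += "/"
        section = "inputs" if entry["io"] == "input" else "outputs"
        parameters.setdefault(section, {})[entry["name"]] = path
    return parameters


# The two `io` entries removed from the `train` task in this commit.
old_train_io = [
    {"name": "dataset_file_path", "type": "file", "io": "input",
     "default": "$WORKSPACE/data/processed_dataset.csv"},
    {"name": "parameters_file", "type": "file", "io": "input",
     "default": "$WORKSPACE/parameters.yaml"},
]

print(convert_io_to_parameters(old_train_io))
```

For the `train` task this reproduces the added `inputs` line. It would not reproduce the `preprocess_data` change, where the commit also moves `data_dir` from an output to `inputs`; that is a semantic correction rather than a format conversion.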

boston_housing/mlcube/mlcube_cli.py

Lines changed: 0 additions & 86 deletions
This file was deleted.
