Commit 19bf983

C2 W4 Lab 4: added markdown and code comments
1 parent 14ad137 commit 19bf983

1 file changed: +106 −37 lines changed

Course 2 - Custom Training loops, Gradients and Distributed Training/Week 4 - Distribution Strategy/C2_W4_Lab_4_one-device-strategy.ipynb

+106 −37
@@ -13,7 +13,7 @@
  "source": [
  "# One Device Strategy \n",
  "\n",
- "In this ungraded lab you'll learn to set up a One Device Strategy"
+ "In this ungraded lab, you'll learn how to set up a [One Device Strategy](https://www.tensorflow.org/api_docs/python/tf/distribute/OneDeviceStrategy). This is typically used to deliberately test your code on a single device. This can be used before switching to a different strategy that distributes across multiple devices. Please click on the **Open in Colab** badge above so you can download the datasets and use a GPU-enabled lab environment."
  ]
  },
  {
@@ -42,15 +42,44 @@
  "import tensorflow as tf\n",
  "import tensorflow_hub as hub\n",
  "import tensorflow_datasets as tfds\n",
- "tfds.disable_progress_bar()\n",
- "devices = tf.config.list_physical_devices('CPU')\n",
- "cpu_name = devices[0].name\n",
  "\n",
- "devices = tf.config.list_physical_devices('CPU')\n",
+ "tfds.disable_progress_bar()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the Distribution Strategy\n",
+ "\n",
+ "You can list available devices in your machine and specify a device type. This allows you to verify the device name to pass in `tf.distribute.OneDeviceStrategy()`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# choose a device type such as CPU or GPU\n",
+ "devices = tf.config.list_physical_devices('GPU')\n",
  "print(devices[0])\n",
- "# You'll see that the name will look something like \"/physical_device:CPU:0\"\n",
- "# Just take the CPU:0 part and use that as the name\n",
- "cpu_name = \"CPU:0\""
+ "\n",
+ "# You'll see that the name will look something like \"/physical_device:GPU:0\"\n",
+ "# Just take the GPU:0 part and use that as the name\n",
+ "gpu_name = \"GPU:0\"\n",
+ "\n",
+ "# define the strategy and pass in the device name\n",
+ "one_strategy = tf.distribute.OneDeviceStrategy(device=gpu_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Parameters\n",
+ "\n",
+ "We'll define a few global variables for setting up the model and dataset."
  ]
  },
  {
@@ -66,9 +95,20 @@
  "pixels = 224\n",
  "MODULE_HANDLE = 'https://tfhub.dev/tensorflow/resnet_50/feature_vector/1'\n",
  "IMAGE_SIZE = (pixels, pixels)\n",
+ "BATCH_SIZE = 32\n",
+ "\n",
  "print(\"Using {} with input size {}\".format(MODULE_HANDLE, IMAGE_SIZE))"
  ]
  },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Download and Prepare the Dataset\n",
+ "\n",
+ "We will use the [Cats vs Dogs](https://www.tensorflow.org/datasets/catalog/cats_vs_dogs) dataset and we will fetch it via TFDS."
+ ]
+ },
  {
  "cell_type": "code",
  "execution_count": null,
@@ -80,12 +120,11 @@
  "outputs": [],
  "source": [
  "splits = ['train[:80%]', 'train[80%:90%]', 'train[90%:]']\n",
- "#data, info = tfds.load('cats_vs_dogs', with_info=True, as_supervised=True, split=splits)\n",
  "\n",
  "(train_examples, validation_examples, test_examples), info = tfds.load('cats_vs_dogs', with_info=True, as_supervised=True, split=splits)\n",
  "\n",
  "num_examples = info.splits['train'].num_examples\n",
- "num_classes = info.features['label'].num_classes\n"
+ "num_classes = info.features['label'].num_classes"
  ]
  },
  {
@@ -98,9 +137,10 @@
  },
  "outputs": [],
  "source": [
+ "# resize the image and normalize pixel values\n",
  "def format_image(image, label):\n",
- " image = tf.image.resize(image, IMAGE_SIZE) / 255.0\n",
- " return image, label"
+ " image = tf.image.resize(image, IMAGE_SIZE) / 255.0\n",
+ " return image, label"
  ]
  },
  {
@@ -113,7 +153,7 @@
  },
  "outputs": [],
  "source": [
- "BATCH_SIZE = 32\n",
+ "# prepare batches\n",
  "train_batches = train_examples.shuffle(num_examples // 4).map(format_image).batch(BATCH_SIZE).prefetch(1)\n",
  "validation_batches = validation_examples.map(format_image).batch(BATCH_SIZE).prefetch(1)\n",
  "test_batches = test_examples.map(format_image).batch(1)"
@@ -129,10 +169,20 @@
  },
  "outputs": [],
  "source": [
+ "# check if the batches have the correct size and the images have the correct shape\n",
  "for image_batch, label_batch in train_batches.take(1):\n",
- " pass\n",
+ " pass\n",
  "\n",
- "image_batch.shape"
+ "print(image_batch.shape)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define and Configure the Model\n",
+ "\n",
+ "As with other strategies, setting up the model requires minimal code changes. Let's first define a utility function to build and compile the model."
  ]
  },
  {
@@ -145,7 +195,8 @@
  },
  "outputs": [],
  "source": [
- "do_fine_tuning = False #@param {type:\"boolean\"}"
+ "# tells if we want to freeze the layer weights of our feature extractor during training\n",
+ "do_fine_tuning = False"
  ]
  },
  {
@@ -159,22 +210,37 @@
  "outputs": [],
  "source": [
  "def build_and_compile_model():\n",
- " print(\"Building model with\", MODULE_HANDLE)\n",
- " feature_extractor = hub.KerasLayer(MODULE_HANDLE,\n",
+ " print(\"Building model with\", MODULE_HANDLE)\n",
+ "\n",
+ " # configures the feature extractor fetched from TF Hub\n",
+ " feature_extractor = hub.KerasLayer(MODULE_HANDLE,\n",
  " input_shape=IMAGE_SIZE + (3,), \n",
  " trainable=do_fine_tuning)\n",
- " model = tf.keras.Sequential([\n",
+ "\n",
+ " # define the model\n",
+ " model = tf.keras.Sequential([\n",
  " feature_extractor,\n",
+ " # append a dense with softmax for the number of classes\n",
  " tf.keras.layers.Dense(num_classes, activation='softmax')\n",
- " ])\n",
- " model.summary()\n",
+ " ])\n",
  "\n",
- " optimizer = tf.keras.optimizers.SGD(lr=0.002, momentum=0.9) if do_fine_tuning else 'adam'\n",
- " model.compile(optimizer=optimizer,\n",
+ " # display summary\n",
+ " model.summary()\n",
+ "\n",
+ " # configure the optimizer, loss and metrics\n",
+ " optimizer = tf.keras.optimizers.SGD(lr=0.002, momentum=0.9) if do_fine_tuning else 'adam'\n",
+ " model.compile(optimizer=optimizer,\n",
  " loss='sparse_categorical_crossentropy',\n",
  " metrics=['accuracy'])\n",
- " \n",
- " return model"
+ "\n",
+ " return model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "You can now call the function under the strategy scope. This places variables and computations on the device you specified earlier."
  ]
  },
  {
@@ -187,9 +253,16 @@
  },
  "outputs": [],
  "source": [
- "one_strategy = tf.distribute.OneDeviceStrategy(device=cpu_name)\n",
+ "# build and compile under the strategy scope\n",
  "with one_strategy.scope():\n",
- " model = build_and_compile_model()"
+ " model = build_and_compile_model()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`model.fit()` can be run as usual."
  ]
  },
  {
@@ -209,22 +282,18 @@
  ]
  },
  {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "P6zyUR7-4fm3"
- },
- "outputs": [],
- "source": []
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Once everything is working correctly, you can switch to a different device or a different strategy that distributes to multiple devices."
+ ]
  }
  ],
  "metadata": {
  "accelerator": "GPU",
  "colab": {
  "collapsed_sections": [],
- "name": "OneDeviceStrategyExerciseAnswer.ipynb",
+ "name": "C2W4_Lab_4_one-device-strategy.ipynb",
  "provenance": []
  },
  "kernelspec": {
