Commit 19bf983

C2 W4 Lab 4: added markdown and code comments
1 parent 14ad137 commit 19bf983

1 file changed: +106 −37 lines changed

Course 2 - Custom Training loops, Gradients and Distributed Training/Week 4 - Distribution Strategy/C2_W4_Lab_4_one-device-strategy.ipynb

+106 −37
@@ -13,7 +13,7 @@
  "source": [
  "# One Device Strategy \n",
  "\n",
- "In this ungraded lab you'll learn to set up a One Device Strategy"
+ "In this ungraded lab, you'll learn how to set up a [One Device Strategy](https://www.tensorflow.org/api_docs/python/tf/distribute/OneDeviceStrategy). This is typically used to deliberately test your code on a single device. This can be used before switching to a different strategy that distributes across multiple devices. Please click on the **Open in Colab** badge above so you can download the datasets and use a GPU-enabled lab environment."
  ]
  },
  {
@@ -42,15 +42,44 @@
  "import tensorflow as tf\n",
  "import tensorflow_hub as hub\n",
  "import tensorflow_datasets as tfds\n",
- "tfds.disable_progress_bar()\n",
- "devices = tf.config.list_physical_devices('CPU')\n",
- "cpu_name = devices[0].name\n",
  "\n",
- "devices = tf.config.list_physical_devices('CPU')\n",
+ "tfds.disable_progress_bar()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define the Distribution Strategy\n",
+ "\n",
+ "You can list available devices in your machine and specify a device type. This allows you to verify the device name to pass in `tf.distribute.OneDeviceStrategy()`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# choose a device type such as CPU or GPU\n",
+ "devices = tf.config.list_physical_devices('GPU')\n",
  "print(devices[0])\n",
- "# You'll see that the name will look something like \"/physical_device:CPU:0\"\n",
- "# Just take the CPU:0 part and use that as the name\n",
- "cpu_name = \"CPU:0\""
+ "\n",
+ "# You'll see that the name will look something like \"/physical_device:GPU:0\"\n",
+ "# Just take the GPU:0 part and use that as the name\n",
+ "gpu_name = \"GPU:0\"\n",
+ "\n",
+ "# define the strategy and pass in the device name\n",
+ "one_strategy = tf.distribute.OneDeviceStrategy(device=gpu_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Parameters\n",
+ "\n",
+ "We'll define a few global variables for setting up the model and dataset."
  ]
  },
  {
@@ -66,9 +95,20 @@
  "pixels = 224\n",
  "MODULE_HANDLE = 'https://tfhub.dev/tensorflow/resnet_50/feature_vector/1'\n",
  "IMAGE_SIZE = (pixels, pixels)\n",
+ "BATCH_SIZE = 32\n",
+ "\n",
  "print(\"Using {} with input size {}\".format(MODULE_HANDLE, IMAGE_SIZE))"
  ]
  },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Download and Prepare the Dataset\n",
+ "\n",
+ "We will use the [Cats vs Dogs](https://www.tensorflow.org/datasets/catalog/cats_vs_dogs) dataset and we will fetch it via TFDS."
+ ]
+ },
  {
  "cell_type": "code",
  "execution_count": null,
@@ -80,12 +120,11 @@
  "outputs": [],
  "source": [
  "splits = ['train[:80%]', 'train[80%:90%]', 'train[90%:]']\n",
- "#data, info = tfds.load('cats_vs_dogs', with_info=True, as_supervised=True, split=splits)\n",
  "\n",
  "(train_examples, validation_examples, test_examples), info = tfds.load('cats_vs_dogs', with_info=True, as_supervised=True, split=splits)\n",
  "\n",
  "num_examples = info.splits['train'].num_examples\n",
- "num_classes = info.features['label'].num_classes\n"
+ "num_classes = info.features['label'].num_classes"
  ]
  },
  {
@@ -98,9 +137,10 @@
  },
  "outputs": [],
  "source": [
+ "# resize the image and normalize pixel values\n",
  "def format_image(image, label):\n",
- " image = tf.image.resize(image, IMAGE_SIZE) / 255.0\n",
- " return image, label"
+ " image = tf.image.resize(image, IMAGE_SIZE) / 255.0\n",
+ " return image, label"
  ]
  },
  {
@@ -113,7 +153,7 @@
  },
  "outputs": [],
  "source": [
- "BATCH_SIZE = 32\n",
+ "# prepare batches\n",
  "train_batches = train_examples.shuffle(num_examples // 4).map(format_image).batch(BATCH_SIZE).prefetch(1)\n",
  "validation_batches = validation_examples.map(format_image).batch(BATCH_SIZE).prefetch(1)\n",
  "test_batches = test_examples.map(format_image).batch(1)"
@@ -129,10 +169,20 @@
  },
  "outputs": [],
  "source": [
+ "# check if the batches have the correct size and the images have the correct shape\n",
  "for image_batch, label_batch in train_batches.take(1):\n",
- " pass\n",
+ " pass\n",
  "\n",
- "image_batch.shape"
+ "print(image_batch.shape)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define and Configure the Model\n",
+ "\n",
+ "As with other strategies, setting up the model requires minimal code changes. Let's first define a utility function to build and compile the model."
  ]
  },
  {
@@ -145,7 +195,8 @@
  },
  "outputs": [],
  "source": [
- "do_fine_tuning = False #@param {type:\"boolean\"}"
+ "# tells if we want to freeze the layer weights of our feature extractor during training\n",
+ "do_fine_tuning = False"
  ]
  },
  {
@@ -159,22 +210,37 @@
  "outputs": [],
  "source": [
  "def build_and_compile_model():\n",
- " print(\"Building model with\", MODULE_HANDLE)\n",
- " feature_extractor = hub.KerasLayer(MODULE_HANDLE,\n",
+ " print(\"Building model with\", MODULE_HANDLE)\n",
+ "\n",
+ " # configures the feature extractor fetched from TF Hub\n",
+ " feature_extractor = hub.KerasLayer(MODULE_HANDLE,\n",
  " input_shape=IMAGE_SIZE + (3,), \n",
  " trainable=do_fine_tuning)\n",
- " model = tf.keras.Sequential([\n",
+ "\n",
+ " # define the model\n",
+ " model = tf.keras.Sequential([\n",
  " feature_extractor,\n",
+ " # append a dense with softmax for the number of classes\n",
  " tf.keras.layers.Dense(num_classes, activation='softmax')\n",
- " ])\n",
- " model.summary()\n",
+ " ])\n",
  "\n",
- " optimizer = tf.keras.optimizers.SGD(lr=0.002, momentum=0.9) if do_fine_tuning else 'adam'\n",
- " model.compile(optimizer=optimizer,\n",
+ " # display summary\n",
+ " model.summary()\n",
+ "\n",
+ " # configure the optimizer, loss and metrics\n",
+ " optimizer = tf.keras.optimizers.SGD(lr=0.002, momentum=0.9) if do_fine_tuning else 'adam'\n",
+ " model.compile(optimizer=optimizer,\n",
  " loss='sparse_categorical_crossentropy',\n",
  " metrics=['accuracy'])\n",
- " \n",
- " return model"
+ "\n",
+ " return model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "You can now call the function under the strategy scope. This places variables and computations on the device you specified earlier."
  ]
  },
  {
@@ -187,9 +253,16 @@
  },
  "outputs": [],
  "source": [
- "one_strategy = tf.distribute.OneDeviceStrategy(device=cpu_name)\n",
+ "# build and compile under the strategy scope\n",
  "with one_strategy.scope():\n",
- " model = build_and_compile_model()"
+ " model = build_and_compile_model()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`model.fit()` can be run as usual."
  ]
  },
  {
@@ -209,22 +282,18 @@
  ]
  },
  {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "P6zyUR7-4fm3"
- },
- "outputs": [],
- "source": []
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Once everything is working correctly, you can switch to a different device or a different strategy that distributes to multiple devices."
+ ]
  }
  ],
  "metadata": {
  "accelerator": "GPU",
  "colab": {
  "collapsed_sections": [],
- "name": "OneDeviceStrategyExerciseAnswer.ipynb",
+ "name": "C2W4_Lab_4_one-device-strategy.ipynb",
  "provenance": []
  },
  "kernelspec": {
