episodes/5-transfer-learning.md (114 additions, 0 deletions)

@@ -17,7 +17,7 @@

An example: let's say that you want to train a model to classify images of different dog breeds. You could make use of a pre-trained network that learned how to classify images of dogs and cats. The pre-trained network will not know anything about different dog breeds, but it will have captured some general knowledge of what dogs look like at a high level, and, at a low level, all the different features (eyes, ears, paws, fur) that make up an image of a dog. Further training this model on your dog breed dataset is a much easier task than training from scratch, because the model can reuse the general knowledge captured in the pre-trained network.

![](episodes/fig/05-transfer_learning.png){alt="Diagram of transfer learning: a network pre-trained on one task is reused and further trained on a new, related task."}
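To make this concrete, here is a minimal Keras sketch of the idea, using the DenseNet121 base that appears later in this episode; the 224×224 input size is an assumption for illustration, not taken from the lesson:

```python
from tensorflow import keras

# Load a base network pre-trained on ImageNet, without its original
# classification head (include_top=False). The input shape is assumed
# here; match it to your own images.
base_model = keras.applications.DenseNet121(
    weights='imagenet',
    include_top=False,
    input_shape=(224, 224, 3),
)

# Freeze the base so its general-purpose features stay fixed while a
# new, task-specific head is trained on top.
base_model.trainable = False
```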

<!--
Edit this plot using the Mermaid live editor:
1. Open this link that includes the source code of the chart to open the live editor web interface:
@@ -258,6 +258,120 @@
::::
:::

::: challenge
## Fine-Tune the Top Block of the Pretrained Model

So far, we've trained only the custom head while keeping the DenseNet121 base frozen. Let's now **unfreeze just the top layer group** of the base model and observe how performance changes.

### 1. Unfreeze top layers
Unfreeze the final convolutional block of the base model with the following loop, which turns training on from the first layer whose name contains `conv5` onward:

```python
# 1. Unfreeze the top block of the base model:
# everything from the first 'conv5' layer onward becomes trainable
set_trainable = False
for layer in base_model.layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable
```
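Before recompiling, it can help to verify what the loop actually unfroze. A quick optional check (assuming at least one layer name contains `conv5`):

```python
# Sanity check: list the layers that are now trainable.
trainable_names = [layer.name for layer in base_model.layers if layer.trainable]
print(f"{len(trainable_names)} of {len(base_model.layers)} layers unfrozen, "
      f"starting at '{trainable_names[0]}'")
```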

### 2. Recompile the model
Any time you change layer trainability, you **must recompile** the model.

Use the same optimizer and loss function as before:
- `optimizer='adam'`
- `loss=SparseCategoricalCrossentropy(from_logits=True)`
- `metrics=['accuracy']`

### 3. Retrain the model
Retrain the model using the same setup as before:

- `batch_size=32`
- `epochs=30`
- Early stopping with `patience=5`
- Pass in the validation set using `validation_data`
- Store the result in a new variable called `history_finetune`

> You can reuse your `early_stopper` callback or redefine it.

### 4. Compare with baseline (head only)
Plot the **validation accuracy** for both the baseline and fine-tuned models.

**Questions to reflect on:**
- Did unfreezing part of the base model improve validation accuracy?
- Did training time increase significantly?
- Is there any evidence of overfitting?

:::: solution
## Solution
```python
# 1. Unfreeze top block of base model
# (trainable from the first 'conv5' layer onward, as in the challenge code)
set_trainable = False
for layer in base_model.layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable

# 2. Recompile the model (required after changing layer trainability)
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# 3. Retrain the model with the same setup as before
early_stopper = keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=5)
history_finetune = model.fit(train_images, train_labels,
                             batch_size=32,
                             epochs=30,
                             validation_data=(val_images, val_labels),
                             callbacks=[early_stopper])

# 4. Plot validation accuracy of both runs for comparison
import pandas as pd
import matplotlib.pyplot as plt

def plot_two_histories(h1, h2, label1='Frozen', label2='Finetuned'):
    """Plot the validation-accuracy curves of two Keras training histories."""
    df1 = pd.DataFrame(h1.history)
    df2 = pd.DataFrame(h2.history)
    plt.plot(df1['val_accuracy'], label=label1)
    plt.plot(df2['val_accuracy'], label=label2)
    plt.xlabel("Epochs")
    plt.ylabel("Validation Accuracy")
    plt.legend()
    plt.title("Validation Accuracy: Frozen vs. Finetuned")
    plt.show()

plot_two_histories(history, history_finetune)
```

![](episodes/fig/05-frozen_vs_finetuned.png){alt="A comparison of the accuracy on the validation set for both the frozen and the fine-tuned setup."}
> **Collaborator:** It would be good to have alt image descriptions everywhere, e.g.
>
> Suggested change:
> ```diff
> -![](episodes/fig/05-frozen_vs_finetuned.png)
> +![](episodes/fig/05-frozen_vs_finetuned.png){alt="A comparison of the accuracy on the validation set for both the frozen and the fine-tuned setup."}
> ```


**Discussion of results**: Validation accuracy improved across all epochs compared to the frozen baseline. Training time also increased slightly, but the model was able to adapt better to the new dataset by fine-tuning the top convolutional block.

This makes sense: by unfreezing the last part of the base model, you allow it to adapt its high-level features to the new domain, while keeping the earlier, general-purpose feature detectors intact.
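One way to see this division of labour is to compare how many parameters are now trainable versus frozen. A small sketch (using `numpy`, and assuming the recompiled `model` from the solution above):

```python
import numpy as np

# Count parameters that will be updated during fine-tuning vs. those
# that stay fixed at their pre-trained values.
trainable = sum(int(np.prod(w.shape)) for w in model.trainable_weights)
frozen = sum(int(np.prod(w.shape)) for w in model.non_trainable_weights)
print(f"Trainable parameters: {trainable:,} | Frozen parameters: {frozen:,}")
```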


**What happens if you unfreeze too many layers?**
If you unfreeze most or all of the base model:

- Training time increases significantly because more weights are being updated.
- The model may forget some of the general-purpose features it learned during pretraining. This is called "catastrophic forgetting."
- Overfitting becomes more likely, especially if your dataset is small or noisy.
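A common way to limit these risks, standard practice rather than something from this episode's setup, is to recompile with a much smaller learning rate before fine-tuning, so the unfrozen weights only change gradually:

```python
# A minimal sketch: recompile with a reduced learning rate for fine-tuning.
# The value 1e-5 is a typical starting point, not a number from this lesson.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
```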


### When does this approach work best?

Fine-tuning a few top layers is a good middle ground. You're adapting the model without retraining everything from scratch. If your dataset is small or very different from the original ImageNet data, you should be careful not to unfreeze too many layers.

For most use cases:
- Freeze most layers
- Unfreeze the top block or two
- Avoid full fine-tuning unless you have lots of data and compute

::::
:::

## Concluding: The power of transfer learning
In many domains, large networks are available that have been trained on vast amounts of data, such as in computer vision and natural language processing. Using transfer learning, you can benefit from the knowledge that was captured from another machine learning task. In many fields, transfer learning will outperform models trained from scratch, especially if your dataset is small or of poor quality.

Binary file added episodes/fig/05-frozen_vs_finetuned.png
> **Collaborator:** The image does not look very convincing, both curves end at a similar value for validation accuracy (~0.6). This seems to require some explanation.