updated exercises week37

mhjensen · mhjensen · commit f202dd1b7ccd · 2025-09-07T14:54:43.000+02:00
diff --git a/doc/LectureNotes/exercisesweek37.ipynb b/doc/LectureNotes/exercisesweek37.ipynb
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "1b941c35",
+   "id": "8e6632a0",
    "metadata": {
     "editable": true
    },
@@ -14,7 +14,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "dc05b096",
+   "id": "82705c4f",
    "metadata": {
     "editable": true
    },
@@ -27,7 +27,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "2cf07405",
+   "id": "921bf331",
    "metadata": {
     "editable": true
    },
@@ -46,7 +46,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "3c139edb",
+   "id": "adff65d5",
    "metadata": {
     "editable": true
    },
@@ -58,7 +58,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "aad4cfac",
+   "id": "70418b3d",
    "metadata": {
     "editable": true
    },
@@ -70,7 +70,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "6682282f",
+   "id": "11a3cf73",
    "metadata": {
     "editable": true
    },
@@ -83,7 +83,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "89e2f4c4",
+   "id": "04a06b51",
    "metadata": {
     "editable": true
    },
@@ -99,7 +99,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "b06d4e53",
+   "id": "408db3d9",
    "metadata": {
     "editable": true
    },
@@ -120,7 +120,7 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "id": "63796480",
+   "id": "37fb732c",
    "metadata": {
     "collapsed": false,
     "editable": true
@@ -140,12 +140,12 @@
   },
   {
    "cell_type": "markdown",
-   "id": "80748600",
+   "id": "d861e1e3",
    "metadata": {
     "editable": true
    },
    "source": [
-    "Fill in the necessary details.\n",
+    "Fill in the necessary details. Do we need to center the $y$-values? \n",
     "\n",
     "After this preprocessing, each column of $\\boldsymbol{X}_{\\mathrm{norm}}$ has mean zero and standard deviation $1$\n",
     "and $\\boldsymbol{y}_{\\mathrm{centered}}$ has mean 0. This makes the optimization landscape\n",
@@ -156,7 +156,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "92751e5f",
+   "id": "b3e774d0",
    "metadata": {
     "editable": true
    },
@@ -168,7 +168,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "aedfbd7a",
+   "id": "d5dc7708",
    "metadata": {
     "editable": true
    },
@@ -179,15 +179,17 @@
   {
    "cell_type": "code",
    "execution_count": 2,
-   "id": "5d1288fa",
+   "id": "4c9c86ac",
    "metadata": {
     "collapsed": false,
     "editable": true
    },
    "outputs": [],
    "source": [
     "# Set regularization parameter, either a single value or a vector of values\n",
-    "lambda = ?\n",
+    "# Note: lambda is a reserved Python keyword (it defines small anonymous functions), so it cannot be used as a variable name. We use lam instead.\n",
+    "lam = ?\n",
+    "\n",
     "\n",
     "# Analytical form for OLS and Ridge solution: theta_Ridge = (X^T X + lambda * I)^{-1} X^T y and theta_OLS = (X^T X)^{-1} X^T y\n",
     "I = np.eye(n_features)\n",
@@ -200,7 +202,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "628f5e89",
+   "id": "eeae00fd",
    "metadata": {
     "editable": true
    },
@@ -214,7 +216,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "f115ba4e",
+   "id": "e1c215d5",
    "metadata": {
     "editable": true
    },
@@ -226,7 +228,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a9b5189c",
+   "id": "587dd3dc",
    "metadata": {
     "editable": true
    },
@@ -238,7 +240,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a3969ff6",
+   "id": "bfa34697",
    "metadata": {
     "editable": true
    },
@@ -258,7 +260,7 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "id": "34d87303",
+   "id": "49245f55",
    "metadata": {
     "collapsed": false,
     "editable": true
@@ -273,19 +275,8 @@
     "# Initialize weights for gradient descent\n",
     "theta = np.zeros(n_features)\n",
     "\n",
-    "# Arrays to store history for plotting\n",
-    "cost_history = np.zeros(num_iters)\n",
-    "\n",
     "# Gradient descent loop\n",
-    "m = n_samples  # number of data points\n",
     "for t in range(num_iters):\n",
-    "    # Compute prediction error\n",
-    "    error = X_norm.dot(theta) - y_centered \n",
-    "    # Compute cost for OLS and Ridge (MSE + regularization for Ridge) for monitoring\n",
-    "    cost_OLS = ?\n",
-    "    cost_Ridge = ?\n",
-    "    # You could add a history for both methods (optional)\n",
-    "    cost_history[t] = ?\n",
     "    # Compute gradients for OSL and Ridge\n",
     "    grad_OLS = ?\n",
     "    grad_Ridge = ?\n",
@@ -302,31 +293,33 @@
   },
   {
    "cell_type": "markdown",
-   "id": "989f70bb",
+   "id": "f3f43f2c",
    "metadata": {
     "editable": true
    },
    "source": [
     "### 4a)\n",
     "\n",
-    "Discuss the results as function of the learning rate parameters and the number of iterations."
+    "First write gradient descent code for OLS only, using the above template.\n",
+    "Discuss the results as a function of the learning rate and the number of iterations."
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "370b2dad",
+   "id": "9ba303be",
    "metadata": {
     "editable": true
    },
    "source": [
     "### 4b)\n",
     "\n",
+    "Then write similar code for Ridge regression, using the above template.\n",
     "Try to add a stopping parameter as function of the number iterations and the difference between the new and old $\\theta$ values. How would you define a stopping criterion?"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "ef197cd7",
+   "id": "78362c6c",
    "metadata": {
     "editable": true
    },
@@ -352,7 +345,7 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "id": "4ccc2f65",
+   "id": "8be1cebe",
    "metadata": {
     "collapsed": false,
     "editable": true
@@ -381,7 +374,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "00e279ef",
+   "id": "e2693666",
    "metadata": {
     "editable": true
    },
@@ -395,7 +388,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "c910b3f4",
+   "id": "bc954d12",
    "metadata": {
     "editable": true
    },
@@ -407,7 +400,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "89e6e040",
+   "id": "6534b610",
    "metadata": {
     "editable": true
    },
diff --git a/doc/src/week37/exercisesweek37.do.txt b/doc/src/week37/exercisesweek37.do.txt
@@ -59,7 +59,7 @@ y_mean = ?
 y_centered = ?
 !ec
 
-Fill in the necessary details.
+Fill in the necessary details. Do we need to center the $y$-values? 
 
 After this preprocessing, each column of $\bm{X}_{\mathrm{norm}}$ has mean zero and standard deviation $1$
 and $\bm{y}_{\mathrm{centered}}$ has mean 0. This makes the optimization landscape
@@ -76,7 +76,9 @@ Find the gradients for OLS and Ridge regression using the mean-squared error as
 
 !bc pycod
 # Set regularization parameter, either a single value or a vector of values
-lambda = ?
+# Note: lambda is a reserved Python keyword (it defines small anonymous functions), so it cannot be used as a variable name. We use lam instead.
+lam = ?
+
 
 # Analytical form for OLS and Ridge solution: theta_Ridge = (X^T X + lambda * I)^{-1} X^T y and theta_OLS = (X^T X)^{-1} X^T y
 I = np.eye(n_features)
@@ -120,19 +122,8 @@ num_iters = 1000
 # Initialize weights for gradient descent
 theta = np.zeros(n_features)
 
-# Arrays to store history for plotting
-cost_history = np.zeros(num_iters)
-
 # Gradient descent loop
-m = n_samples  # number of data points
 for t in range(num_iters):
-    # Compute prediction error
-    error = X_norm.dot(theta) - y_centered 
-    # Compute cost for OLS and Ridge (MSE + regularization for Ridge) for monitoring
-    cost_OLS = ?
-    cost_Ridge = ?
-    # You could add a history for both methods (optional)
-    cost_history[t] = ?
     # Compute gradients for OSL and Ridge
     grad_OLS = ?
     grad_Ridge = ?
@@ -148,9 +139,11 @@ print("Gradient Descent Ridge coefficients:", theta_gdRidge)
 !ec
 
 === 4a) ===
-Discuss the results as function of the learning rate parameters and the number of iterations.
+First write gradient descent code for OLS only, using the above template.
+Discuss the results as a function of the learning rate and the number of iterations.
 
 === 4b) ===
+Then write similar code for Ridge regression, using the above template.
 Try to add a stopping parameter as function of the number iterations and the difference between the new and old $\theta$ values. How would you define a stopping criterion? 
 
 
diff --git a/doc/src/week37/week37.do.txt b/doc/src/week37/week37.do.txt
@@ -1556,9 +1556,8 @@ print(theta)
 =====  Material for the lab sessions  =====
 
 
-
-!bblock  Material for the lab  sessions on Tuesday and Wednesday
-o Exercise set for week 37
+!bblock  
+o Exercise set for week 37 and reminder on scaling (from lab sessions of week 35)
 o Work on project 1
 #  * "Video of exercise sessions week 37":"https://youtu.be/bK4AEcTu-oM"
   * For more discussions of Ridge regression and calculation of averages, "Wessel van Wieringen's":"https://arxiv.org/abs/1509.09169" article is highly recommended.