doc/LectureNotes/E2.ipynb
Lines changed: 17 additions & 35 deletions
@@ -224,12 +224,16 @@
224
224
"source": [
225
225
"With the expression for $\\boldsymbol{\\hat{\\beta}_{OLS}}$, you now have what you need to implement OLS regression with your input data and target data $\\boldsymbol{y}$. But before you can do that, you need to set up you input data as a feature matrix $\\boldsymbol{X}$.\n",
226
226
"\n",
227
-
"In a feature matrix, each row is a datapoint and each column is a feature of that data. If you want to predict someones spending based on their income and number of children, for instance, you would create a row for each person in your dataset, and in each column put a 1 for the intercept, the montly income and the number of children."
227
+
"In a feature matrix, each row is a datapoint and each column is a feature of that data. If you want to predict someones spending based on their income and number of children, for instance, you would create a row for each person in your dataset, with the montly income and the number of children as columns.\n",
228
+
"\n",
229
+
"We typically also include an intercept in our models. The intercept is a value that is added to our prediction regardless of the value of the other features. The intercept tries to account for constant effects in our data that are not dependant on anything else. In our current example, the intercept could account for living expenses which are typical regardless of income or childcare expenses.\n",
230
+
"\n",
231
+
"We calculate the optimal intercept by including a feature with the constant value of 1 in our model, which is then multplied by some parameter $\\beta_0$ from the OLS method into the optimal intercept value (which will be $\\beta_0$). In practice, we include the intercept in our model by adding a column of ones to the start of our feature matrix."
doc/LectureNotes/_build/html/E1.html
Lines changed: 4 additions & 2 deletions
@@ -446,7 +446,7 @@ <h2>Exercise 2 - Setting up a Github repository<a class="headerlink" href="#exer
446
446
<h2>Exercise 3 - Setting up a Python virtual environment<a class="headerlink" href="#exercise-3-setting-up-a-python-virtual-environment" title="Link to this heading">#</a></h2>
447
447
<p>Following the theme of the previous exercises, another way of improving the reproducibility of your results and shareability of your code is having a good handle on which python packages you are using.</p>
448
448
<p>There are many ways to manage your packages in Python, and you are free to use any approach you want, but in this course we encourage you to use something called a virtual environment. A virtual environment is a folder in your project which contains a Python runtime executable as well as all the packages you are using in the current project. In this way, each of your projects has its required set of packages installed in the same folder, so that if anything goes wrong while managing your packages it only affects the one project, and if multiple projects require different versions of the same package, you don’t need to worry about messing up old projects. Also, it’s easy to just delete the folder and start over if anything goes wrong.</p>
449
-
<p>Virtual environments are typically created, activated, managed and updated using terminal commands, but for now we recommend that you let VS Code handle it for you to make the coding experience much easier.</p>
449
+
<p>Virtual environments are typically created, activated, managed and updated using terminal commands, but for now we recommend that you let VS Code handle it for you to make the coding experience much easier. If you are familiar with another approach for virtual environments that works for you, feel free to keep doing it that way.</p>
450
450
<p><strong>a)</strong> Open this notebook in VS Code (<a class="reference external" href="https://code.visualstudio.com/Download">https://code.visualstudio.com/Download</a>). Download the Python and Jupyter extensions.</p>
451
451
<p><strong>b)</strong> Press ´Cmd + Shift + P´, then search and run ´Python: Create Environment…´</p>
452
452
<p><strong>c)</strong> Select ´Venv´</p>
@@ -483,10 +483,12 @@ <h2>Exercise 3 - Fitting an OLS model to data<a class="headerlink" href="#exerci
<h2>Exercise 3 - Creating feature matrix and implementing OLS using the analytical expression<a class="headerlink" href="#exercise-3-creating-feature-matrix-and-implementing-ols-using-the-analytical-expression" title="Link to this heading">#</a></h2>
499
499
<p>With the expression for <span class="math notranslate nohighlight">\(\boldsymbol{\hat{\beta}_{OLS}}\)</span>, you now have what you need to implement OLS regression with your input data and target data <span class="math notranslate nohighlight">\(\boldsymbol{y}\)</span>. But before you can do that, you need to set up your input data as a feature matrix <span class="math notranslate nohighlight">\(\boldsymbol{X}\)</span>.</p>
500
-
<p>In a feature matrix, each row is a datapoint and each column is a feature of that data. If you want to predict someone's spending based on their income and number of children, for instance, you would create a row for each person in your dataset, and in each column put a 1 for the intercept, the monthly income and the number of children.</p>
500
+
<p>In a feature matrix, each row is a datapoint and each column is a feature of that data. If you want to predict someone's spending based on their income and number of children, for instance, you would create a row for each person in your dataset, with the monthly income and the number of children as columns.</p>
501
+
<p>We typically also include an intercept in our models. The intercept is a value that is added to our prediction regardless of the value of the other features. The intercept tries to account for constant effects in our data that are not dependent on anything else. In our current example, the intercept could account for living expenses that are typical regardless of income or childcare expenses.</p>
502
+
<p>We calculate the optimal intercept by including a feature with the constant value of 1 in our model. The OLS method then finds a parameter <span class="math notranslate nohighlight">\(\beta_0\)</span> that multiplies this feature, and since the feature is always 1, the optimal intercept value is <span class="math notranslate nohighlight">\(\beta_0\)</span> itself. In practice, we include the intercept in our model by adding a column of ones to the start of our feature matrix.</p>
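A minimal sketch of the feature matrix described in this hunk, assuming NumPy and a small invented dataset. The variable names (`income`, `children`, `true_beta`) and all numbers are hypothetical and do not come from the lecture notes. The targets are generated without noise from known parameters, so the standard analytical OLS expression should recover those parameters, with the first one being the intercept.

```python
import numpy as np

# Hypothetical toy data, invented for illustration only
income = np.array([3.0, 4.5, 5.0, 6.5, 8.0])    # monthly income (arbitrary units)
children = np.array([0.0, 2.0, 1.0, 3.0, 1.0])  # number of children

# Feature matrix: one row per person, with a column of ones placed first
# so that the first parameter acts as the intercept
X = np.column_stack([np.ones_like(income), income, children])

# Made-up "true" parameters: intercept, income coefficient, children coefficient
true_beta = np.array([1.0, 0.5, 0.3])

# Noise-free targets, so that OLS should recover true_beta
y = X @ true_beta

# Analytical OLS expression: beta = (X^T X)^{-1} X^T y
beta_ols = np.linalg.inv(X.T @ X) @ X.T @ y

print("Feature matrix shape:", X.shape)
print("Estimated parameters:", beta_ols)
```

Explicitly inverting `X.T @ X` mirrors the analytical expression. For poorly conditioned feature matrices, `np.linalg.pinv` or `np.linalg.lstsq` is usually the more robust choice.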
<p><strong>e)</strong> Do the same for each polynomial degree from 2 to 10, and plot the MSE on both the training and test data as a function of polynomial degree. The aim is to reproduce Figure 2.11 of <a class="reference external" href="https://github.com/CompPhysics/MLErasmus/blob/master/doc/Textbooks/elementsstat.pdf">Hastie et al</a>. Feel free to read the discussions leading to figure 2.11 of Hastie et al.</p>