
Commit d65046c

committed: update
1 parent b68f2a9 commit d65046c

23 files changed, +850 -856 lines changed

doc/pub/week35/html/._week35-bs002.html

Lines changed: 3 additions & 4 deletions
@@ -323,13 +323,12 @@
 <!-- !split -->
 <h2 id="reminder-from-last-week" class="anchor">Reminder from last week </h2>
 
-<p>We need first a reminder from last week about linear regression. </p>
+<p>We need first a reminder from last week about linear regression. We are going to fit a continuous function with linear parameterization in terms of the parameters \( \boldsymbol{\theta} \) and our first encounter is ordinary least squares.</p>
 
-<p>Fitting a continuous function with linear parameterization in terms of the parameters \( \boldsymbol{\beta} \).</p>
 <ul>
-<li> Method of choice for fitting a continuous function!</li>
+<li> It is the method of choice for fitting a continuous function</li>
 <li> Gives an excellent introduction to central Machine Learning features with <b>understandable pedagogical</b> links to other methods like <b>Neural Networks</b>, <b>Support Vector Machines</b> etc</li>
-<li> Analytical expression for the fitting parameters \( \boldsymbol{\beta} \)</li>
+<li> Analytical expression for the fitting parameters \( \boldsymbol{\theta} \)</li>
 <li> Analytical expressions for statistical propertiers like mean values, variances, confidence intervals and more</li>
 <li> Analytical relation with probabilistic interpretations</li>
 <li> Easy to introduce basic concepts like bias-variance tradeoff, cross-validation, resampling and regularization techniques and many other ML topics</li>

doc/pub/week35/html/._week35-bs003.html

Lines changed: 4 additions & 4 deletions
@@ -346,19 +346,19 @@ <h2 id="the-equations-for-ordinary-least-squares" class="anchor">The equations f
 <p>In linear regression we approximate the unknown function with another
 continuous function \( \tilde{\boldsymbol{y}}(\boldsymbol{x}) \) which depends linearly on
 some unknown parameters
-\( \boldsymbol{\beta}^T=[\beta_0,\beta_1,\beta_2,\dots,\beta_{p-1}] \).
+\( \boldsymbol{\theta}^T=[\theta_0,\theta_1,\theta_2,\dots,\theta_{p-1}] \).
 </p>
 
 <p>Last week we introduced the so-called design matrix in order to define
 the approximation \( \boldsymbol{\tilde{y}} \) via the unknown quantity
-\( \boldsymbol{\beta} \) as
+\( \boldsymbol{\theta} \) as
 </p>
 
 $$
-\boldsymbol{\tilde{y}}= \boldsymbol{X}\boldsymbol{\beta},
+\boldsymbol{\tilde{y}}= \boldsymbol{X}\boldsymbol{\theta},
 $$
 
-<p>and in order to find the optimal parameters \( \beta_i \) we defined a function which
+<p>and in order to find the optimal parameters \( \theta_i \) we defined a function which
 gives a measure of the spread between the values \( y_i \) (which
 represent the output values we want to reproduce) and the parametrized
 values \( \tilde{y}_i \), namely the so-called cost/loss function.
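
For reference, a minimal NumPy sketch of this linear parameterization; the data, the polynomial choice of features and all variable names below are illustrative assumptions, not taken from the commit:

```python
import numpy as np

# illustrative data: n points and p polynomial features 1, x, x^2
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([x**j for j in range(3)])   # design matrix, shape (n, p)

theta = np.array([1.0, -2.0, 0.5])              # an assumed parameter vector
y_tilde = X @ theta                              # the model \tilde{y} = X theta
print(y_tilde.shape)                             # (20,)
```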

doc/pub/week35/html/._week35-bs004.html

Lines changed: 4 additions & 4 deletions
@@ -325,12 +325,12 @@ <h2 id="the-cost-loss-function" class="anchor">The cost/loss function </h2>
 
 <p>We used the mean squared error to define the way we measure the quality of our model</p>
 $$
-C(\boldsymbol{\beta})=\frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)^T\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)\right\},
+C(\boldsymbol{\theta})=\frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)^T\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)\right\},
 $$
 
 <p>or using the matrix \( \boldsymbol{X} \) and in a more compact matrix-vector notation as</p>
 $$
-C(\boldsymbol{\beta})=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)\right\}.
+C(\boldsymbol{\theta})=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)\right\}.
 $$
 
 <p>This function represents one of many possible ways to define the so-called cost function.</p>
@@ -340,10 +340,10 @@ <h2 id="the-cost-loss-function" class="anchor">The cost/loss function </h2>
 </p>
 
 $$
-C(\boldsymbol{\beta})=\frac{1}{2n}\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2,
+C(\boldsymbol{\theta})=\frac{1}{2n}\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2,
 $$
 
-<p>since when taking the first derivative with respect to the unknown parameters \( \beta \), the factor of \( 2 \) cancels out. </p>
+<p>since when taking the first derivative with respect to the unknown parameters \( \theta \), the factor of \( 2 \) cancels out. </p>
 
 <p>
 <!-- navigation buttons at the bottom of the page -->
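
A minimal sketch of this cost function in NumPy, assuming X, y and theta are arrays of compatible shapes (all names and the random data are illustrative):

```python
import numpy as np

def cost_ols(X, y, theta):
    """Mean squared error C(theta) = (1/n) (y - X theta)^T (y - X theta)."""
    residual = y - X @ theta
    return residual @ residual / len(y)

# quick check against the element-wise definition
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = rng.normal(size=10)
theta = rng.normal(size=3)
print(np.isclose(cost_ols(X, y, theta), np.mean((y - X @ theta)**2)))  # True
```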

doc/pub/week35/html/._week35-bs005.html

Lines changed: 8 additions & 8 deletions
@@ -325,14 +325,14 @@ <h2 id="interpretations-and-optimizing-our-parameters" class="anchor">Interpreta
 
 <p>The function </p>
 $$
-C(\boldsymbol{\beta})=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)\right\},
+C(\boldsymbol{\theta})=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)\right\},
 $$
 
 <p>can be linked to the variance of the quantity \( y_i \) if we interpret the latter as the mean value.
 When linking (see the discussions next week) with the maximum likelihood approach below, we will indeed interpret \( y_i \) as a mean value
 </p>
 $$
-y_{i}=\langle y_i \rangle = \beta_0x_{i,0}+\beta_1x_{i,1}+\beta_2x_{i,2}+\dots+\beta_{n-1}x_{i,n-1}+\epsilon_i,
+y_{i}=\langle y_i \rangle = \theta_0x_{i,0}+\theta_1x_{i,1}+\theta_2x_{i,2}+\dots+\theta_{n-1}x_{i,n-1}+\epsilon_i,
 $$
 
 <p>where \( \langle y_i \rangle \) is the mean value. Keep in mind also that
@@ -345,25 +345,25 @@ <h2 id="interpretations-and-optimizing-our-parameters" class="anchor">Interpreta
 will treat \( y_i \) as our exact value for the response variable.
 </p>
 
-<p>In order to find the parameters \( \beta_i \) we will then minimize the spread of \( C(\boldsymbol{\beta}) \), that is we are going to solve the problem</p>
+<p>In order to find the parameters \( \theta_i \) we will then minimize the spread of \( C(\boldsymbol{\theta}) \), that is we are going to solve the problem</p>
 $$
-{\displaystyle \min_{\boldsymbol{\beta}\in
-{\mathbb{R}}^{p}}}\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)\right\}.
+{\displaystyle \min_{\boldsymbol{\theta}\in
+{\mathbb{R}}^{p}}}\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)\right\}.
 $$
 
 <p>In practical terms it means we will require</p>
 $$
-\frac{\partial C(\boldsymbol{\beta})}{\partial \beta_j} = \frac{\partial }{\partial \beta_j}\left[ \frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\beta_0x_{i,0}-\beta_1x_{i,1}-\beta_2x_{i,2}-\dots-\beta_{n-1}x_{i,n-1}\right)^2\right]=0,
+\frac{\partial C(\boldsymbol{\theta})}{\partial \theta_j} = \frac{\partial }{\partial \theta_j}\left[ \frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\theta_0x_{i,0}-\theta_1x_{i,1}-\theta_2x_{i,2}-\dots-\theta_{n-1}x_{i,n-1}\right)^2\right]=0,
 $$
 
 <p>which results in</p>
 $$
-\frac{\partial C(\boldsymbol{\beta})}{\partial \beta_j} = -\frac{2}{n}\left[ \sum_{i=0}^{n-1}x_{ij}\left(y_i-\beta_0x_{i,0}-\beta_1x_{i,1}-\beta_2x_{i,2}-\dots-\beta_{n-1}x_{i,n-1}\right)\right]=0,
+\frac{\partial C(\boldsymbol{\theta})}{\partial \theta_j} = -\frac{2}{n}\left[ \sum_{i=0}^{n-1}x_{ij}\left(y_i-\theta_0x_{i,0}-\theta_1x_{i,1}-\theta_2x_{i,2}-\dots-\theta_{n-1}x_{i,n-1}\right)\right]=0,
 $$
 
 <p>or in a matrix-vector form as (multiplying away the factor \( -2/n \), see derivation below)</p>
 $$
-\frac{\partial C(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}^T} = 0 = \boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right).
+\frac{\partial C(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}^T} = 0 = \boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right).
 $$
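
The first-order condition \( \boldsymbol{X}^T(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}) = 0 \) can be checked numerically; a small sketch with assumed random data (the closed-form solution used here is the one derived on the next slide):

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 50, 4
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# optimal parameters from the normal equations (see the following slide)
theta_opt = np.linalg.solve(X.T @ X, X.T @ y)

# the optimality condition: X^T (y - X theta) vanishes at the minimum
print(np.allclose(X.T @ (y - X @ theta_opt), 0.0))  # True
```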

doc/pub/week35/html/._week35-bs006.html

Lines changed: 3 additions & 3 deletions
@@ -327,17 +327,17 @@ <h2 id="interpretations-and-optimizing-our-parameters" class="anchor">Interpreta
 <!-- subsequent paragraphs come in larger fonts, so start with a paragraph -->
 <p>We can rewrite, see the derivations below, </p>
 $$
-\frac{\partial C(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}^T} = 0 = \boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right),
+\frac{\partial C(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}^T} = 0 = \boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right),
 $$
 
 <p>as</p>
 $$
-\boldsymbol{X}^T\boldsymbol{y} = \boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\beta},
+\boldsymbol{X}^T\boldsymbol{y} = \boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\theta},
 $$
 
 <p>and if the matrix \( \boldsymbol{X}^T\boldsymbol{X} \) is invertible we have the solution</p>
 $$
-\boldsymbol{\beta} =\left(\boldsymbol{X}^T\boldsymbol{X}\right)^{-1}\boldsymbol{X}^T\boldsymbol{y}.
+\boldsymbol{\theta} =\left(\boldsymbol{X}^T\boldsymbol{X}\right)^{-1}\boldsymbol{X}^T\boldsymbol{y}.
 $$
 
 <p>We note also that since our design matrix is defined as \( \boldsymbol{X}\in
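
A minimal sketch of this closed-form solution, assuming \( \boldsymbol{X}^T\boldsymbol{X} \) is invertible; np.linalg.solve is used rather than forming the inverse explicitly, and the data below are illustrative only:

```python
import numpy as np

def ols_fit(X, y):
    """Solve the normal equations X^T X theta = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# illustrative noisy data generated from known parameters
rng = np.random.default_rng(1)
n = 100
x = rng.uniform(-1, 1, n)
X = np.column_stack((np.ones(n), x, x**2))
theta_true = np.array([0.5, -1.0, 2.0])
y = X @ theta_true + 0.1 * rng.normal(size=n)

print(ols_fit(X, y))   # close to [0.5, -1.0, 2.0]
```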

doc/pub/week35/html/._week35-bs013.html

Lines changed: 11 additions & 11 deletions
@@ -325,46 +325,46 @@ <h2 id="the-mean-squared-error-and-its-derivative" class="anchor">The mean squar
 
 <p>We defined earlier a possible cost function using the mean squared error</p>
 $$
-C(\boldsymbol{\beta})=\frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)^T\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)\right\},
+C(\boldsymbol{\theta})=\frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\tilde{y}_i\right)^2=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)^T\left(\boldsymbol{y}-\boldsymbol{\tilde{y}}\right)\right\},
 $$
 
 <p>or using the design/feature matrix \( \boldsymbol{X} \) we have the more compact matrix-vector</p>
 $$
-C(\boldsymbol{\beta})=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)\right\}.
+C(\boldsymbol{\theta})=\frac{1}{n}\left\{\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)\right\}.
 $$
 
-<p>We note that the design matrix \( \boldsymbol{X} \) does not depend on the unknown parameters defined by the vector \( \boldsymbol{\beta} \).
-We are now interested in minimizing the cost function with respect to the unknown parameters \( \boldsymbol{\beta} \).
+<p>We note that the design matrix \( \boldsymbol{X} \) does not depend on the unknown parameters defined by the vector \( \boldsymbol{\theta} \).
+We are now interested in minimizing the cost function with respect to the unknown parameters \( \boldsymbol{\theta} \).
 </p>
 
 <p>The mean squared error is a scalar and if we use the results from example three above, we can define a new vector</p>
 $$
-\boldsymbol{w}=\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta},
+\boldsymbol{w}=\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta},
 $$
 
-<p>which depends on \( \boldsymbol{\beta} \). We rewrite the cost function as</p>
+<p>which depends on \( \boldsymbol{\theta} \). We rewrite the cost function as</p>
 $$
-C(\boldsymbol{\beta})=\frac{1}{n}\boldsymbol{w}^T\boldsymbol{w},
+C(\boldsymbol{\theta})=\frac{1}{n}\boldsymbol{w}^T\boldsymbol{w},
 $$
 
 <p>with partial derivative</p>
 $$
-\frac{\partial C(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}}=\frac{2}{n}\boldsymbol{w}^T\frac{\partial \boldsymbol{w}}{\partial \boldsymbol{\beta}},
+\frac{\partial C(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}}=\frac{2}{n}\boldsymbol{w}^T\frac{\partial \boldsymbol{w}}{\partial \boldsymbol{\theta}},
 $$
 
 <p>and using that</p>
 $$
-\frac{\partial \boldsymbol{w}}{\partial \boldsymbol{\beta}}=-\boldsymbol{X},
+\frac{\partial \boldsymbol{w}}{\partial \boldsymbol{\theta}}=-\boldsymbol{X},
 $$
 
 <p>where we used the result from example two above. Inserting the last expression we obtain</p>
 $$
-\frac{\partial C(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}}=-\frac{2}{n}\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)^T\boldsymbol{X},
+\frac{\partial C(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}}=-\frac{2}{n}\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)^T\boldsymbol{X},
 $$
 
 <p>or as</p>
 $$
-\frac{\partial C(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}^T}=-\frac{2}{n}\boldsymbol{X}^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right).
+\frac{\partial C(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}^T}=-\frac{2}{n}\boldsymbol{X}^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right).
 $$
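
The analytic derivative above can be sanity-checked against central finite differences; a short sketch with assumed random data and illustrative names:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
theta = rng.normal(size=p)

def cost(t):
    w = y - X @ t
    return w @ w / n

# analytic gradient: dC/dtheta = -(2/n) X^T (y - X theta)
grad_analytic = -2.0 / n * X.T @ (y - X @ theta)

# central finite differences, one component at a time
eps = 1e-6
grad_numeric = np.array([
    (cost(theta + eps * np.eye(p)[j]) - cost(theta - eps * np.eye(p)[j])) / (2 * eps)
    for j in range(p)
])
print(np.allclose(grad_analytic, grad_numeric, atol=1e-5))  # True
```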

doc/pub/week35/html/._week35-bs015.html

Lines changed: 3 additions & 3 deletions
@@ -325,13 +325,13 @@ <h2 id="meet-the-hessian-matrix" class="anchor">Meet the Hessian Matrix </h2>
 
 <p>A very important matrix we will meet again and again in machine
 learning is the Hessian. It is given by the second derivative of the
-cost function with respect to the parameters \( \boldsymbol{\beta} \). Using the above
+cost function with respect to the parameters \( \boldsymbol{\theta} \). Using the above
 expression for derivatives of vectors and matrices, we find that the
 second derivative of the mean squared error as cost function is,
 </p>
 
 $$
-\frac{\partial}{\partial \boldsymbol{\beta}}\frac{\partial C(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}^T} =\frac{\partial}{\partial \boldsymbol{\beta}}\left[-\frac{2}{n}\boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)\right]=\frac{2}{n}\boldsymbol{X}^T\boldsymbol{X}.
+\frac{\partial}{\partial \boldsymbol{\theta}}\frac{\partial C(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}^T} =\frac{\partial}{\partial \boldsymbol{\theta}}\left[-\frac{2}{n}\boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)\right]=\frac{2}{n}\boldsymbol{X}^T\boldsymbol{X}.
 $$
 
 <p>The Hessian matrix plays an important role and is defined here as</p>
@@ -342,7 +342,7 @@ <h2 id="meet-the-hessian-matrix" class="anchor">Meet the Hessian Matrix </h2>
 
 <p>For ordinary least squares, it is inversely proportional (derivation
 next week) with the variance of the optimal parameters
-\( \hat{\boldsymbol{\beta}} \). Furthermore, we will see later this week that it is
+\( \hat{\boldsymbol{\theta}} \). Furthermore, we will see later this week that it is
 (aside the factor \( 1/n \)) equal to the covariance matrix. It plays also a very
 important role in optmization algorithms and Principal Component
 Analysis as a way to reduce the dimensionality of a machine learning/data analysis
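
A small sketch (with assumed random data) computing the Hessian \( \frac{2}{n}\boldsymbol{X}^T\boldsymbol{X} \) and confirming that it is symmetric with non-negative eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 40, 5
X = rng.normal(size=(n, p))

H = 2.0 / n * X.T @ X              # Hessian of the MSE cost; independent of theta
eigvals = np.linalg.eigvalsh(H)

print(np.allclose(H, H.T))          # symmetric
print(np.all(eigvals >= -1e-12))    # positive semi-definite
```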

doc/pub/week35/html/._week35-bs016.html

Lines changed: 4 additions & 4 deletions
@@ -328,20 +328,20 @@ <h2 id="interpretations-and-optimizing-our-parameters" class="anchor">Interpreta
 <!-- subsequent paragraphs come in larger fonts, so start with a paragraph -->
 <p>The residuals \( \boldsymbol{\epsilon} \) are in turn given by</p>
 $$
-\boldsymbol{\epsilon} = \boldsymbol{y}-\boldsymbol{\tilde{y}} = \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta},
+\boldsymbol{\epsilon} = \boldsymbol{y}-\boldsymbol{\tilde{y}} = \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta},
 $$
 
 <p>and with </p>
 $$
-\boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)= 0,
+\boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)= 0,
 $$
 
 <p>we have</p>
 $$
-\boldsymbol{X}^T\boldsymbol{\epsilon}=\boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right)= 0,
+\boldsymbol{X}^T\boldsymbol{\epsilon}=\boldsymbol{X}^T\left( \boldsymbol{y}-\boldsymbol{X}\boldsymbol{\theta}\right)= 0,
 $$
 
-<p>meaning that the solution for \( \boldsymbol{\beta} \) is the one which minimizes the residuals. </p>
+<p>meaning that the solution for \( \boldsymbol{\theta} \) is the one which minimizes the residuals. </p>
 </div>
 </div>
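
A quick numerical check, on assumed random data, that the OLS residuals are orthogonal to the columns of the design matrix, \( \boldsymbol{X}^T\boldsymbol{\epsilon}=0 \):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 60, 4
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

theta = np.linalg.solve(X.T @ X, X.T @ y)   # OLS solution
epsilon = y - X @ theta                     # residuals

print(np.allclose(X.T @ epsilon, 0.0))      # True: X^T epsilon vanishes
```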

doc/pub/week35/html/._week35-bs017.html

Lines changed: 2 additions & 2 deletions
@@ -328,10 +328,10 @@ <h2 id="example-relevant-for-the-exercises" class="anchor">Example relevant for
 We assume our data can represented by a fourth-order polynomial. For the $i$th component we have
 </p>
 $$
-\tilde{y}_i = \beta_0+\beta_1x_i+\beta_2x_i^2+\beta_3x_i^3+\beta_4x_i^4.
+\tilde{y}_i = \theta_0+\theta_1x_i+\theta_2x_i^2+\theta_3x_i^3+\theta_4x_i^4.
 $$
 
-<p>we have five predictors/features. The first is the intercept \( \beta_0 \). The other terms are \( \beta_i \) with \( i=1,2,3,4 \). Furthermore we have \( n \) entries for each predictor. It means that our design matrix is an
+<p>we have five predictors/features. The first is the intercept \( \theta_0 \). The other terms are \( \theta_i \) with \( i=1,2,3,4 \). Furthermore we have \( n \) entries for each predictor. It means that our design matrix is an
 \( n\times p \) matrix \( \boldsymbol{X} \).
 </p>
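
The corresponding \( n\times 5 \) design matrix can be built column by column; a sketch where x is an assumed array of n data points:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 11)                 # n = 11 illustrative points
X = np.column_stack([x**j for j in range(5)])  # columns: 1, x, x^2, x^3, x^4
print(X.shape)                                  # (11, 5), i.e. n x p with p = 5
```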

doc/pub/week35/html/._week35-bs018.html

Lines changed: 1 addition & 1 deletion
@@ -323,7 +323,7 @@
 <!-- !split -->
 <h2 id="own-code-for-ordinary-least-squares" class="anchor">Own code for Ordinary Least Squares </h2>
 
-<p>It is rather straightforward to implement the matrix inversion and obtain the parameters \( \boldsymbol{\beta} \). After having defined the matrix \( \boldsymbol{X} \) and the outputs \( \boldsymbol{y} \) we have </p>
+<p>It is rather straightforward to implement the matrix inversion and obtain the parameters \( \boldsymbol{\theta} \). After having defined the matrix \( \boldsymbol{X} \) and the outputs \( \boldsymbol{y} \) we have </p>
 
 <!-- code=python (!bc pycod) typeset with pygments style "default" -->
 <div class="cell border-box-sizing code_cell rendered">
