
Commit b5bd221

update
1 parent 5d8ff11 commit b5bd221

File tree

10 files changed: +1864 -3671 lines changed

-682 Bytes
Binary file not shown.
-1.78 KB
Binary file not shown.

doc/LectureNotes/_build/html/_sources/week36.ipynb

Lines changed: 365 additions & 403 deletions
Large diffs are not rendered by default.

doc/LectureNotes/_build/html/searchindex.js

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

doc/LectureNotes/_build/html/week36.html

Lines changed: 3 additions & 30 deletions
@@ -419,7 +419,6 @@ <h2> Contents </h2>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#the-hessian-matrix">The Hessian matrix</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#simple-program">Simple program</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#id1">Gradient Descent Example</a></li>
-<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#and-a-corresponding-example-using-scikit-learn">And a corresponding example using <strong>scikit-learn</strong></a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#gradient-descent-and-ridge">Gradient descent and Ridge</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#the-hessian-matrix-for-ridge-regression">The Hessian matrix for Ridge Regression</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#program-example-for-gradient-descent-with-ridge-regression">Program example for gradient descent with Ridge Regression</a></li>
@@ -631,7 +630,7 @@ <h2>Fixing the singularity<a class="headerlink" href="#fixing-the-singularity" t
 \]</div>
 <p>has linearly dependent column vectors, we will not be able to compute the inverse
 of <span class="math notranslate nohighlight">\(\boldsymbol{X}^T\boldsymbol{X}\)</span> and we cannot find the parameters (estimators) <span class="math notranslate nohighlight">\(\theta_i\)</span>.
-The estimators are only well-defined if <span class="math notranslate nohighlight">\((\boldsymbol{X}^{T}\boldsymbol{X})^{-1}\)</span> exits.
+The estimators are only well-defined if <span class="math notranslate nohighlight">\((\boldsymbol{X}^{T}\boldsymbol{X})^{-1}\)</span> exists.
 This is more likely to happen when the matrix <span class="math notranslate nohighlight">\(\boldsymbol{X}\)</span> is high-dimensional. In this case it is likely to encounter a situation where
 the regression parameters <span class="math notranslate nohighlight">\(\theta_i\)</span> cannot be estimated.</p>
 <p>A cheap <em>ad hoc</em> approach is simply to add a small diagonal component to the matrix to invert, that is we change</p>
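
For context on the hunk above, a minimal runnable sketch of the ad hoc fix it discusses: adding a small diagonal term lam*I to X^T X before inverting. The data, the deliberately duplicated column, and the value of lam are illustrative assumptions, not code from the lecture notes or this commit.

import numpy as np

np.random.seed(0)
n = 100
x = 2*np.random.rand(n, 1)
y = 4 + 3*x + np.random.randn(n, 1)

# Duplicate the x column on purpose so that X^T X is singular (assumed setup)
X = np.c_[np.ones((n, 1)), x, x]

lam = 1e-4  # small ad hoc diagonal term (assumed value, for illustration only)
XtX = X.T @ X
theta = np.linalg.inv(XtX + lam*np.eye(XtX.shape[0])) @ (X.T @ y)
print(theta)
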
@@ -1239,12 +1238,12 @@ <h2>Deriving the Lasso Regression Equations<a class="headerlink" href="#derivin
 <p>and reordering we have</p>
 <div class="math notranslate nohighlight">
 \[
-\boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\theta}+\frac{n}{2}\lambda sgn(\boldsymbol{\theta})=2\boldsymbol{X}^T\boldsymbol{y}.
+\boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\theta}+\frac{n}{2}\lambda sgn(\boldsymbol{\theta})=\boldsymbol{X}^T\boldsymbol{y}.
 \]</div>
 <p>We can redefine <span class="math notranslate nohighlight">\(\lambda\)</span> to absorb the constant <span class="math notranslate nohighlight">\(n/2\)</span> and we rewrite the last equation as</p>
 <div class="math notranslate nohighlight">
 \[
-\boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\theta}+\lambda sgn(\boldsymbol{\theta})=2\boldsymbol{X}^T\boldsymbol{y}.
+\boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\theta}+\lambda sgn(\boldsymbol{\theta})=\boldsymbol{X}^T\boldsymbol{y}.
 \]</div>
 <p>This equation does not lead to a nice analytical equation as in either Ridge regression or ordinary least squares. This equation can however be solved by using standard convex optimization algorithms.We will discuss how to code the above methods using gradient descent methods.</p>
 </section>
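
The hunk above corrects the Lasso stationarity condition to X^T X theta + lambda*sgn(theta) = X^T y, which the notes say can be attacked with gradient-type methods. A rough subgradient-descent sketch of that idea follows; the data, learning rate, iteration count and lambda are assumptions for illustration, not part of the commit.

import numpy as np

np.random.seed(0)
n = 100
x = 2*np.random.rand(n, 1)
y = 4 + 3*x + np.random.randn(n, 1)
X = np.c_[np.ones((n, 1)), x]

lam = 0.1    # assumed regularization strength
eta = 0.01   # assumed learning rate
theta = np.zeros((2, 1))

for _ in range(5000):
    # subgradient of (1/n)*||y - X theta||^2 + lam*||theta||_1
    grad = (2.0/n) * X.T @ (X @ theta - y) + lam*np.sign(theta)
    theta -= eta*grad

print(theta)
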
@@ -1643,31 +1642,6 @@ <h2>Gradient Descent Example<a class="headerlink" href="#id1" title="Link to thi
 </div>
 </div>
 </section>
-<section id="and-a-corresponding-example-using-scikit-learn">
-<h2>And a corresponding example using <strong>scikit-learn</strong><a class="headerlink" href="#and-a-corresponding-example-using-scikit-learn" title="Link to this heading">#</a></h2>
-<div class="cell docutils container">
-<div class="cell_input docutils container">
-<div class="highlight-none notranslate"><div class="highlight"><pre><span></span># Importing various packages
-from random import random, seed
-import numpy as np
-import matplotlib.pyplot as plt
-from sklearn.linear_model import SGDRegressor
-
-n = 100
-x = 2*np.random.rand(n,1)
-y = 4+3*x+np.random.randn(n,1)
-
-X = np.c_[np.ones((n,1)), x]
-theta_linreg = np.linalg.inv(X.T @ X) @ (X.T @ y)
-print(theta_linreg)
-sgdreg = SGDRegressor(max_iter = 50, penalty=None, eta0=0.1)
-sgdreg.fit(x,y.ravel())
-print(sgdreg.intercept_, sgdreg.coef_)
-</pre></div>
-</div>
-</div>
-</div>
-</section>
 <section id="gradient-descent-and-ridge">
 <h2>Gradient descent and Ridge<a class="headerlink" href="#gradient-descent-and-ridge" title="Link to this heading">#</a></h2>
 <p>We have also discussed Ridge regression where the loss function contains a regularized term given by the <span class="math notranslate nohighlight">\(L_2\)</span> norm of <span class="math notranslate nohighlight">\(\theta\)</span>,</p>
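
The trailing context of the hunk above leads into gradient descent with the L2 (Ridge) penalty. A rough sketch of that approach is given below; the learning rate, lam and data are chosen purely for illustration and are not taken from the notes, and the closed-form comparison at the end assumes the cost (1/n)*||y - X theta||^2 + lam*||theta||^2.

import numpy as np

np.random.seed(0)
n = 100
x = 2*np.random.rand(n, 1)
y = 4 + 3*x + np.random.randn(n, 1)
X = np.c_[np.ones((n, 1)), x]

lam = 0.01   # assumed Ridge parameter
eta = 0.05   # assumed learning rate
theta = np.random.randn(2, 1)

for _ in range(1000):
    # gradient of (1/n)*||y - X theta||^2 + lam*||theta||^2
    grad = (2.0/n) * X.T @ (X @ theta - y) + 2*lam*theta
    theta -= eta*grad

# compare with the closed-form solution (X^T X + n*lam*I)^{-1} X^T y
theta_closed = np.linalg.inv(X.T @ X + n*lam*np.eye(2)) @ (X.T @ y)
print(theta)
print(theta_closed)
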
@@ -2488,7 +2462,6 @@ <h2>Another Example, now with a polynomial fit<a class="headerlink" href="#anoth
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#the-hessian-matrix">The Hessian matrix</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#simple-program">Simple program</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#id1">Gradient Descent Example</a></li>
-<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#and-a-corresponding-example-using-scikit-learn">And a corresponding example using <strong>scikit-learn</strong></a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#gradient-descent-and-ridge">Gradient descent and Ridge</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#the-hessian-matrix-for-ridge-regression">The Hessian matrix for Ridge Regression</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#program-example-for-gradient-descent-with-ridge-regression">Program example for gradient descent with Ridge Regression</a></li>

0 commit comments
