
Commit 3f92ffb

update project 2

1 parent 317476c

File tree

9 files changed (+775, -359 lines)


doc/LectureNotes/week42.ipynb

Lines changed: 327 additions & 325 deletions
Large diffs are not rendered by default.

doc/Projects/2025/Project2/html/._Project2-bs000.html

Lines changed: 54 additions & 1 deletion
@@ -80,6 +80,16 @@
 3,
 None,
 'part-g-critical-evaluation-of-the-various-algorithms'),
+('Summary of methods to implement and analyze',
+ 2,
+ None,
+ 'summary-of-methods-to-implement-and-analyze'),
+('Required Analysis:', 3, None, 'required-analysis'),
+('Optional (Note that you should include at least two of these '
+ 'in the report):',
+ 3,
+ None,
+ 'optional-note-that-you-should-include-at-least-two-of-these-in-the-report'),
 ('Background literature', 2, None, 'background-literature'),
 ('Introduction to numerical projects',
 2,
@@ -134,6 +144,9 @@
 <!-- navigation toc: --> <li><a href="#part-e-testing-different-norms" style="font-size: 80%;">&nbsp;&nbsp;&nbsp;Part e): Testing different norms</a></li>
 <!-- navigation toc: --> <li><a href="#part-f-classification-analysis-using-neural-networks" style="font-size: 80%;">&nbsp;&nbsp;&nbsp;Part f): Classification analysis using neural networks</a></li>
 <!-- navigation toc: --> <li><a href="#part-g-critical-evaluation-of-the-various-algorithms" style="font-size: 80%;">&nbsp;&nbsp;&nbsp;Part g) Critical evaluation of the various algorithms</a></li>
+<!-- navigation toc: --> <li><a href="#summary-of-methods-to-implement-and-analyze" style="font-size: 80%;"><b>Summary of methods to implement and analyze</b></a></li>
+<!-- navigation toc: --> <li><a href="#required-analysis" style="font-size: 80%;">&nbsp;&nbsp;&nbsp;Required Analysis:</a></li>
+<!-- navigation toc: --> <li><a href="#optional-note-that-you-should-include-at-least-two-of-these-in-the-report" style="font-size: 80%;">&nbsp;&nbsp;&nbsp;Optional (Note that you should include at least two of these in the report):</a></li>
 <!-- navigation toc: --> <li><a href="#background-literature" style="font-size: 80%;"><b>Background literature</b></a></li>
 <!-- navigation toc: --> <li><a href="#introduction-to-numerical-projects" style="font-size: 80%;"><b>Introduction to numerical projects</b></a></li>
 <!-- navigation toc: --> <li><a href="#format-for-electronic-delivery-of-report-and-programs" style="font-size: 80%;"><b>Format for electronic delivery of report and programs</b></a></li>
@@ -245,7 +258,7 @@ <h2 id="classification-and-regression-writing-our-own-neural-network-code" class
 <ul>
 <li> Regression (fitting a continuous function). In this part you will need to bring back your results from project 1 and compare these with what you get from your Neural Network code to be developed here. The data sets could be</li>
 <ul>
-<li> The simple one-dimensional function Runge function from project 1, that is \( f(x) = \frac{1}{1+25x^2} \). We recommend using a simpler function when developing your neural network code for regression problems. Feel however free to discuss and study other functions, such as the the two-dimensional Runge function \( f(x,y)=\left[(10x - 5)^2 + (10y - 5)^2 + 1 \right]^{-1} \), or even more complicated two-dimensional functions (see the supplementary material of <a href="https://www.nature.com/articles/s41467-025-61362-4" target="_self"><tt>https://www.nature.com/articles/s41467-025-61362-4</tt></a> for an extensive list of two-dimensional functions).</li>
+<li> The simple one-dimensional function Runge function from project 1, that is \( f(x) = \frac{1}{1+25x^2} \). We recommend using a simpler function when developing your neural network code for regression problems. Feel however free to discuss and study other functions, such as the two-dimensional Runge function \( f(x,y)=\left[(10x - 5)^2 + (10y - 5)^2 + 1 \right]^{-1} \), or even more complicated two-dimensional functions (see the supplementary material of <a href="https://www.nature.com/articles/s41467-025-61362-4" target="_self"><tt>https://www.nature.com/articles/s41467-025-61362-4</tt></a> for an extensive list of two-dimensional functions).</li>
 </ul>
 <li> Classification.</li>
 <ul>
@@ -526,6 +539,46 @@ <h3 id="part-g-critical-evaluation-of-the-various-algorithms" class="anchor">Par
 is best for the classification case. These codes can also be part of
 your final project 3, but now applied to other data sets.
 </p>
+<h2 id="summary-of-methods-to-implement-and-analyze" class="anchor">Summary of methods to implement and analyze </h2>
+
+<b>Required Implementation:</b>
+<ol>
+<li> Reuse the regression code and results from project 1; these will act as a benchmark for seeing how well suited a neural network is for this regression task.</li>
+<li> Implement a neural network with</li>
+<ul>
+<li> A flexible number of layers</li>
+<li> A flexible number of nodes in each layer</li>
+<li> A changeable activation function in each layer (Sigmoid, ReLU, LeakyReLU, as well as Linear and Softmax)</li>
+<li> A changeable cost function, which will be set to MSE for regression and cross-entropy for multi-class classification</li>
+<li> An optional L1 or L2 norm of the weights and biases in the cost function (used only for computing gradients, not as an interpretable metric)</li>
+</ul>
+<li> Implement the back-propagation algorithm to compute the gradient of your neural network</li>
+<li> Reuse the implementation of plain and stochastic gradient descent from project 1 (and adapt the code to work with your neural network)</li>
+<ul>
+<li> With no optimization algorithm</li>
+<li> With RMSProp</li>
+<li> With ADAM</li>
+</ul>
+<li> Implement scaling and train-test splitting of your data, preferably using sklearn</li>
+<li> Implement and compute metrics like the MSE and accuracy</li>
+</ol>
+<h3 id="required-analysis" class="anchor">Required Analysis: </h3>
+<ol>
+<li> Briefly show and argue for the advantages and disadvantages of the methods from project 1.</li>
+<li> Explore and show the impact of changing the number of layers, nodes per layer, choice of activation function, and inclusion of L1 and L2 norms. Present only the most interesting results from this exploration. 2D heatmaps work well for this: start by finding a well-performing set of hyper-parameters, then change two at a time over a range that shows both good and bad performance.</li>
+<li> Show and argue for the advantages and disadvantages of using a neural network for regression on your data</li>
+<li> Show and argue for the advantages and disadvantages of using a neural network for classification on your data</li>
+<li> Show and argue for the advantages and disadvantages of the different gradient methods and learning rates when training the neural network</li>
+</ol>
+<h3 id="optional-note-that-you-should-include-at-least-two-of-these-in-the-report" class="anchor">Optional (Note that you should include at least two of these in the report): </h3>
+<ol>
+<li> Implement Logistic Regression as a simple classification model (equivalent to a Neural Network with one layer)</li>
+<li> Compute the gradient of the neural network with autograd, to show that it gives the same result as your hand-written backpropagation.</li>
+<li> Compare your results with results from using a machine-learning library like PyTorch (see <a href="https://docs.pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html" target="_self"><tt>https://docs.pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html</tt></a>)</li>
+<li> Use a more complex classification dataset instead, like Fashion-MNIST (see <a href="https://www.kaggle.com/datasets/zalando-research/fashionmnist" target="_self"><tt>https://www.kaggle.com/datasets/zalando-research/fashionmnist</tt></a>)</li>
+<li> Use a more complex regression dataset instead, like the two-dimensional Runge function \( f(x,y)=\left[(10x - 5)^2 + (10y - 5)^2 + 1 \right]^{-1} \), or even more complicated two-dimensional functions (see the supplementary material of <a href="https://www.nature.com/articles/s41467-025-61362-4" target="_self"><tt>https://www.nature.com/articles/s41467-025-61362-4</tt></a> for an extensive list of two-dimensional functions).</li>
+<li> Compute and interpret a confusion matrix of your best classification model (see <a href="https://www.researchgate.net/figure/Confusion-matrix-of-MNIST-and-F-MNIST-embeddings_fig5_349758607" target="_self"><tt>https://www.researchgate.net/figure/Confusion-matrix-of-MNIST-and-F-MNIST-embeddings_fig5_349758607</tt></a>)</li>
+</ol>
 <h2 id="background-literature" class="anchor">Background literature </h2>
 
 <ol>
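
To make the "Required Implementation" list in the diff above concrete, here is a minimal sketch of items 2 and 3: a feed-forward network with a flexible list of layer sizes and one activation per layer, with hand-written back-propagation, trained by plain gradient descent on the MSE. It is an illustration under assumptions, not the project's reference solution; the class name, seed, layer sizes, and learning rate are made up for the example, and only sigmoid and linear activations are wired in.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def identity(z):
    return z

def identity_prime(z):
    return np.ones_like(z)

class MLP:
    """Feed-forward net; sizes like [1, 50, 1], one (f, f') pair per layer."""
    def __init__(self, sizes, activations):
        rng = np.random.default_rng(2025)
        self.W = [rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n))
                  for m, n in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(n) for n in sizes[1:]]
        self.acts = activations

    def forward(self, X):
        # Keep pre-activations z and activations a for back-propagation
        a, zs, alist = X, [], [X]
        for W, b, (f, _) in zip(self.W, self.b, self.acts):
            z = a @ W + b
            a = f(z)
            zs.append(z)
            alist.append(a)
        return a, zs, alist

    def backprop(self, X, y):
        # Gradients of the MSE cost (1/2n) sum |a_L - y|^2 w.r.t. all W and b
        yhat, zs, alist = self.forward(X)
        delta = (yhat - y) / X.shape[0] * self.acts[-1][1](zs[-1])
        gW, gb = [None] * len(self.W), [None] * len(self.b)
        for l in reversed(range(len(self.W))):
            gW[l] = alist[l].T @ delta
            gb[l] = delta.sum(axis=0)
            if l > 0:
                delta = (delta @ self.W[l].T) * self.acts[l - 1][1](zs[l - 1])
        return gW, gb

# Usage: fit the one-dimensional Runge function from project 1
X = np.linspace(-1, 1, 200).reshape(-1, 1)
y = 1.0 / (1.0 + 25 * X**2)
net = MLP([1, 50, 1], [(sigmoid, sigmoid_prime), (identity, identity_prime)])
eta = 0.1
for _ in range(2000):
    gW, gb = net.backprop(X, y)
    net.W = [W - eta * g for W, g in zip(net.W, gW)]
    net.b = [b - eta * g for b, g in zip(net.b, gb)]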
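The gradient-descent variants in item 4 differ only in how a gradient is turned into a parameter update. A minimal sketch of the three update rules, written for a single parameter array theta; the class names and the hyper-parameter defaults (the usual textbook values) are illustrative, not prescriptions from the project text.

import numpy as np

def sgd_step(theta, grad, eta=0.01):
    # Plain step: no adaptive scaling of the gradient
    return theta - eta * grad

class RMSProp:
    def __init__(self, eta=0.001, rho=0.9, eps=1e-8):
        self.eta, self.rho, self.eps = eta, rho, eps
        self.s = None                       # running mean of squared gradients
    def step(self, theta, grad):
        if self.s is None:
            self.s = np.zeros_like(theta)
        self.s = self.rho * self.s + (1 - self.rho) * grad**2
        return theta - self.eta * grad / (np.sqrt(self.s) + self.eps)

class Adam:
    def __init__(self, eta=0.001, b1=0.9, b2=0.999, eps=1e-8):
        self.eta, self.b1, self.b2, self.eps = eta, b1, b2, eps
        self.m = self.v = None              # first and second moment estimates
        self.t = 0
    def step(self, theta, grad):
        if self.m is None:
            self.m = np.zeros_like(theta)
            self.v = np.zeros_like(theta)
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad**2
        mhat = self.m / (1 - self.b1**self.t)   # bias-corrected moments
        vhat = self.v / (1 - self.b2**self.t)
        return theta - self.eta * mhat / (np.sqrt(vhat) + self.eps)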
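For items 5 and 6, a short sketch of the sklearn-based scaling, splitting, and scoring the list asks for, on the one-dimensional Runge function. sklearn's MLPRegressor stands in here for your own network so the snippet runs on its own; with your own code you would call your fit and predict methods instead.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

X = np.linspace(-1, 1, 400).reshape(-1, 1)
y = 1.0 / (1.0 + 25 * X.ravel()**2)            # 1D Runge function

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2025)

scaler = StandardScaler().fit(X_train)          # fit the scaler on training data only
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

net = MLPRegressor(hidden_layer_sizes=(50,), max_iter=5000,
                   random_state=2025).fit(X_train_s, y_train)
print("test MSE:", mean_squared_error(y_test, net.predict(X_test_s)))
# For classification the analogous score is sklearn.metrics.accuracy_score.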
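Optional item 2 asks for a gradient check against autograd. A sketch under assumptions: the one-layer tanh "network" and its hand-written gradient below are deliberately tiny stand-ins, just to show the comparison pattern (write the cost with autograd.numpy, differentiate with autograd.grad, compare against your own back-propagation).

import numpy as np
import autograd.numpy as anp      # numpy wrapper autograd can differentiate through
from autograd import grad

def cost(W, X, y):
    yhat = anp.tanh(anp.dot(X, W))
    return 0.5 * anp.mean((yhat - y) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=(20, 1))
W = rng.normal(size=(3, 1))

# Hand-written gradient of the same cost: delta = dC/dyhat * tanh'(z)
yhat = np.tanh(X @ W)
delta = (yhat - y) / y.size * (1.0 - yhat**2)
gW_manual = X.T @ delta

gW_auto = grad(cost)(W, X, y)     # differentiates w.r.t. the first argument
print("max abs difference:", np.max(np.abs(gW_manual - gW_auto)))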
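For optional item 6 the confusion matrix itself is one sklearn call. A sketch using logistic regression on sklearn's small digits set as a stand-in for whichever classifier and data set (MNIST, Fashion-MNIST) the report actually uses; this also doubles as optional item 1, logistic regression as the simplest classification model.

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2025)

clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))   # rows: true class, columns: predicted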

doc/Projects/2025/Project2/html/Project2-bs.html

Lines changed: 54 additions & 1 deletion
(The diff is identical to the one shown above for doc/Projects/2025/Project2/html/._Project2-bs000.html.)
