|
2 | 2 | "cells": [ |
3 | 3 | { |
4 | 4 | "cell_type": "markdown", |
5 | | - "id": "061af572", |
| 5 | + "id": "96e577ca", |
6 | 6 | "metadata": { |
7 | 7 | "editable": true |
8 | 8 | }, |
|
14 | 14 | }, |
15 | 15 | { |
16 | 16 | "cell_type": "markdown", |
17 | | - "id": "22578683", |
| 17 | + "id": "067c02b9", |
18 | 18 | "metadata": { |
19 | 19 | "editable": true |
20 | 20 | }, |
|
27 | 27 | }, |
28 | 28 | { |
29 | 29 | "cell_type": "markdown", |
30 | | - "id": "61fb162f", |
| 30 | + "id": "01f9fedd", |
31 | 31 | "metadata": { |
32 | 32 | "editable": true |
33 | 33 | }, |
|
58 | 58 | }, |
59 | 59 | { |
60 | 60 | "cell_type": "markdown", |
61 | | - "id": "104c69e1", |
| 61 | + "id": "9f8e4871", |
62 | 62 | "metadata": { |
63 | 63 | "editable": true |
64 | 64 | }, |
|
104 | 104 | }, |
105 | 105 | { |
106 | 106 | "cell_type": "markdown", |
107 | | - "id": "0d2c42e3", |
| 107 | + "id": "460cc6ea", |
108 | 108 | "metadata": { |
109 | 109 | "editable": true |
110 | 110 | }, |
|
121 | 121 | "\n", |
122 | 122 | "* Regression (fitting a continuous function). In this part you will need to bring back your results from project 1 and compare these with what you get from your Neural Network code to be developed here. The data sets could be\n", |
123 | 123 | "\n", |
124 | | - " * The simple one-dimensional function Runge function from project 1, that is $f(x) = \\frac{1}{1+25x^2}$. We recommend using a simpler function when developing your neural network code for regression problems. Feel however free to discuss and study other functions, such as the the two-dimensional Runge function $f(x,y)=\\left[(10x - 5)^2 + (10y - 5)^2 + 1 \\right]^{-1}$, or even more complicated two-dimensional functions (see the supplementary material of <https://www.nature.com/articles/s41467-025-61362-4> for an extensive list of two-dimensional functions). \n", |
| 124 | + "    * The simple one-dimensional Runge function from project 1, that is $f(x) = \\frac{1}{1+25x^2}$. We recommend using a simpler function when developing your neural network code for regression problems. Feel free, however, to discuss and study other functions, such as the two-dimensional Runge function $f(x,y)=\\left[(10x - 5)^2 + (10y - 5)^2 + 1 \\right]^{-1}$, or even more complicated two-dimensional functions (see the supplementary material of <https://www.nature.com/articles/s41467-025-61362-4> for an extensive list of two-dimensional functions).   \n", |
125 | 125 | "\n", |
126 | 126 | "* Classification.\n", |
127 | 127 | "\n", |
|
132 | 132 | }, |
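The one-dimensional Runge target described above can be generated in a few lines. A minimal sketch (numpy only; the sample size, sampling interval, and 80/20 split are illustrative assumptions, not project requirements):

```python
import numpy as np

def runge(x):
    """One-dimensional Runge function f(x) = 1 / (1 + 25 x^2)."""
    return 1.0 / (1.0 + 25.0 * x**2)

rng = np.random.default_rng(2025)
x = rng.uniform(-1.0, 1.0, size=200)   # sample points in [-1, 1]
y = runge(x)

# 80/20 train/test split via shuffled indices
# (sklearn.model_selection.train_test_split does the same job)
idx = rng.permutation(len(x))
x_train, y_train = x[idx[:160]], y[idx[:160]]
x_test, y_test = x[idx[160:]], y[idx[160:]]
```

The same arrays can then be fed both to the project 1 regression code and to the neural network, so the comparison uses identical data.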
133 | 133 | { |
134 | 134 | "cell_type": "markdown", |
135 | | - "id": "d8baab67", |
| 135 | + "id": "d62a07ef", |
136 | 136 | "metadata": { |
137 | 137 | "editable": true |
138 | 138 | }, |
|
162 | 162 | }, |
163 | 163 | { |
164 | 164 | "cell_type": "markdown", |
165 | | - "id": "87e7ed71", |
| 165 | + "id": "9cd8b8ac", |
166 | 166 | "metadata": { |
167 | 167 | "editable": true |
168 | 168 | }, |
|
189 | 189 | }, |
190 | 190 | { |
191 | 191 | "cell_type": "markdown", |
192 | | - "id": "5a26b6ad", |
| 192 | + "id": "5931b155", |
193 | 193 | "metadata": { |
194 | 194 | "editable": true |
195 | 195 | }, |
|
205 | 205 | }, |
206 | 206 | { |
207 | 207 | "cell_type": "markdown", |
208 | | - "id": "096fe6c4", |
| 208 | + "id": "b273fc8a", |
209 | 209 | "metadata": { |
210 | 210 | "editable": true |
211 | 211 | }, |
|
217 | 217 | }, |
218 | 218 | { |
219 | 219 | "cell_type": "markdown", |
220 | | - "id": "fd986596", |
| 220 | + "id": "e13db1ec", |
221 | 221 | "metadata": { |
222 | 222 | "editable": true |
223 | 223 | }, |
|
252 | 252 | }, |
253 | 253 | { |
254 | 254 | "cell_type": "markdown", |
255 | | - "id": "e853d4b6", |
| 255 | + "id": "4f864e31", |
256 | 256 | "metadata": { |
257 | 257 | "editable": true |
258 | 258 | }, |
|
270 | 270 | }, |
271 | 271 | { |
272 | 272 | "cell_type": "markdown", |
273 | | - "id": "fc2d413b", |
| 273 | + "id": "c9faeafd", |
274 | 274 | "metadata": { |
275 | 275 | "editable": true |
276 | 276 | }, |
|
285 | 285 | }, |
286 | 286 | { |
287 | 287 | "cell_type": "markdown", |
288 | | - "id": "e6821051", |
| 288 | + "id": "d865c22b", |
289 | 289 | "metadata": { |
290 | 290 | "editable": true |
291 | 291 | }, |
|
302 | 302 | }, |
303 | 303 | { |
304 | 304 | "cell_type": "markdown", |
305 | | - "id": "cba72d68", |
| 305 | + "id": "5270af8f", |
306 | 306 | "metadata": { |
307 | 307 | "editable": true |
308 | 308 | }, |
|
328 | 328 | { |
329 | 329 | "cell_type": "code", |
330 | 330 | "execution_count": 1, |
331 | | - "id": "e16fb528", |
| 331 | + "id": "4e0e1fea", |
332 | 332 | "metadata": { |
333 | 333 | "collapsed": false, |
334 | 334 | "editable": true |
|
347 | 347 | }, |
348 | 348 | { |
349 | 349 | "cell_type": "markdown", |
350 | | - "id": "73599f42", |
| 350 | + "id": "8fe85677", |
351 | 351 | "metadata": { |
352 | 352 | "editable": true |
353 | 353 | }, |
|
358 | 358 | { |
359 | 359 | "cell_type": "code", |
360 | 360 | "execution_count": 2, |
361 | | - "id": "f1a639ef", |
| 361 | + "id": "b28318b2", |
362 | 362 | "metadata": { |
363 | 363 | "collapsed": false, |
364 | 364 | "editable": true |
|
370 | 370 | }, |
371 | 371 | { |
372 | 372 | "cell_type": "markdown", |
373 | | - "id": "90fb7b41", |
| 373 | + "id": "97e02c71", |
374 | 374 | "metadata": { |
375 | 375 | "editable": true |
376 | 376 | }, |
|
381 | 381 | { |
382 | 382 | "cell_type": "code", |
383 | 383 | "execution_count": 3, |
384 | | - "id": "424af629", |
| 384 | + "id": "88af355c", |
385 | 385 | "metadata": { |
386 | 386 | "collapsed": false, |
387 | 387 | "editable": true |
|
394 | 394 | }, |
395 | 395 | { |
396 | 396 | "cell_type": "markdown", |
397 | | - "id": "3c006080", |
| 397 | + "id": "d1f8f0ed", |
398 | 398 | "metadata": { |
399 | 399 | "editable": true |
400 | 400 | }, |
|
407 | 407 | }, |
408 | 408 | { |
409 | 409 | "cell_type": "markdown", |
410 | | - "id": "a18ddd54", |
| 410 | + "id": "554b3a48", |
411 | 411 | "metadata": { |
412 | 412 | "editable": true |
413 | 413 | }, |
|
419 | 419 | }, |
420 | 420 | { |
421 | 421 | "cell_type": "markdown", |
422 | | - "id": "1a1afaf9", |
| 422 | + "id": "77bfdd5c", |
423 | 423 | "metadata": { |
424 | 424 | "editable": true |
425 | 425 | }, |
|
434 | 434 | "code for classification and pertinent results against a similar code using **Scikit-Learn** or **tensorflow/keras** or **pytorch**.\n", |
435 | 435 | "\n", |
436 | 436 | "If you have time, you can use the functionality of **scikit-learn** and compare your neural network results with those from Logistic regression. This is optional.\n", |
437 | | - "The weblink here <https://medium.com/ai-in-plain-english/comparison-between-logistic-regression-and-neural-networks-in-classifying-digits-dc5e85cd93c3>compares logistic regression and FFNN using the so-called MNIST data set. You may find several useful hints and ideas from this article. Your neural network code can implement the equivalent of logistic regression by simply setting the number of hidden layers to zero. \n", |
| 437 | + "The weblink here <https://medium.com/ai-in-plain-english/comparison-between-logistic-regression-and-neural-networks-in-classifying-digits-dc5e85cd93c3> compares logistic regression and FFNN using the MNIST data set. You may find several useful hints and ideas in this article. Your neural network code can implement the equivalent of logistic regression by simply setting the number of hidden layers to zero and keeping just the input and the output layers.  \n", |
438 | 438 | "\n", |
439 | 439 | "If you wish to compare with, say, Logistic Regression from **scikit-learn**, the following code uses the above data set" |
440 | 440 | ] |
441 | 441 | }, |
442 | 442 | { |
443 | 443 | "cell_type": "code", |
444 | 444 | "execution_count": 4, |
445 | | - "id": "3c37cbaf", |
| 445 | + "id": "eaa9e72e", |
446 | 446 | "metadata": { |
447 | 447 | "collapsed": false, |
448 | 448 | "editable": true |
|
464 | 464 | }, |
465 | 465 | { |
466 | 466 | "cell_type": "markdown", |
467 | | - "id": "106b9303", |
| 467 | + "id": "c7ba883e", |
468 | 468 | "metadata": { |
469 | 469 | "editable": true |
470 | 470 | }, |
|
480 | 480 | }, |
481 | 481 | { |
482 | 482 | "cell_type": "markdown", |
483 | | - "id": "55da0d7f", |
| 483 | + "id": "595be693", |
| 484 | + "metadata": { |
| 485 | + "editable": true |
| 486 | + }, |
| 487 | + "source": [ |
| 488 | + "## Summary of methods to implement and analyze\n", |
| 489 | + "\n", |
| 490 | + "**Required Implementation:**\n", |
| 491 | + "1. Reuse the regression code and results from project 1; these will act as a benchmark for how well suited a neural network is to this regression task.\n", |
| 492 | + "\n", |
| 493 | + "2. Implement a neural network with\n", |
| 494 | + "\n", |
| 495 | + " * A flexible number of layers\n", |
| 496 | + "\n", |
| 497 | + " * A flexible number of nodes in each layer\n", |
| 498 | + "\n", |
| 499 | + " * A changeable activation function in each layer (Sigmoid, ReLU, LeakyReLU, as well as Linear and Softmax)\n", |
| 500 | + "\n", |
| 501 | + "   * A changeable cost function, which will be set to MSE for regression and cross-entropy for multi-class classification\n", |
| 502 | + "\n", |
| 503 | + "   * An optional L1 or L2 norm of the weights and biases in the cost function (used only when computing gradients, not when reporting metrics)\n", |
| 504 | + "\n", |
| 505 | + "3. Implement the back-propagation algorithm to compute the gradient of your neural network\n", |
| 506 | + "\n", |
| 507 | + "4. Reuse the implementation of Plain and Stochastic Gradient Descent from Project 1 (and adapt the code to work with your neural network)\n", |
| 508 | + "\n", |
| 509 | + " * With no optimization algorithm\n", |
| 510 | + "\n", |
| 511 | + " * With RMS Prop\n", |
| 512 | + "\n", |
| 513 | + " * With ADAM\n", |
| 514 | + "\n", |
| 515 | + "5. Implement scaling and train-test splitting of your data, preferably using **scikit-learn**\n", |
| 516 | + "\n", |
| 517 | + "6. Implement and compute metrics like the MSE and Accuracy" |
| 518 | + ] |
| 519 | + }, |
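Items 2–3 above can be compressed into a short sketch. Assumptions here (not prescribed by the project text): numpy only, sigmoid hidden layers with a linear output, MSE loss, and plain full-batch gradient descent; the layer sizes and learning rate are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """Feed-forward net with a flexible list of layer sizes."""
    def __init__(self, sizes, seed=0):
        rng = np.random.default_rng(seed)
        self.W = [rng.normal(0.0, 0.5, (m, n))
                  for m, n in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(n) for n in sizes[1:]]

    def forward(self, X):
        self.a = [X]  # store activations for backpropagation
        for l, (W, b) in enumerate(zip(self.W, self.b)):
            z = self.a[-1] @ W + b
            # sigmoid on hidden layers, identity on the output layer
            self.a.append(z if l == len(self.W) - 1 else sigmoid(z))
        return self.a[-1]

    def train_step(self, X, y, lr):
        """One step of plain gradient descent with hand-written backprop."""
        delta = (self.forward(X) - y) / len(X)  # grad of (1/2N)*sum-of-squares at a linear output
        for l in reversed(range(len(self.W))):
            gW, gb = self.a[l].T @ delta, delta.sum(axis=0)
            if l > 0:  # propagate through the sigmoid of the layer below
                delta = (delta @ self.W[l].T) * self.a[l] * (1.0 - self.a[l])
            self.W[l] -= lr * gW
            self.b[l] -= lr * gb

# Fit the 1-D Runge function from project 1
X = np.linspace(-1, 1, 100).reshape(-1, 1)
y = 1.0 / (1.0 + 25.0 * X**2)
net = MLP([1, 20, 1])
mse_before = np.mean((net.forward(X) - y) ** 2)
for _ in range(2000):
    net.train_step(X, y, lr=0.5)
mse_after = np.mean((net.forward(X) - y) ** 2)
```

Swapping the `sizes` list (e.g. `[1, 8, 8, 1]`) changes depth and width without touching the training loop; the other required activations, cost functions, and optimizers would slot into `forward` and `train_step`.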
| 520 | + { |
| 521 | + "cell_type": "markdown", |
| 522 | + "id": "35138b41", |
| 523 | + "metadata": { |
| 524 | + "editable": true |
| 525 | + }, |
| 526 | + "source": [ |
| 527 | + "### Required Analysis:\n", |
| 528 | + "\n", |
| 529 | + "1. Briefly show and argue for the advantages and disadvantages of the methods from Project 1.\n", |
| 530 | + "\n", |
| 531 | + "2. Explore and show the impact of changing the number of layers, nodes per layer, choice of activation function, and inclusion of L1 and L2 norms. Present only the most interesting results from this exploration. 2D heatmaps are well suited for this: start by finding a well-performing set of hyper-parameters, then change two at a time over a range that shows both good and bad performance.\n", |
| 532 | + "\n", |
| 533 | + "3. Show and argue for the advantages and disadvantages of using a neural network for regression on your data\n", |
| 534 | + "\n", |
| 535 | + "4. Show and argue for the advantages and disadvantages of using a neural network for classification on your data\n", |
| 536 | + "\n", |
| 537 | + "5. Show and argue for the advantages and disadvantages of the different gradient methods and learning rates when training the neural network" |
| 538 | + ] |
| 539 | + }, |
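The two-at-a-time scan behind the heatmaps in analysis item 2 can be organized as a small grid. In this sketch, `evaluate` is a hypothetical stand-in for "train the network with these settings and return the test MSE"; a dummy analytic surface is used here only so the snippet runs on its own.

```python
import numpy as np

learning_rates = np.logspace(-4, 0, 5)   # one heatmap axis
hidden_nodes = [5, 10, 20, 50]           # the other heatmap axis

def evaluate(lr, n_hidden):
    # Hypothetical placeholder: replace with "train the network with
    # (lr, n_hidden) and return the test-set MSE".
    return (np.log10(lr) + 2.0) ** 2 + 1.0 / n_hidden

# One score per (learning rate, node count) pair
scores = np.array([[evaluate(lr, n) for n in hidden_nodes]
                   for lr in learning_rates])
best = np.unravel_index(np.argmin(scores), scores.shape)
best_lr, best_n = learning_rates[best[0]], hidden_nodes[best[1]]
# `scores` can be handed directly to matplotlib's imshow or seaborn's heatmap
```

Keeping the scan as a 2-D array makes it trivial to re-plot the same grid for MSE, accuracy, or training time.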
| 540 | + { |
| 541 | + "cell_type": "markdown", |
| 542 | + "id": "b18bea03", |
| 543 | + "metadata": { |
| 544 | + "editable": true |
| 545 | + }, |
| 546 | + "source": [ |
| 547 | + "### Optional (Note that you should include at least two of these in the report):\n", |
| 548 | + "\n", |
| 549 | + "1. Implement Logistic Regression as a simple classification model (equivalent to a Neural Network with just the output layer)\n", |
| 550 | + "\n", |
| 551 | + "2. Compute the gradient of the neural network with autograd, to show that it gives the same result as your hand-written backpropagation.\n", |
| 552 | + "\n", |
| 553 | + "3. Compare your results with those from a machine-learning library like **pytorch** (see <https://docs.pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html>)\n", |
| 554 | + "\n", |
| 555 | + "4. Use a more complex classification dataset instead, like the fashion MNIST (see <https://www.kaggle.com/datasets/zalando-research/fashionmnist>)\n", |
| 556 | + "\n", |
| 557 | + "5. Use a more complex regression dataset instead, like the two-dimensional Runge function $f(x,y)=\\left[(10x - 5)^2 + (10y - 5)^2 + 1 \\right]^{-1}$, or even more complicated two-dimensional functions (see the supplementary material of <https://www.nature.com/articles/s41467-025-61362-4> for an extensive list of two-dimensional functions). \n", |
| 558 | + "\n", |
| 559 | + "6. Compute and interpret a confusion matrix of your best classification model (see <https://www.researchgate.net/figure/Confusion-matrix-of-MNIST-and-F-MNIST-embeddings_fig5_349758607>)" |
| 560 | + ] |
| 561 | + }, |
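Optional item 2 (verifying backpropagation against an automatic gradient) can be rehearsed library-free: the sketch below checks a hand-written gradient against central finite differences on a one-layer toy model. This is a stand-in, not the full network; with autograd installed, `autograd.grad(loss)` would replace the finite-difference loop.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    """MSE of a one-layer 'network' sigmoid(X @ w)."""
    return np.mean((sigmoid(X @ w) - y) ** 2)

def grad_analytic(w, X, y):
    # Hand-written gradient: dMSE/dw = X^T [2 (a - y) a (1 - a)] / N
    a = sigmoid(X @ w)
    return X.T @ (2.0 * (a - y) * a * (1.0 - a)) / len(X)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
y = rng.uniform(size=30)
w = rng.normal(size=4)

# Central finite differences, one parameter at a time
eps = 1e-6
g_num = np.array([(loss(w + eps * e, X, y) - loss(w - eps * e, X, y)) / (2 * eps)
                  for e in np.eye(4)])
max_err = np.max(np.abs(grad_analytic(w, X, y) - g_num))
```

The same per-parameter check extends to each weight matrix of the full network; agreement to roughly `eps**2` is the expected signature of a correct backprop implementation.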
| 562 | + { |
| 563 | + "cell_type": "markdown", |
| 564 | + "id": "580d8424", |
484 | 565 | "metadata": { |
485 | 566 | "editable": true |
486 | 567 | }, |
|
496 | 577 | }, |
497 | 578 | { |
498 | 579 | "cell_type": "markdown", |
499 | | - "id": "d3731e2c", |
| 580 | + "id": "96f5c67e", |
500 | 581 | "metadata": { |
501 | 582 | "editable": true |
502 | 583 | }, |
|
527 | 608 | }, |
528 | 609 | { |
529 | 610 | "cell_type": "markdown", |
530 | | - "id": "6c7c5340", |
| 611 | + "id": "d1bc28ba", |
531 | 612 | "metadata": { |
532 | 613 | "editable": true |
533 | 614 | }, |
|