Showing 3 changed files with 32 additions and 0 deletions.
Exercise 9.1
a. The linear regression approach is not very effective. Instead of decreasing steadily as the periods go by, the RMSE jumps up and down, seemingly at random, and the validation-set RMSE unsurprisingly stays consistently above the training-set RMSE.
b. L2 loss, which the linear regression approach uses, doesn't penalize misclassifications well when the output is a probability. LogLoss does: it penalizes a confidently wrong probability far more heavily than a marginally wrong one, so the penalty tracks the severity of the misclassification, whereas L2 loss barely distinguishes between the two. (A small numeric sketch comparing the two losses follows this exercise.)
c. Logistic regression is much more effective on this example than linear regression was. Instead of the erratic error curve above, we get the behavior we want: the error decreases steadily and more or less flattens out as the periods increase.
d. linear_classifier = train_linear_classifier_model(
       learning_rate=0.00001,
       steps=20000,
       batch_size=1000,
       training_examples=training_examples,
       training_targets=training_targets,
       validation_examples=validation_examples,
       validation_targets=validation_targets)

   AUC on the validation set: 0.81
   Accuracy on the validation set: 0.79
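
To make the point in (b) concrete, here is a minimal numeric sketch; it is not part of the exercise notebook, and the prediction values are made up purely for illustration.

import math

def l2_loss(label, prediction):
    # Squared error between the true label (0 or 1) and the predicted probability.
    return (label - prediction) ** 2

def log_loss(label, prediction):
    # Binary cross-entropy (LogLoss) for a single example.
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))

# True label is 1; compare a mildly wrong prediction with a confidently wrong one.
for prediction in (0.4, 0.01):
    print(f"prediction={prediction:.2f}  "
          f"L2={l2_loss(1, prediction):.3f}  "
          f"LogLoss={log_loss(1, prediction):.3f}")

# prediction=0.40  L2=0.360  LogLoss=0.916
# prediction=0.01  L2=0.980  LogLoss=4.605

Going from a mildly wrong to a confidently wrong prediction roughly triples the L2 penalty but increases the LogLoss penalty about fivefold, which is the behavior described in (b).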
Exercise 9.2
a. Regularizing reduces the number of parameters that actually contribute to the model. This makes the model simpler, which helps avoid overfitting and makes the model cheaper to store and serve.
b. L1 regularization encourages the weights on parameters to go to exactly 0. That removes the corresponding features from playing any role in the model, which is what increases sparsity. (A small sketch contrasting L1 and L2 updates follows this exercise.)
c. LogLoss: 0.25
   Model Size: 553
   Gamma: 0.85
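
As a small illustration of (b), the following self-contained NumPy sketch contrasts repeated L1-style soft-thresholding updates with L2-style shrinkage on a made-up weight vector; it is only a toy stand-in for the optimizer the exercise actually uses.

import numpy as np

weights = np.array([0.80, 0.05, -0.03, 0.40, -0.01])  # made-up starting weights
learning_rate = 0.1
reg_strength = 0.5  # illustrative regularization strength

# L2 penalty: each step shrinks every weight toward zero, but never reaches exactly zero.
l2_weights = weights.copy()
for _ in range(10):
    l2_weights -= learning_rate * reg_strength * l2_weights

# L1 penalty (soft-thresholding): small weights are pushed to exactly zero.
l1_weights = weights.copy()
for _ in range(10):
    step = learning_rate * reg_strength
    l1_weights = np.sign(l1_weights) * np.maximum(np.abs(l1_weights) - step, 0.0)

print("L2:", np.round(l2_weights, 4))  # everything small but nonzero
print("L1:", np.round(l1_weights, 4))  # exact zeros -> a sparser model

Only the largest weight survives the L1 updates; the rest land at exactly zero, which is why an L1 penalty shrinks model size in a way an L2 penalty does not.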
Exercise 9.3 | ||
a. 3 hidden layers, 16 units each: essentially no change in accuracy; slightly worse at 0.88008.
   1 hidden layer, 16 units: also close to no change; slightly better at 0.88528.
   2 hidden layers, 32 units each: worse; accuracy drops to 0.87472.
   2 hidden layers, 64 units each: worse; accuracy drops to 0.87788.
   2 hidden layers, 16 units, mse loss: slightly better; accuracy 0.8856.
   2 hidden layers, 16 units, binary_crossentropy, tanh activation: worse; accuracy 0.87684.

Substituting mse for binary_crossentropy and using a single 16-unit hidden layer each produced a small increase in accuracy. I'd assume that two hidden layers overfit slightly compared to one, which is why one layer performs better. I don't quite understand why mse outperforms binary_crossentropy on this example; since this is a classification problem rather than a regression problem, binary_crossentropy should be the better fit.
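
For reference, here is a hedged Keras sketch of the best-performing variant above (one 16-unit hidden layer), compiled once with binary_crossentropy and once with mse. The input dimension and the commented-out training call are placeholders, not the exercise's actual dataset.

from tensorflow import keras
from tensorflow.keras import layers

INPUT_DIM = 10000  # placeholder; replace with the exercise's feature width

def build_model(loss):
    # One 16-unit hidden layer and a sigmoid output for binary classification.
    model = keras.Sequential([
        layers.Dense(16, activation="relu", input_shape=(INPUT_DIM,)),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="rmsprop", loss=loss, metrics=["accuracy"])
    return model

# Train the same architecture under both losses and compare validation accuracy.
for loss in ("binary_crossentropy", "mse"):
    model = build_model(loss)
    # model.fit(x_train, y_train, epochs=4, batch_size=512,
    #           validation_data=(x_val, y_val))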