Showing 3 changed files with 32 additions and 0 deletions.
Exercise 9.1
a. The linear regression approach is not very effective. Instead of decreasing steadily as the periods go by, the RMSE jumps up and down, seemingly at random, and the validation-set RMSE unsurprisingly stays consistently above the training-set RMSE.
b. L2 loss, which the linear regression approach uses, doesn't penalize misclassifications well when the output is a probability. LogLoss does: it penalizes a confidently wrong probability far more heavily than a marginally wrong one, so the penalty tracks the severity of the misclassification, whereas L2 loss barely distinguishes between the two. (A small numeric sketch comparing the two losses follows this exercise.)
c. Logistic regression is much more effective on this example than linear regression was. Instead of the erratic error curve above, we get the behavior we want: the error decreases steadily and more or less flattens out as the periods increase.
d. linear_classifier = train_linear_classifier_model(
       learning_rate=0.00001,
       steps=20000,
       batch_size=1000,
       training_examples=training_examples,
       training_targets=training_targets,
       validation_examples=validation_examples,
       validation_targets=validation_targets)

   AUC on the validation set: 0.81
   Accuracy on the validation set: 0.79
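
To make the point in (b) concrete, here is a minimal numeric sketch; it is not part of the exercise notebook, and the prediction values are made up purely for illustration.

import math

def l2_loss(label, prediction):
    # Squared error between the true label (0 or 1) and the predicted probability.
    return (label - prediction) ** 2

def log_loss(label, prediction):
    # Binary cross-entropy (LogLoss) for a single example.
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))

# True label is 1; compare a mildly wrong prediction with a confidently wrong one.
for prediction in (0.4, 0.01):
    print(f"prediction={prediction:.2f}  "
          f"L2={l2_loss(1, prediction):.3f}  "
          f"LogLoss={log_loss(1, prediction):.3f}")

# prediction=0.40  L2=0.360  LogLoss=0.916
# prediction=0.01  L2=0.980  LogLoss=4.605

Going from a mildly wrong to a confidently wrong prediction roughly triples the L2 penalty but increases the LogLoss penalty about fivefold, which is the behavior described in (b).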
Exercise 9.2
a. Regularizing reduces the number of parameters that actually contribute to the model. This makes the model simpler, which helps avoid overfitting and makes the model cheaper to store and serve.
b. L1 regularization encourages the weights on parameters to go to exactly 0. That removes the corresponding features from playing any role in the model, which is what increases sparsity. (A small sketch contrasting L1 and L2 updates follows this exercise.)
c. LogLoss: 0.25
   Model Size: 553
   Gamma: 0.85
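
As a small illustration of (b), the following self-contained NumPy sketch contrasts repeated L1-style soft-thresholding updates with L2-style shrinkage on a made-up weight vector; it is only a toy stand-in for the optimizer the exercise actually uses.

import numpy as np

weights = np.array([0.80, 0.05, -0.03, 0.40, -0.01])  # made-up starting weights
learning_rate = 0.1
reg_strength = 0.5  # illustrative regularization strength

# L2 penalty: each step shrinks every weight toward zero, but never reaches exactly zero.
l2_weights = weights.copy()
for _ in range(10):
    l2_weights -= learning_rate * reg_strength * l2_weights

# L1 penalty (soft-thresholding): small weights are pushed to exactly zero.
l1_weights = weights.copy()
for _ in range(10):
    step = learning_rate * reg_strength
    l1_weights = np.sign(l1_weights) * np.maximum(np.abs(l1_weights) - step, 0.0)

print("L2:", np.round(l2_weights, 4))  # everything small but nonzero
print("L1:", np.round(l1_weights, 4))  # exact zeros -> a sparser model

Only the largest weight survives the L1 updates; the rest land at exactly zero, which is why an L1 penalty shrinks model size in a way an L2 penalty does not.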
Exercise 9.3 | ||
a. 3 hidden layers, 16 units each: essentially no change in accuracy; slightly worse at 0.88008.
   1 hidden layer, 16 units: also close to no change; slightly better at 0.88528.
   2 hidden layers, 32 units each: worse; accuracy drops to 0.87472.
   2 hidden layers, 64 units each: worse; accuracy drops to 0.87788.
   2 hidden layers, 16 units, mse loss: slightly better; accuracy 0.8856.
   2 hidden layers, 16 units, binary_crossentropy, tanh activation: worse; accuracy 0.87684.

Substituting mse for binary_crossentropy and using a single 16-unit hidden layer each produced a small increase in accuracy. I'd assume that two hidden layers overfit slightly compared to one, which is why one layer performs better. I don't quite understand why mse outperforms binary_crossentropy on this example; since this is a classification problem rather than a regression problem, binary_crossentropy should be the better fit.
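
For reference, here is a hedged Keras sketch of the best-performing variant above (one 16-unit hidden layer), compiled once with binary_crossentropy and once with mse. The input dimension and the commented-out training call are placeholders, not the exercise's actual dataset.

from tensorflow import keras
from tensorflow.keras import layers

INPUT_DIM = 10000  # placeholder; replace with the exercise's feature width

def build_model(loss):
    # One 16-unit hidden layer and a sigmoid output for binary classification.
    model = keras.Sequential([
        layers.Dense(16, activation="relu", input_shape=(INPUT_DIM,)),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="rmsprop", loss=loss, metrics=["accuracy"])
    return model

# Train the same architecture under both losses and compare validation accuracy.
for loss in ("binary_crossentropy", "mse"):
    model = build_model(loss)
    # model.fit(x_train, y_train, epochs=4, batch_size=512,
    #           validation_data=(x_val, y_val))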