Exercise 11.1
a. Yes. Since a model's performance depends heavily on the amount of data available and on that data's particular structure, a linear model will certainly perform better than a deep neural network on some datasets. In particular, data that exhibits mostly linear dependencies is likely better suited to a linear model, if only because of the linear model's greater efficiency.
b. Yes. The linear model had an accuracy of 0.78612 on the training set and 0.78216 on the test set. The deep neural network had superior accuracy on both sets, at 0.88 and 0.8 respectively. (A sketch of the baseline linear model setup is shown below.)
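For reference, here is a minimal sketch of how the baseline linear model could be set up; the vocabulary list below is just a placeholder and the Adagrad settings are assumptions, not values taken from my actual run.

import tensorflow as tf

# Placeholder vocabulary; the exercise uses a much longer list of informative terms.
informative_terms = ("bad", "great", "best", "worst", "boring", "waste")

# Sparse categorical column over the review terms.
terms_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(
    key="terms", vocabulary_list=informative_terms)

# Assumed optimizer settings for the linear baseline, with gradient clipping.
my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

linear_classifier = tf.estimator.LinearClassifier(
    feature_columns=[terms_feature_column],
    optimizer=my_optimizer)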
c. Embeddings don't seem to help much on this sentiment analysis task. The accuracy after adding an embedding was barely different from the linear model's, at 0.78588 on the training set and 0.78252 on the test set. This is counterintuitive to me, since I'd expect embeddings to group similar words together, which should be quite helpful in sentiment analysis, but the data seems to indicate otherwise.
d. "Worst" and "waste" have very similar embeddings. This makes sense: the movies that feel the worst usually aren't the ones so bad that they become entertaining in their badness, but rather the ones that feel like a total waste of time. So it's natural that the movies described as the "worst" are often the same ones described as a "waste." (A sketch of how the two embedding vectors could be compared is shown below.)
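Here is a rough sketch of how the trained embedding vectors for two terms could be compared. The embedding variable path and the informative_terms vocabulary list are assumptions about the exercise notebook, not something shown in this answer.

import numpy as np

# Assumed variable path for the learned embedding weights inside the DNN;
# classifier.get_variable_names() lists the actual names if this differs.
embedding_matrix = classifier.get_variable_value(
    "dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights")

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# informative_terms is the vocabulary list used to build the terms column.
worst_vector = embedding_matrix[informative_terms.index("worst")]
waste_vector = embedding_matrix[informative_terms.index("waste")]
print(cosine_similarity(worst_vector, waste_vector))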
e.

import tensorflow as tf

# Map the sparse terms column (built earlier in the exercise) into a
# 10-dimensional learned embedding.
terms_embedding_column = tf.feature_column.embedding_column(terms_feature_column, dimension=10)
feature_columns = [terms_embedding_column]

# Adam with a fairly high learning rate, plus gradient clipping for stability.
my_optimizer = tf.train.AdamOptimizer(learning_rate=0.15)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# DNN with two hidden layers of 5 units each.
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[5, 5],
    optimizer=my_optimizer
)
Training set metrics:
accuracy                0.94868
accuracy_baseline       0.5
auc                     0.98791265
auc_precision_recall    0.98873836
average_loss            0.15073076
label/mean              0.5
loss                    3.768269
precision               0.96270937
prediction/mean         0.46732095
recall                  0.93352
global_step             1000

Test set metrics:
accuracy                0.87188
accuracy_baseline       0.5
auc                     0.94225335
auc_precision_recall    0.9421878
average_loss            0.37323484
label/mean              0.5
loss                    9.330871
precision               0.9028512
prediction/mean         0.4441259
recall                  0.83344
global_step             1000
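The two metric dumps above are the dictionaries returned by the estimator's evaluate calls; here is a minimal sketch of how they might have been produced. The input function names are assumptions, and steps=1000 matches the global_step shown above.

# train_input_fn / test_input_fn are assumed to be the exercise's input
# functions over the IMDB training and test sets.
classifier.train(input_fn=train_input_fn, steps=1000)

training_metrics = classifier.evaluate(input_fn=train_input_fn, steps=1000)
test_metrics = classifier.evaluate(input_fn=test_input_fn, steps=1000)

print("Training set metrics:")
for name, value in sorted(training_metrics.items()):
    print(name, value)

print("Test set metrics:")
for name, value in sorted(test_metrics.items()):
    print(name, value)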
f. I skipped this section.
Exercise 11.2
a. Examining the histogram for race, we can see that ~85% of the examples are White. This is indicative of data skew, as we would expect a more racially balanced set of examples. The same applies to the gender histogram: instead of the expected roughly 50/50 female-male split, about 67% of the examples are male. (A sketch of how these proportions could be checked directly is shown below.)
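The exercise surfaces these histograms through Facets, but the same proportions could be checked directly with pandas. The column names and download URL below follow the standard UCI Adult (census income) dataset and are assumptions on my part, not something recorded in this answer.

import pandas as pd

# Column names for the UCI Adult (census income) dataset.
COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num",
           "marital_status", "occupation", "relationship", "race", "gender",
           "capital_gain", "capital_loss", "hours_per_week", "native_country",
           "income_bracket"]

train_df = pd.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
    names=COLUMNS, sep=r"\s*,\s*", engine="python", na_values="?")

# Fraction of examples per race and per gender; this is where the ~85% White
# and ~67% male skew shows up.
print(train_df["race"].value_counts(normalize=True))
print(train_df["gender"].value_counts(normalize=True))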
b. With Binning | X-Axis set to gender and both Color By and Label By set to income_bracket, it becomes obvious that the percentage of males in the >50K income bracket is much higher than the corresponding percentage for females. This indicates that high-income females are underrepresented in our data.
c.
d. The model seems to perform significantly better for White and Asian-Pac-Islander than for Black or Amer-Indian-Eskimo, with the former two having noticeably higher Precision and Recall than the latter two. It's also worth noting that the False Positive Rate is much higher for the former two than for the latter two; this indicates that when the true income is <=50K, the model is much more likely to place White and Asian-Pac-Islander examples in the upper income bracket than Black and Amer-Indian-Eskimo examples. The confusion matrices and metrics for each subgroup are below.
Confusion matrices (rows: actual >50K, actual <=50K; columns: predicted >50K, predicted <=50K):

White

1893 (TP)   1475 (FN)
 779 (FP)   8823 (TN)

Precision   Recall   False Positive Rate
0.7085      0.5621   0.0811

Black

  77 (TP)     91 (FN)
  45 (FP)   1198 (TN)

Precision   Recall   False Positive Rate
0.6311      0.4583   0.0362

Asian-Pac-Islander

  76 (TP)     45 (FN)
  32 (FP)    255 (TN)

Precision   Recall   False Positive Rate
0.7037      0.6281   0.1115

Amer-Indian-Eskimo

   8 (TP)     11 (FN)
   5 (FP)    125 (TN)

Precision   Recall   False Positive Rate
0.6154      0.4211   0.0385
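As a sanity check, the three metrics follow directly from the confusion matrix counts; here is that arithmetic as a small sketch, using the White subgroup's counts from the table above.

# Precision, recall, and false positive rate from confusion matrix counts.
def rates(tp, fn, fp, tn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)    # true positive rate
    fpr = fp / (fp + tn)       # false positive rate
    return precision, recall, fpr

# White subgroup counts.
print(rates(tp=1893, fn=1475, fp=779, tn=8823))
# -> approximately (0.7085, 0.5621, 0.0811)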