-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
ff53412
commit 3213f95
Showing
3 changed files
with
66 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
Exercise 7.1 | ||
|
||
a. | ||
Exercise 1 | ||
cities["Begins with San and 50+ sq. miles"] = (cities["City name"].apply(lambda name: name.startswith("San")) & cities["Area square miles"].apply(lambda area: area > 50)) | ||
cities | ||
|
||
Exercise 2 | ||
cities.reindex(np.random.permutation(cities.index + 5)) | ||
|
||
b. Pandas provides data in a more understandable fashion than NumPy; it provides labeled columns and indexing on the rows, whereas it's very easy to lose track of which column represents what with NumPy. The 2-dimensional DataFrame object is also very convenient. | ||
|
||
c. Before random sampling would be one such scenario; therefore, instead of sampling at random indices or risking any organizational non-randomness, one could just randomly reorder the database and take the first X values. | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
Exercise 7.2 | ||
|
||
a. Numerical data is data with number values (1, 100.4, 55). Categorical data is data with discrete labels as values ("True" or "False", "Dog" or "Cat" or "Parrot"). | ||
|
||
b. | ||
|
||
Task 1: | ||
|
||
train_model( | ||
learning_rate=0.00003, | ||
steps=500, | ||
batch_size=5 | ||
) | ||
|
||
Task 2: | ||
|
||
train_model( | ||
learning_rate=0.00003, | ||
steps=500, | ||
batch_size=5, | ||
input_feature="population" | ||
) | ||
|
||
c. Hyper-parameters are parameters which dictate how the machine learning algorithm learns, as opposed to the actual data parameters. Put a different way, hyper-parameters are developer-defined with values such that the regular parameters can be better defined by the machine learning algorithm. | ||
|
||
There is no standard tuning algorithm for hyper-parameters, since the effects of various hyper-parameters is data dependent. So the only way to truly test their validity is by testing on the data. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
Exercise 7.3 | ||
|
||
a. | ||
|
||
Task 1 | ||
|
||
california_housing_dataframe["rooms_per_person"] = california_housing_dataframe["total_rooms"] / california_housing_dataframe["population"] | ||
|
||
calibration_data = train_model( | ||
learning_rate=0.04, | ||
steps=500, | ||
batch_size=5, | ||
input_feature="rooms_per_person" | ||
) | ||
|
||
Task 2 | ||
|
||
plt.scatter(calibration_data["predictions"], calibration_data["targets"]) | ||
|
||
Task 3 | ||
|
||
california_housing_dataframe["rooms_per_person"].apply(lambda x: min(5, x)) | ||
|
||
b. Synthetic features which combine two other features allow us to explore how well a combination of multiple parameters predicts or relates to a single target parameter. | ||
|
||
c. Outliers are pieces of data with values that very much stand out from the majority of the data. The usual action with outliers is not to get rid of them entirely, but rather to set them to a more acceptable reasonable maximum or minimum value. |