
I completed the 4 questions in the homework. #1

Open
Fooroogh-RM wants to merge 3 commits into main from assignment-1

Conversation

@Fooroogh-RM
Owner

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

What did you learn from the changes you have made?

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

How were these changes tested?

A reference to a related issue in your repository (if applicable)

Checklist

  • I can confirm that my changes are working as intended


@anjali-deshpande-hub anjali-deshpande-hub left a comment


Please correct this mistake:
Question 4: The knn object will not contain the best-neighbor model, so your final accuracy calculation is not evaluating the optimized model.

You ran GridSearchCV:

wine_grid = GridSearchCV(estimator=knn, param_grid=ParameterGrid, cv=10)
wine_grid.fit(...)


GridSearchCV internally finds the best model and stores it in:

wine_grid.best_estimator_

GridSearchCV fits clones of the estimator you pass in, so your original knn = KNeighborsClassifier(n_neighbors=5) never changes.
It will always have n_neighbors=5.

So when you run:

knn.score(wine_df_test..., wine_df_test['class'])

You are evaluating the non-optimized model (k=5), not the best model.

Instead you should do:

best_knn = KNeighborsClassifier(
    n_neighbors = wine_grid.best_params_['n_neighbors']
)
# Fit on training data
best_knn.fit(X, Y)

# Evaluate on test data
accuracy = best_knn.score(...)

Minor observation:
The call to cross_validate is not really required, since GridSearchCV performs the cross-validation itself.
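The review's point can be checked end to end. Below is a minimal runnable sketch, assuming scikit-learn's built-in wine dataset in place of the notebook's wine_df_train / wine_df_test, and an assumed candidate range of K = 1..20 (the actual grid in the homework may differ):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: sklearn's wine dataset instead of wine_df_train / wine_df_test
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)
param_grid = {"n_neighbors": list(range(1, 21))}  # assumed candidate K values
wine_grid = GridSearchCV(estimator=knn, param_grid=param_grid, cv=10)
wine_grid.fit(X_train, y_train)

# GridSearchCV fits clones, so the original estimator is untouched:
assert knn.n_neighbors == 5

# Rebuild and refit the tuned model, as the review suggests:
best_knn = KNeighborsClassifier(n_neighbors=wine_grid.best_params_["n_neighbors"])
best_knn.fit(X_train, y_train)
accuracy = best_knn.score(X_test, y_test)
```

Since GridSearchCV refits the winning model on the full training data by default (refit=True), evaluating wine_grid.best_estimator_ directly on the test set is an equivalent shortcut that avoids rebuilding the classifier by hand.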

@Fooroogh-RM
Owner Author


I pushed a new file.


@anjali-deshpande-hub anjali-deshpande-hub left a comment


The changes work.

Just a few observations about the changes:

  1. These lines were not required:
knn.fit(X, Y)
wine_df_test['prediction'] = knn.predict(wine_df_test[wine_df_train.columns[:-1]])
# I actually did not understand the subtrain. Is it needed to find out the optimal K-value?

When you run

wine_grid.fit(wine_df_train[wine_df_train.columns[:-1]], wine_df_train['class'])

you are calling .fit() on the GridSearchCV object, which trains multiple KNN models with different K values under cross-validation and selects the best one.

  2. In the future, please add comments to the PR.
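The first observation above can be sketched concretely. This is a minimal illustration, again assuming scikit-learn's built-in wine dataset in place of the notebook's dataframe and a small assumed grid of K values:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)

# A single GridSearchCV.fit call runs 10-fold CV for every candidate K
wine_grid = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}, cv=10)
wine_grid.fit(X, y)

# One mean cross-validated score per candidate K, so no separate
# cross_validate call is needed:
mean_scores = wine_grid.cv_results_["mean_test_score"]

# With refit=True (the default) the best model is already refit on all of
# X, y, so no extra knn.fit / knn.predict lines are required either:
preds = wine_grid.predict(X)
```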
