assignment_1 #1

Open
Yutong2002 wants to merge 1 commit into main from assignment-1

@Yutong2002
Owner

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

I added a full KNN classification workflow on the Wine dataset, including: 1) Selected predictor variables and standardized them using StandardScaler. 2) Split the standardized data into training and test sets using train_test_split. 3) Initialized a KNeighborsClassifier and used GridSearchCV to run cross-validation. 4) Trained a final KNN model using the best n_neighbors value and evaluated its performance on the test set using accuracy_score.
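For reference, the four steps above could be sketched roughly as follows. This is a minimal sketch and not the notebook's actual code: it assumes the Wine data comes from scikit-learn's `load_wine`, and the candidate range for `n_neighbors`, the 5-fold setting, the 25% test size, and the random seed are illustrative choices.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: the Wine data is loaded from scikit-learn's bundled copy.
wine = load_wine(as_frame=True)
X = wine.data      # predictor variables
y = wine.target    # class labels (response variable, left unscaled)

# 1) Standardize the predictors (zero mean, unit variance per column).
X_std = pd.DataFrame(StandardScaler().fit_transform(X), columns=X.columns)

# 2) Split the standardized data into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X_std, y, test_size=0.25, random_state=123, stratify=y
)

# 3) Cross-validate over candidate n_neighbors values (illustrative grid).
param_grid = {"n_neighbors": list(range(1, 31))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)
best_k = grid.best_params_["n_neighbors"]

# 4) Train the final model with the best k and evaluate on the test set.
final_knn = KNeighborsClassifier(n_neighbors=best_k)
final_knn.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, final_knn.predict(X_test))

print(f"best n_neighbors: {best_k}")
print(f"test accuracy: {test_accuracy:.3f}")
```

The exact best `n_neighbors` and test accuracy depend on the split and the grid, so they will not necessarily match the notebook's results.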

What did you learn from the changes you have made?

I learned how to build a basic but complete supervised KNN model in scikit-learn starting from raw data.

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

At the step of splitting the data into training and test sets, I was thinking of using wine_df for the X value. However, I switched to standarized_dataframe, since the data has to be standardized before computation.
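A quick way to see why the raw data frame is a poor input for a distance-based model like KNN is to compare column spreads before and after scaling. The sketch below assumes the data comes from scikit-learn's `load_wine`; the name `standardized_df` is hypothetical and is not the notebook's variable.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

# Assumption: predictors come from scikit-learn's bundled Wine dataset.
wine_df = load_wine(as_frame=True).data

# Raw columns live on very different scales, so the widest columns
# would dominate Euclidean distances between samples.
raw_std = wine_df.std()
print("smallest raw std:", raw_std.idxmin(), round(raw_std.min(), 3))
print("largest raw std: ", raw_std.idxmax(), round(raw_std.max(), 3))

# After standardization every column has mean 0 and unit variance,
# so each predictor contributes comparably to the distance.
standardized_df = pd.DataFrame(
    StandardScaler().fit_transform(wine_df), columns=wine_df.columns
)
print(standardized_df.std(ddof=0).round(3).unique())
```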

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

N/A

How were these changes tested?

All scripts were tested in a Jupyter notebook in VS Code; correct results were generated.

A reference to a related issue in your repository (if applicable)

N/A

Checklist

  • [x] I can confirm that my changes are working as intended

@PatelVishakh left a comment

Assignment 1: Complete. Great work!
Suggested Changes:

Q1)III) The type of variable is categorical. In a data science setting, this question is asking whether the variable is continuous or categorical (integer and ordinal are other options).

Q2)II) We did not standardize the response variable Class because it is categorical, and standardization is not meaningful for non-numeric labels. More generally, we do not standardize the response variable because it is the quantity of interest, and we would have to rescale our predictions to make them usable.
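As a small illustration of the Q2)II) point, the sketch below scales only the predictor columns and carries the Class column over unchanged. The toy data frame, its column values, and the label strings are hypothetical and are not taken from the assignment data.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two numeric predictors and a categorical response.
df = pd.DataFrame(
    {
        "alcohol": [14.2, 13.2, 12.4, 13.7],
        "proline": [1065.0, 1050.0, 520.0, 735.0],
        "Class": ["class_0", "class_0", "class_1", "class_2"],
    }
)

# Standardize the predictors only; the response is excluded from scaling.
predictors = df.drop(columns="Class")
scaled = pd.DataFrame(
    StandardScaler().fit_transform(predictors),
    columns=predictors.columns,
    index=df.index,
)

# Re-attach the untouched labels so predictions stay in the original classes.
scaled["Class"] = df["Class"]
print(scaled)
```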
