Adding a folder "handeul_codes" #7

handeulson · 2023-12-03T17:04:17Z

In requirements.txt, I added only "scikit-learn". I wrote a simple machine learning script using an SVM (support vector machine).

The test size was 0.2, and the accuracy was 47.22 %.

I also added the zip archive of NumPy files that I used.
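For readers without the data at hand, the setup described above (an SVM evaluated on a 20 % hold-out split) can be sketched roughly as follows. This is only an illustration under stated assumptions: the random arrays stand in for the real samples, and the RBF kernel and the class count are guesses, not taken from the pull request.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 200 samples, 10 features, 4 classes.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 10))
labels = rng.integers(0, 4, size=200)

# Hold out 20 % of the samples for testing, as in the pull request.
X_train, X_test, y_train, y_test = train_test_split(
    data, labels, test_size=0.2, random_state=42
)

# Fit a support vector classifier; the kernel choice is an assumption.
clf = SVC(kernel="rbf")
clf.fit(X_train, y_train)

# Fraction of correctly classified test samples.
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Accuracy: {accuracy * 100:.2f} %")
```

On random labels like these the score is expected to hover around chance level, so the number printed here says nothing about the real dataset.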

ColinMoldenhauer

Looks pretty good to me. I would suggest some minor changes:

remove the data from your pull request

optional:

use TreeClassifPreprocessedDataset instead of your loop solution

ColinMoldenhauer · 2023-12-04T09:46:30Z

handeul_codes/model_SVM(Support_Vector_Machine).py

+labels = np.array(labels)
+
+# Split the data into training and testing sets
+X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)


potentially add a seed so that the experiment is repeatable
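As a small illustration of why a seed helps, the sketch below splits the same toy arrays twice with an identical random_state and checks that both splits match. The toy arrays are placeholders, not the project data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in data (assumption): 20 samples with 2 features each.
data = np.arange(40).reshape(20, 2)
labels = np.arange(20) % 2

# Two independent calls with the same seed.
split_a = train_test_split(data, labels, test_size=0.2, random_state=42)
split_b = train_test_split(data, labels, test_size=0.2, random_state=42)

# Compare X_train, X_test, y_train, y_test pairwise.
same_split = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same_split)
```

With a fixed seed, the reported accuracy can be reproduced by anyone rerunning the script on the same data.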

ColinMoldenhauer · 2023-12-04T09:50:56Z

handeul_codes/1123_delete_nan_samples_nanmean_B2.zip

I personally would avoid adding data to Git, because it will slow the repository down.

As mentioned in Chris' pull request, please also make sure the data is no longer in your Git history by using git rebase (see this link).

ColinMoldenhauer · 2023-12-04T09:59:22Z

handeul_codes/model_SVM(Support_Vector_Machine).py

+# Specify data folder direction
+data_dir = '/Users/handerson/Desktop/Codes/DatSciEO-main/data/1123_delete_nan_samples_nanmean_B2/'
+
+for num, species in enumerate(tree_species):


This probably works well. You could also use os.listdir(data_dir) to get all files directly, so that you don't need a while True: block.
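The os.listdir idea could look roughly like the sketch below. The file naming scheme ("<species>-<index>.npy") and the helper name load_samples are hypothetical, since the actual layout of the data folder is not shown in this pull request; the demo creates a few dummy files in a temporary directory so the snippet runs on its own.

```python
import os
import tempfile

import numpy as np


def load_samples(data_dir, tree_species):
    """Load every .npy file in data_dir and derive its label from the file name.

    Hypothetical helper: assumes names of the form "<species>-<index>.npy".
    """
    data, labels = [], []
    for fname in sorted(os.listdir(data_dir)):
        if not fname.endswith(".npy"):
            continue  # skip anything that is not a NumPy file
        species = fname.split("-")[0]
        if species not in tree_species:
            continue  # skip species that are not in the class list
        data.append(np.load(os.path.join(data_dir, fname)))
        labels.append(tree_species.index(species))
    return np.array(data), np.array(labels)


# Demo with dummy files (placeholder species names).
with tempfile.TemporaryDirectory() as tmp:
    for name in ["Abies-0.npy", "Abies-1.npy", "Quercus-0.npy"]:
        np.save(os.path.join(tmp, name), np.zeros(3))
    data, labels = load_samples(tmp, ["Abies", "Quercus"])

print(data.shape, labels)
```

Iterating over the directory listing once removes the need to probe file indices until one is missing.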

Alternatively, you could make use of the new dataset class TreeClassifPreprocessedDataset, which already implements the logic of mapping files to class labels. So you could probably use something like

ds = TreeClassifPreprocessedDataset(data_dir)
for data_, label_ in ds:
    data.append(data_)
    labels.append(label_)

handeulson added 3 commits December 3, 2023 17:53

Create requirements.txt

fb16039

Add files via upload

6adcab5

Add files via upload

8432737

ColinMoldenhauer requested changes Dec 4, 2023


handeulson and others added 2 commits December 16, 2023 12:06

Delete handeul_codes/1123_delete_nan_samples_nanmean_B2.zip

86d2d41

Version 2

bbc0662


ColinMoldenhauer left a comment • edited by JiangyuanWangYi
