Hi,
I noticed this as well.
The original code fills every NaN value in the dataframe with the mean of the first continuous column:
X.fillna(X.loc[train_indices, col].mean(), inplace=True)
I would recommend this instead, which fills the NaNs in each column with the mean of that same column:
X[col].fillna(X.loc[train_indices, col].mean(), inplace=True)
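To make the difference concrete, here is a minimal sketch on toy data. The dataframe, the column names "a" and "b", and the values of train_indices and cont_columns are all made up for illustration and are not taken from the SAINT repository. The sketch also assigns the filled column back instead of using inplace=True on a column selection, since recent pandas versions may not propagate an in-place fill on X[col] to X; that substitution is my assumption about what works most reliably, not something stated in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; names and split are illustrative only.
X = pd.DataFrame({
    "a": [1.0, 3.0, np.nan, 5.0],
    "b": [10.0, np.nan, 30.0, np.nan],
})
train_indices = [0, 1]        # assumed training rows
cont_columns = ["a", "b"]     # assumed continuous columns

# Frame-wide fill: a single scalar (the training-row mean of "a")
# is written into the NaN cells of every column, including "b".
frame_wide = X.fillna(X.loc[train_indices, "a"].mean())

# Per-column fill: each column receives its own training-row mean.
per_column = X.copy()
for col in cont_columns:
    col_mean = X.loc[train_indices, col].mean()
    per_column[col] = per_column[col].fillna(col_mean)

print(frame_wide)
print(per_column)
```

In the frame-wide result, the missing entries of "b" are filled with the mean of "a", which is the behaviour described above. In the per-column result, they are filled with the training-row mean of "b".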
Hello, I'm a beginner interested in tabular learning. Your superb paper, SAINT, impressed me a lot, but I've run into some problems while studying your code.
For saint/old_version/data.py (line 233 in e288e84) and saint/data_openml.py (line 89 in e288e84):
a) Why is train.loc[train_indices, col] rather than train.loc[:, col]?
Validation and test data may also contain NaN values.
b) Why is train.fillna rather than train[col].fillna?
As written, it may also fill NaN values in other columns.
I think the correct expression should be train[col].fillna(train.loc[:, col].mean(), inplace=True).
I'm not sure whether I am correct. I would appreciate it if you could reply. Thank you very much!
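On question (a), one possible reading, which the authors have not confirmed here, is that train_indices only restricts which rows the mean is computed from, while the fill itself still applies to every row. Under that reading, NaNs in validation and test rows are filled as well, just with a statistic that never looks at held-out values. A minimal sketch with a made-up column "x" and made-up split indices:

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; the split indices are illustrative only.
X = pd.DataFrame({"x": [1.0, 3.0, np.nan, 100.0, np.nan]})
train_indices = [0, 1, 2]   # assumed training rows
test_indices = [3, 4]       # assumed held-out rows

# Mean computed from training rows only (NaN is skipped by default).
train_mean = X.loc[train_indices, "x"].mean()

# Mean computed from all rows; the held-out value 100.0 influences it.
all_mean = X.loc[:, "x"].mean()

# The fill is applied to the whole column, so the NaN in the
# held-out row 4 is filled too, using the training-row mean.
filled = X["x"].fillna(train_mean)

print(train_mean, all_mean)
print(filled)
```

Computing the mean over all rows, as in train.loc[:, col].mean(), would let held-out values shape the imputation, which is usually avoided when the goal is an unbiased validation or test score.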