Preprocess within cv #45
…rresponding to small90 and small10
|
I'm encountering some errors: I could (temporarily) resolve this one by commenting out that import. I don't have a file called plotting_functions.py in my trainer directory. As some extra info (I don't think it matters), I ran the code with the following arguments: |
|
Are you aware of those warnings? They occur for several of the variables. According to the documentation, fixing this can make the code run significantly faster, which might make it feasible to run on computers other than the DCCN cluster. It seems relatively simple to fix. |
|
Conceptually, the code seems alright to me (although I didn't read it very thoroughly), but I'm running into multiple errors for different combinations of arguments. For example, when I run
Btw, to get to this point I had hard-coded the train and validation datasets to just 3000 / 300 lines, because otherwise it took a very long time. I think it would be a good idea to run your code with all combinations of command line parameters (e.g., using both optimization and submission, and including/excluding the valid and test data). It's important that the code runs through before we merge into master. |
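One way to check every combination is to generate the argument lists programmatically. The snippet below is only a rough sketch: the option names (`--mode`, `--valid`, `--test`) and the script name `main.py` are placeholders I made up, not the real arguments of this repo, so they would need to be swapped for the actual ones.

```python
import itertools

# Hypothetical option names; replace with the real command line arguments.
MODES = ["optimization", "submission"]
OPTIONAL_FLAGS = ["--valid", "--test"]


def all_argument_combinations(modes=MODES, flags=OPTIONAL_FLAGS):
    """Return one argument list per combination of mode and optional flags."""
    combos = []
    for mode in modes:
        # Each optional flag is either absent (False) or present (True).
        for switches in itertools.product([False, True], repeat=len(flags)):
            args = ["--mode", mode]
            args += [flag for flag, on in zip(flags, switches) if on]
            combos.append(args)
    return combos


if __name__ == "__main__":
    # Print the commands instead of running them, so the list can be reviewed first.
    for args in all_argument_combinations():
        print("python main.py " + " ".join(args))
```

Each printed command could then be run on a small subset of the data (like the 3000 / 300 lines mentioned above) to check that it finishes without errors.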
|
I also don't really understand this line:
Because you're not providing the train data here, but preprocess_confidence does use the train data. |
|
I pushed something to fix the warnings related to the indexing |
|
Concerning the error: it actually arises in a part of main that I have not touched, and it also arises on master at the moment. I think we have a duplicated block of code: the blocks starting on lines 207 and 222 on master. So we have to fix this anyway. I can fix it in master and in this branch to avoid confusion. Concerning testing of all combinations of command line parameters. |
|
@johannadevos Thank you for your feedback, but those were all minor code things. What I am really wondering about is the actual output. Why does branch preprocess_within_cv score so high on validation data, while master never exceeds 0.95? I don't think I made major changes, but somehow the output is very different. |
Hi all, this is what I have so far on the preprocessing-within-cross-validation issue (right now we preprocess before cross-validation, which biases our performance estimates within cv).
The code runs, but I think something is wrong with it, as the score on the validation data is much higher than on the test data.
I would be grateful for any help.
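For reference, here is a minimal, standard-library-only sketch of what "preprocessing within cross-validation" means in general. It is not the code in this branch: the function names are hypothetical, and simple standardization stands in for whatever the real preprocessing does. The point is that the preprocessing parameters are learned from the training part of each fold only and then applied to the held-out part.

```python
import statistics


def fit_scaler(values):
    """Learn standardization parameters from the training fold only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # fall back to 1.0 for constant data
    return mean, std


def apply_scaler(values, params):
    """Standardize values with previously fitted parameters."""
    mean, std = params
    return [(v - mean) / std for v in values]


def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, valid_idx) pairs for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        valid_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, valid_idx
        start += size


def preprocess_within_cv(values, n_folds):
    """Return (scaled_train, scaled_valid) per fold, fitting on train only."""
    folds = []
    for train_idx, valid_idx in kfold_indices(len(values), n_folds):
        train = [values[i] for i in train_idx]
        valid = [values[i] for i in valid_idx]
        params = fit_scaler(train)  # the held-out fold is never seen here
        folds.append((apply_scaler(train, params), apply_scaler(valid, params)))
    return folds
```

If the validation score is much higher than the test score, one thing worth checking against this pattern is whether any fitting step still sees the held-out rows.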