-
-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tiny bug : oml_id must be an int in load_openml_task #686
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #686 +/- ##
=======================================
Coverage 68.15% 68.16%
=======================================
Files 54 54
Lines 6730 6741 +11
=======================================
+ Hits 4587 4595 +8
- Misses 2143 2146 +3 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointer. However, I wanted to understand why sometimes there was an integer and sometimes a string. Modified/moved the logic a little bit to catch these mistakes earlier in the process to give a faster and more informative error message.
btw. you are using a "too new" version of OpenML. The one defined in requirements.txt
still supports identifiers passed as a str, which is why this wasn't raised before :) but it's good future proofing anyway, I want to upgrade someday.
try: | ||
oml_id = int(oml_id_str) | ||
except ValueError: | ||
raise ValueError( | ||
f"Could not convert OpenML id {oml_id_str!r} in {benchmark!r} to integer." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason it was sometimes str
and sometimes not was because when specified as an openml benchmark (openml/{s,t}/{id} it was passed as str. But when loaded from a file the ids were conventionally defined as integers.
for task in tasks: | ||
if task["openml_task_id"] is not None and not isinstance( | ||
task["openml_task_id"], int | ||
): | ||
raise TypeError("OpenML task id for task {task.name!r} must be integer.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While tasks defined in a yaml file should be integer. Nothing stops a user from defining a str instead. So here we can give an early warning.
Check if oml_id is a valid integer before trying to load the task. This sometimes does not happen by default. (Only get_suite supports string ids.)
I am not sure if this behavior is different in the new OpenML API.