-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Headerset #563
base: develop
Are you sure you want to change the base?
Headerset #563
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, this is a good start!
This can be done more simply, and more globally by leveraging the information we get from authenticating the user against the Studio server. Please ask any follow up questions about the directions I have suggested.
Also, please make these commits independently of the commits from your other PR.
ricecooker/managers/tree.py
Outdated
) | ||
if response.status_code == 200: | ||
return | ||
url_response = config.SESSION.post(config.get_upload_url(), json=data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this code block has been dedented? It wasn't in your other PR.
@@ -207,6 +223,9 @@ def make_request( | |||
retry_count = 0 | |||
max_retries = 5 | |||
request_headers = DEFAULT_HEADERS | |||
|
|||
configure_download_session(sess, user_email) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, I think we should be setting this more globally on config.DOWNLOAD_SESSION
, not just within the the downloader utility. While it is most acute when we are downloading HTML pages, it is better we do this more broadly.
Once we have authenticated against Studio with the token, we should be able to use the user email from there, rather than requiring it be explicitly passed in.
See here where we receive the username (which is the email address) once we have authenticated with Studio: https://github.com/learningequality/ricecooker/blob/develop/ricecooker/commands.py#L90
Once we have that information in hand, we can do the required DOWNLOAD_SESSION update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, passing an email manually every time can be tedious. I will work on an approach and make sure that the code is more of Don't Repeat Yourself and improves overall authentication and session management. But, I would like to ask if there any specific requirements or constraints for setting the download session that I need to take care of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Divyanshi750 - I'm sorry, I don't understand what isn't clear here - I have outlined which session and needs to be updated, and also shown where in the code this should happen.
Improved session configuration by adding global email setup after Studio authentication. Changes made:
This makes the download session configuration more consistent and eliminates the need |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In spite of tests being present, the code is full of issues that are evident from a code read through.
This has the appearance of being generated with a lot of LLM assistance, and not sufficient critical reflection of the generated code.
It is also not fixing the issue in the life cycle of a SushiChef class being executed. The header needs to be set on the central DOWNLOAD_SESSION in the config
module - and should be done in the place where the user has already been authenticated with a token with the Studio server.
@@ -136,3 +136,4 @@ video_cache_py3.sqlite | |||
cache.sqlite | |||
|
|||
chefdata/ | |||
audio_cache.sqlite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been added in develop from your other PR - you are still committing the file itself here as well too, so that needs to be removed from the commit history.
@@ -0,0 +1,11 @@ | |||
[[source]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to be removed, as does the file below.
@@ -26,6 +26,7 @@ | |||
"console_scripts": [ | |||
"corrections = ricecooker.utils.corrections:correctionsmain", | |||
"jiro = ricecooker.cli:main", | |||
"ricecooker = ricecooker.cli:main", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't need to be added.
) | ||
|
||
|
||
def test_performance_overhead(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this test doing?
baseline_time = timeit.timeit(baseline_request, number=100) | ||
custom_email_time = timeit.timeit(custom_email_request, number=100) | ||
|
||
assert custom_email_time - baseline_time < 0.01 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we making assertions in the module level code? This looks like it was generated by an LLM and unthinkingly pasted in here. Please focus only on testing the code you have added around updating the user agent.
ricecooker/utils/__init__.py
Outdated
|
||
|
||
def configure_download_session(session, user_email=None): | ||
import ricecooker |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this being imported twice?
ricecooker/utils/downloader.py
Outdated
@@ -28,6 +28,8 @@ | |||
from ricecooker.utils.html import download_file | |||
from ricecooker.utils.html import replace_links | |||
from ricecooker.utils.zip import create_predictable_zip | |||
import logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why has logging been added here?
ricecooker/commands.py
Outdated
Studio authentication method that also configures the download session. | ||
""" | ||
try: | ||
auth_response = authenticate_studio(username, password) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this function making a recursive call here?
ricecooker/utils/downloader.py
Outdated
@@ -48,6 +50,14 @@ | |||
} | |||
|
|||
|
|||
|
|||
from . import configure_download_session | |||
def download_file(url, clear_cookies=False, headers=None, timeout=60, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this new method for?
@@ -197,7 +207,7 @@ def read( | |||
|
|||
|
|||
def make_request( | |||
url, clear_cookies=False, headers=None, timeout=60, session=None, *args, **kwargs | |||
url, clear_cookies=False, headers=None, timeout=60, session=None, user_email=None, *args, **kwargs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to pass this in, we instead copy the user agent from the globally configured DOWNLOAD_SESSION that should have had its user agent headers set properly. So far this PR is not doing this though, so is missing the main point of this issue.
6a90db4
to
ffec3c8
Compare
The majority of comments here are still unaddressed - the main approach needs to be tweaked to focus on the config.DOWNLOAD_SESSION and to be triggered when the tree manager authenticates to Studio. |
This has been force pushed - but there are still conflicts to be resolved, and the principle feedback given is still unaddressed. |
Hi @rtibbles , I was thinking to approach the problem in this way: |
We just need to update the session on the core config object, and have that update when authentication has happened with Studio, so we have the registered email address. That's all that is needed here. |
Summary
This pull request enhances the User-Agent header generation mechanism in the Ricecooker download utility. The main changes made are:
configure_download_session()
function to generate User-Agent headersmake_request()
function to support optional user email identificationReferences
Fixes: #483
Reviewer guidance
Added test cases to: