Error page optimisations #299
I will have a go at reviewing this, but it's probably worth a second pair of eyes from a 'proper' back-end dev - @zerolab maybe you if you have time? Re caching the 404 page: could you check with Liv if there are any SEO implications?
This is still very much a draft (hence the status). It's not really reviewable or shippable yet. I need to do some more thinking and testing first. SEO-wise, I highly doubt bots pay much attention to the content. Sending the correct MIME type ought to handle most things. But I'll check.
This saves queries and processing time, if the HTML isn't ever going to be rendered, and simple text would be enough.
This should reduce the impact on missing pages being crawled.
949a272 to 489ae30
@requires_csrf_token
@vary_on_headers("Accept")
@cache_control(max_age=900)  # 15 minutes
Should this maybe set s_maxage too?
torchbox.com/tbx/core/utils/cache.py, line 23 in 03bc0e5:
"s_maxage": s_maxage,
return True


@requires_csrf_token
Why do these need csrf? Doesn't that skip the cache completely?
Description of Changes Made
This PR makes 2 notable changes:
Serve simpler 404 pages when possible
Our 404 page is a fancy HTML page, composed of multiple templates, and requiring a number of DB queries to create (not many queries, granted). If a person in a browser loads a page, we want to show them this "fancy" 404 page for a better user experience. However, if the request shouldn't return HTML (e.g. it's a missing static file) or the user never asked for HTML, we shouldn't spend the time creating a fancy 404 page if it's never going to be viewed.
Instead, when possible, we show a simplified HTML page that contains only text. This requires far fewer resources to generate, and is quicker to serve.
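As a rough illustration of the decision described above, here is a minimal, framework-free sketch. It is not the code in this PR: the function names (`parse_accept`, `wants_html`, `choose_404_body`) are hypothetical, and it assumes that a wildcard such as `*/*` does not count as explicitly asking for HTML, which is a policy choice rather than a rule.

```python
def parse_accept(header):
    """Return (media_type, quality) pairs in the order the client sent them.

    A missing or empty header is treated as "*/*".
    """
    types = []
    for part in (header or "*/*").split(","):
        part = part.strip()
        if not part:
            continue
        media_type, _, params = part.partition(";")
        quality = 1.0  # default quality when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unparseable q value: treat as not acceptable
        types.append((media_type.strip().lower(), quality))
    return types


def wants_html(accept_header):
    """True only when an HTML type is listed explicitly with q > 0.

    Assumption: wildcards ("*/*", "text/*") are not treated as a request for HTML.
    """
    html_types = {"text/html", "application/xhtml+xml"}
    return any(
        media_type in html_types and quality > 0
        for media_type, quality in parse_accept(accept_header)
    )


def choose_404_body(accept_header):
    """Pick which 404 to build: the template-heavy page or the text-only one."""
    return "fancy" if wants_html(accept_header) else "simple"
```

With this sketch, a typical browser header containing `text/html` selects the fancy page, while `image/png`, `*/*`, or no header at all selects the simple one.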
Cache 404 pages
This one might be controversial. 😬
If a page returns a 404, chances are it'll still be a 404 in 10 minutes' time, or even longer. Therefore, it's probably something which can be cached to reduce system load.
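According to RFC 2616, 404s should not be cached unless the response explicitly allows it with cache headers (the later RFC 7231 lists 404 as cacheable by default). For our use case, I think explicitly caching them is worth it. The TTL is intentionally shorter than it probably could be, but this could be increased in future.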
In Wagtail, a request will always do a database query, and potentially several, depending on how much of the path does exist. Therefore, missing pages can result in higher than expected usage, and won't be cached by an edge cache. Worse still, because the 404 pages usually shown are fancy HTML versions, they may run queries themselves (e.g. for navigation), making 404s more expensive still.
By caching the 404, we reduce the impact of users viewing it in future. This is especially useful if a site is being crawled, as many frontend caches will normalise URLs before caching (ours sure does).
If a 404 has been cached, and a page is created in its place, Wagtail's existing frontend caching will purge the cached 404 during publishing.
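For reference, this is a minimal sketch of the response headers the caching described here is expected to produce. The helper name `not_found_cache_headers` is hypothetical and the code is not taken from this PR; the 900 second default mirrors the `max_age=900` in the diff, and the optional `s_maxage` is only an illustration of the shared-cache TTL raised in review.

```python
def not_found_cache_headers(max_age=900, s_maxage=None):
    """Build caching headers for a 404 response.

    max_age:  seconds any cache (browser or shared) may reuse the response.
    s_maxage: optional override for shared caches only, such as a CDN.
    """
    directives = [f"max-age={max_age}"]
    if s_maxage is not None:
        directives.append(f"s-maxage={s_maxage}")
    return {
        "Cache-Control": ", ".join(directives),
        # The body depends on the Accept header, so caches must key on it,
        # otherwise a text-only 404 could be served to a browser (or vice versa).
        "Vary": "Accept",
    }
```

Calling `not_found_cache_headers()` yields `Cache-Control: max-age=900` and `Vary: Accept`; passing `s_maxage` adds an `s-maxage` directive so the edge cache can hold the 404 for a different period than browsers do.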
How to Test
This can be tested in the browser, by confirming the correct 404 is shown. The unit tests give a few useful examples. Similarly, curl can be used to manually exercise the header.
Note: If no Accept header is passed, Django assumes */*.
MR Checklist
Unit tests
Documentation
Browser testing
Data protection
Accessibility
Sustainability
Pattern library
I've upstreamed some helper methods which would make this kind of content negotiation much simpler in future: django/django#18415