Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Template modification date determines value of Last-Modified header #471

Open
jwadolowski opened this issue Oct 30, 2018 · 3 comments
Open

Comments

@jwadolowski
Copy link

jwadolowski commented Oct 30, 2018

Bug description

I recently discovered that knot.x copies Last-Modified header it receives from template repository and sends it back to the HTTP client. Such behavior can cause caching issues, especially when rather static templates are combined with dynamic data and the response goes through shared cache (CDN, forward proxy server, etc).

knot.x version: 1.4.0

Steps to reproduce

I captured 2 scenarios that outline the problem.

1st example

knotx-304

  • the flow looks as on the diagram I attached above
  • Handlebars template changes few times a year, so template repository caches HTML markup on disk
  • data stored in the API (the one knot.x fetches dynamic data from) is super dynamic and produces personalized results
  • upon first request knot.x fetches the template and receives the following response
HTTP/1.1 200 OK
Date: Tue, 30 Oct 2018 18:52:41 GMT
Server: Apache
X-Content-Type-Options: nosniff
Last-Modified: Tue, 30 Oct 2018 12:59:26 GMT
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
X-Request-ID: W9ioeeae7@ELj-qnzR2fcAAAAkA
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Referrer-Policy: no-referrer-when-downgrade
Cache-Control: private
Content-Length: 2772
Connection: close
Content-Type: text/html; charset=UTF-8
  • knot.x sends a call to the API to obtain data required to populate template placeholders
  • as soon as the response is ready knot.x returns:
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Date: Tue, 30 Oct 2018 18:55:39 GMT
Last-Modified: Tue, 30 Oct 2018 12:59:26 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Request-ID: W9ipK@-2Z@DZlr@S@yRslAAAAQs
X-XSS-Protection: 1; mode=block
Connection: close
Content-Length: 7080
  • please note that response is not cacheable (because of Cache-Control: private) and contains the same Last-Modified header knot.x got from template repository
  • response goes to the CDN, but it won't cache it as the origin said it's not cacheable
  • 2nd request comes in, this time this is a conditional GET with If-Modified-Since header (to be honest I have no idea why Chrome decided to send such request, perhaps Last-Modified presence was enough and Cache-Control: private doesn't have any meaning in this context)
  • CDN passes it through to the origin, the same steps are executed as in the first case (template fetch, API call, page assembly by knot.x)
  • the response reaches CDN which sees that it contains Last-Modified header. At the same time it remembered that the request had If-Modified-Since header. Simple date comparison takes place and it turns out that nothing has changed, so CDN thinks it's ok to reply with 304 to save some bandwidth (304 doesn't have the body)

2nd example

  • as in the 1st example, static template gets combined with dynamic data, but this time responses produced by knot.x are cacheable
  • as soon as the response reaches CDN it'll be cached according the the value of Cache-Control, Surrogate-Control, Edge-Control or any other header that tells CDN how long it can keep it in the cache
  • at some point in time TTL expires and given resource has to be fetched from the origin one more time
  • to optimize round trips CDNs very often send conditional GET whenever someone asked for an object which just expired in the cache. In other words CDN knows original Last-Modified header value and sends GET with If-Modified-Since: <date_from_last_modified_header>. knot.x does its magic and returns Last-Modified it got from the template call.
  • if the template hasn't changed, but the dynamic dataset has, we will end up with 304 from the origin, so updates won't be cached in CDN, hence end users won't see any changes

Expected behavior

My first thought was that knot.x should do one of the following things:

  • stop relying on Last-Modified it gets from template repository
  • re-set Last-Modified if Handlebars markup got evaluated and placeholders were filled in with dynamic data
  • stop sending Last-Modified header completely

After a bit of thinking I realized that in some cases Last-Modified header presence could be needed (i.e. template that renders single object off of some identifier that doesn't change often). At the same time re-using whatever came from template repository is definitely not expected.

As a rule of thumb I'd say that whenever knot.x injects data into Handlebars template it should remove Last-Modified header and possibly re-set its value. Ideally the 2nd part should be somehow configurable.

Screenshots

N/A

Additional context

N/A

@malaskowski
Copy link
Member

Hi @jwadolowski , you may configure what headers are returned by Knot.x Server instance and what are not by overriding allowedResponseHeaders in the conf/includes/server.conf. With that you can achieve the expected behavior you mentioned:

stop sending Last-Modified header completely
(just remove Last-Modified entry in your Knot.x instance configuration).

The default behavior you suggested:

re-set Last-Modified if Handlebars markup got evaluated and placeholders were filled in with dynamic data

looks good to me. It could quite nice improvement.

Let us know it that helped.

@jwadolowski
Copy link
Author

jwadolowski commented Nov 7, 2018

Thanks @Skejven. allowedResponseHeaders will do the trick, however there are 2 downsides of this approach:

  • there are situations where traffic goes via knot.x, but nothing is being done Handlebars-wise, so Last-Modified is actually expected. This is a global change, so we may break other things that rely on such header
  • allowedResponseHeaders is a whitelist, which on one hand is good, but on the other hand makes "allow all the headers except X and Y" policy quite hard to implement. This is not a big deal for most users, but I can imagine some could complain about that.

Workaround I used was implemented directly at CDN level.

Regarding re-setting Last-Modified - I started thinking about that and it seems to be more complex than I originally assumed. Here's a few use cases:

  • templates w/o Handlebars snippets - it's ok to reuse Last-Modified from the template itself
  • template w/ Handlebars snippet(s), but the dynamic part doesn't change often and is always the same for everyone (no personalization or anything like that). If both template call and API call(s) end with Last-Modified header knot.x could pick the newest date and send it back to the client. If the header doesn't appear in a consistent manner then knot.x should probably calculate its own value
  • dynamic dataset which changes frequently and/or returns personalized results (user location/cookie/auth-driven API response). If that's the case ETag is probably more appropriate.

In general there's probably no universal solution.

@malaskowski
Copy link
Member

@jwadolowski , there might be a solution that Knot.x can technically support. It is a bit simplified scenario in comparison to your proposition:

  • Knot.x requests repository with Last-Modified header.
  • If there is 304, additional call is made without the header for the template.
  • Then, Knot.x can check if this template contains dynamic snippet.
  • If there is no dynamic snippet, then 304 is returned by Knot.x, otherwise standard template processing is done.

As you wrote, this is not universal solution. It is custom behaviour, not everyone might want to use it. In my opinion this is good candidate for Knot.x 2.0 when it will be easy to insert some custom handler with that logic.

@jwadolowski jwadolowski changed the title Template modification date determines value of Last-Modification header Template modification date determines value of Last-Modified header Apr 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants