Cloudflare's Weird Age Header.

I noticed something weird today:

% curl -I https://blog.benjojo.co.uk/rss.xml
HTTP/2 200
age: 99788
cache-control: public, max-age=1800
cf-cache-status: HIT
date: Fri, 18 Feb 2022 17:41:40 GMT
server: cloudflare
...

Do you see the oddity? The Age is greater than Cache-Control’s max-age. According to RFC 2616, that means this resource has already expired. Let’s look at the “Summary of age calculation algorithm” from section 13.2.3.

apparent_age = max(0, response_time - date_value)
corrected_received_age = max(apparent_age, age_value)
response_delay = response_time - request_time
corrected_initial_age = corrected_received_age + response_delay
resident_time = now - response_time
current_age = corrected_initial_age + resident_time

Let’s fill in some variables and evaluate…

age_value = 101626
date_value = "2022-02-18 17:41:40"
now = "2022-02-18 17:41:41"
request_time = "2022-02-18 17:41:39"
response_time = "2022-02-18 17:41:41"

apparent_age = max(0, "2022-02-18 17:41:41" - "2022-02-18 17:41:40") # 1
corrected_received_age = max(1, 101626) # 101626
response_delay = "2022-02-18 17:41:41" - "2022-02-18 17:41:39" # 2
corrected_initial_age = 101626 + 2 # 101628
resident_time = "2022-02-18 17:41:41" - "2022-02-18 17:41:41" # 0
current_age = 101628 + 0 # 101628
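
The same evaluation can be written as runnable Python. This is only a minimal sketch of the RFC 2616 summary; the function name `current_age` and the choice of `datetime` objects for the timestamps are my own assumptions for illustration, not something the RFC or Cloudflare prescribes.

```python
from datetime import datetime


def current_age(age_value, date_value, request_time, response_time, now):
    """Return the current age in seconds, following the RFC 2616 summary."""
    apparent_age = max(0, int((response_time - date_value).total_seconds()))
    corrected_received_age = max(apparent_age, age_value)
    response_delay = int((response_time - request_time).total_seconds())
    corrected_initial_age = corrected_received_age + response_delay
    resident_time = int((now - response_time).total_seconds())
    return corrected_initial_age + resident_time


fmt = "%Y-%m-%d %H:%M:%S"
age = current_age(
    age_value=101626,
    date_value=datetime.strptime("2022-02-18 17:41:40", fmt),
    request_time=datetime.strptime("2022-02-18 17:41:39", fmt),
    response_time=datetime.strptime("2022-02-18 17:41:41", fmt),
    now=datetime.strptime("2022-02-18 17:41:41", fmt),
)
print(age)  # 101628
```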

I’ll spare you the math for calculating freshness, but it boils down to current_age < max_age, AKA 101628 < 1800 == false # STALE.
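
For the curious, the freshness check is short enough to sketch. RFC 2616 states it as `freshness_lifetime > current_age`. The version below assumes that only the `max-age` directive matters (it ignores `s-maxage`, `Expires`, and heuristic freshness), and `parse_max_age` and `is_fresh` are hypothetical helper names.

```python
import re


def parse_max_age(cache_control):
    """Extract max-age (in seconds) from a Cache-Control value, or None if absent."""
    match = re.search(r"\bmax-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(current_age, cache_control):
    """A response is fresh while its freshness lifetime exceeds its current age."""
    max_age = parse_max_age(cache_control)
    if max_age is None:
        # Assumption for this sketch: no explicit lifetime is treated as stale.
        return False
    return max_age > current_age


print(is_fresh(101628, "public, max-age=1800"))  # False
```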

This means that Cloudflare is serving resources that are already stale! Now, this isn’t much of a problem: the customer may have configured Cloudflare to cache for an extended period of time, ignoring the original headers. However, in that case Cloudflare should probably “spoof” the headers to reflect reality. The best option is likely updating the Cache-Control header to suggest that the response still has a little life left in it. Alternatively, it could just set max-age=0 to indicate that the response needs to be revalidated; I think this option would preserve the current behaviour while being less confusing.
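
To make those two options concrete, here is a rough sketch of how a cache could pick the Cache-Control value it sends downstream. Everything in it is hypothetical: `edge_ttl` stands in for whatever cache lifetime the customer configured, and `downstream_cache_control` is a made-up helper, not a Cloudflare API.

```python
def downstream_cache_control(age, edge_ttl=None):
    """Build a Cache-Control value that is consistent with the Age being sent.

    age: seconds the entry has spent in the cache (the Age header value).
    edge_ttl: hypothetical configured cache lifetime in seconds, or None if unknown.
    """
    if edge_ttl is not None and edge_ttl > age:
        # Option 1: advertise the lifetime the cache is really using, so that
        # max-age minus Age equals the time the entry has left.
        return f"public, max-age={edge_ttl}"
    # Option 2: admit the entry is stale and ask clients to revalidate.
    return "public, max-age=0"


print(downstream_cache_control(101626, edge_ttl=172800))  # public, max-age=172800
print(downstream_cache_control(101626))                   # public, max-age=0
```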

One explanation is that the origin is down. The Age value is over 28 hours, so it does seem unlikely that this was an intentional cache period. In that case they may be using something like the stale-if-error extension (RFC 5861), which indicates that a cache may serve expired entries if the origin is down.

Another possible explanation is that Cloudflare revalidated the cache entry. However, in that case the Age header should have been reset so that it counts from the time of validation.

I don’t think this is a major problem. The only downside I can think of is that it will lead to clients constantly contacting Cloudflare to revalidate, which may not have been the original intention. However, it certainly did pique my curiosity.