Google WebSub is DoSing Me
Running a feed reader is an interesting business. One of the more interesting problems I have seen is this “DoS” from Google’s WebSub Hub.
This is a graph of WebSub pings for a single feed: The Chromium Blog. This blog appears to be powered by Blogger.
This isn’t a ton of traffic; the peaks hover around one request every 10 s. However, it adds up when the bursts run for days.
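To put "it adds up" into numbers, here is a back-of-the-envelope sketch. The 10 s interval comes from the peaks described above; the 7-day duration and the function name `pings_in_burst` are only illustrative assumptions, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pings_in_burst(interval_s: int, days: int) -> int:
    """Number of pings received if one arrives every `interval_s` seconds for `days` days."""
    return days * SECONDS_PER_DAY // interval_s


# One ping every 10 s, sustained for a single day.
print(pings_in_burst(10, 1))  # 8640

# The same rate sustained for an assumed 7-day burst.
print(pings_in_burst(10, 7))  # 60480
```

So a burst at peak rate works out to several thousand redundant deliveries per day for one feed.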
The number of entries in the ping varies and is bimodal. It is usually around 20 (Blogger appears to keep 25 items in the feed) but often 1 or 2. Occasionally it is somewhere in the middle. In general WebSub hubs will remove already-delivered entries from the feed. However, in this case all the items are previously seen. (Maybe they are modified? I don’t have metrics for this yet.)
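One way the missing "are they modified?" metric could be collected is to remember each entry's `<updated>` timestamp and compare it on the next ping. This is a minimal sketch, assuming the ping body is an Atom document; `classify_ping` and the `seen` dictionary are hypothetical names for illustration, not taken from any real implementation.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def classify_ping(body: str, seen: dict) -> dict:
    """Count new, modified, and unchanged entries in an Atom ping body.

    `seen` maps entry id -> last seen <updated> value and is updated in place.
    """
    counts = {"new": 0, "modified": 0, "unchanged": 0}
    root = ET.fromstring(body)
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        updated = entry.findtext(ATOM + "updated")
        if entry_id not in seen:
            counts["new"] += 1
        elif seen[entry_id] != updated:
            counts["modified"] += 1
        else:
            counts["unchanged"] += 1
        # Remember the latest timestamp so the next ping is compared against it.
        seen[entry_id] = updated
    return counts
```

Logging these three counts per ping would show whether the repeated deliveries carry real edits or byte-for-byte repeats. Note that it only detects changes the publisher reflects in `<updated>`; comparing a hash of the entry content would be a stricter check.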
Most of the bursts start because a new entry is posted. Some bursts start with no clear trigger. I have yet to see a post that doesn’t have a burst associated. These bursts then continue until the WebSub subscription expires.
FeedMail is subscribed to hundreds of WebSub feeds via pubsubhubbub.appspot.com and dozens of Blogger feeds. However, only this one has this problem.
Looking at the subscription debugging page, I can see that there is often a 504 error in the past day. Even if this did happen (which seems unlikely but is definitely possible), why would the retries continue constantly? Especially since the same page says that feedmail.org has 0% errors! According to our monitoring, the slowest response to these pings in the last 60 days was less than 4 s and the P99.9 is under 350 ms. There is a chance that our CDN, or a bug, prevents the request from completing. But this seems unlikely given that every other feed works correctly.
Is someone just updating posts every couple of seconds for a week after posting? I think this is unlikely. If this were the case, why would the pings stop when the subscription was refreshed? If they were actual updates, I would expect the new subscription to immediately start seeing the same traffic.
The strangest part is that it is just this one feed. It has been going on for a long time and happens on every new post (and sometimes between posts). I don’t know if this blog has some special handling in the hub (YouTube does, so it isn’t out of the question).
Anyway, I don’t really have a point here; it is really just a curiosity. It does waste a bit of resources, but it isn’t going to take us down. (At least not until every Blogger feed starts doing this 😅).