PubSubHubbub, Feeds, and the Feed API
By Peter Dickman, Engineering Manager
Google has supported the PubSubHubbub (PuSH) protocol since its introduction in 2009.
Earlier this year we completely rewrote our PuSH hub
implementation, both to make it more resilient and to considerably enhance its capacity and
throughput. Our improved PuSH hub means we can expose feeds more efficiently, coherently and
consistently, from a robust, secure access point. Using the PuSH protocol, servers can
subscribe to an almost arbitrarily large
number of feeds and receive updates as they occur.
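
For readers who haven't wired up a subscriber before, here is a rough sketch of the two steps on the subscriber side as described in the PuSH 0.4 specification: asking a hub to subscribe to a topic, and answering the hub's verification request by echoing its challenge. The hub, feed and callback URLs and both function names are hypothetical placeholders chosen for illustration, not Google's endpoints or code. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# All three URLs are hypothetical placeholders.
HUB_URL = "https://hub.example.com/"
TOPIC_URL = "https://blog.example.com/feeds/posts/default"
CALLBACK_URL = "https://subscriber.example.com/push/callback"


def build_subscribe_request(hub_url, topic_url, callback_url, lease_seconds=None):
    """Build (but do not send) a form-encoded PuSH subscription request."""
    params = {
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed we want updates for
        "hub.callback": callback_url,  # where the hub should deliver them
    }
    if lease_seconds is not None:
        # Optional: how long we would like the subscription to last.
        params["hub.lease_seconds"] = str(lease_seconds)
    body = urlencode(params).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def answer_verification(query_string, expected_topic):
    """Decide how the callback should answer the hub's verification GET.

    Returns (status_code, body). Echoing hub.challenge confirms the
    subscription; a 404 tells the hub we did not ask for it.
    """
    query = parse_qs(query_string)
    mode = query.get("hub.mode", [""])[0]
    topic = query.get("hub.topic", [""])[0]
    challenge = query.get("hub.challenge", [""])[0]
    if mode == "subscribe" and topic == expected_topic and challenge:
        return 200, challenge
    return 404, ""


request = build_subscribe_request(HUB_URL, TOPIC_URL, CALLBACK_URL, lease_seconds=86400)
# Passing `request` to urllib.request.urlopen would send it to the hub.
```

A real subscriber would run `answer_verification` inside whatever web framework serves the callback URL, and would then receive new feed content as POST requests to that same URL.
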
In contrast, the Feed API allows you to download any specific public Atom or RSS feed using
only JavaScript, enabling easy mashups of feeds with your own content and other APIs. We are
planning some improvements to the Feed API, as part of our ongoing infrastructure work.
We encourage you to consider PuSH as a means of accessing feeds in bulk. To support that,
we’re clarifying our practices around bots interacting with Google’s PuSH system: we encourage
providers of feed systems and related tools to connect their automated systems for feed
acquisition to our PuSH hub (or other hubs in the PuSH ecosystem). The PuSH hub is designed to
be accessed by bots and it’s tuned for large-scale reading from the PuSH endpoints. We have
safeguards against abuse, but legitimate users of the access points should see generous
limits, with few restrictions, speed bumps or barriers. Similarly, we encourage publishers to
submit their feeds to a public PuSH hub, if they don’t want to implement their own.
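
For publishers, the usual pattern has two parts: advertise the hub in the feed itself (in Atom, a `<link rel="hub" href="..."/>` element), then tell the hub whenever the feed changes. Version 0.4 of the specification leaves the exact publisher-to-hub notification up to each hub, so the sketch below assumes the form-encoded "publish" ping defined in earlier versions of the spec. Check your chosen hub's documentation before relying on it. The URLs and the function name are hypothetical, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholders: a public hub and the publisher's own feed.
PUBLISH_HUB_URL = "https://hub.example.com/"
FEED_URL = "https://blog.example.com/feeds/posts/default"


def build_publish_ping(hub_url, feed_url):
    """Build (but do not send) a 'this feed has new content' ping.

    Assumes the hub accepts hub.mode=publish with hub.url, the form
    used by earlier versions of the PuSH specification.
    """
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


ping = build_publish_ping(PUBLISH_HUB_URL, FEED_URL)
# Passing `ping` to urllib.request.urlopen would send it to the hub,
# which would then fetch the feed and push the changes to subscribers.
```
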
Google directly hosts many feed producers (e.g. Blogger is one of the largest feed sources on
the web) and is a feed consumer too (e.g. many webmasters use feeds to tell our Search system
about changes on their sites). Our PuSH hub offers easy access to hundreds of millions of
Google-hosted feeds, as well as hundreds of millions of other feeds available via the PuSH
ecosystem and through active polling.
The announcement of v0.4 of the PuSH specification advances our goal of strengthening
infrastructure support for feed handling. We’ve worked with Superfeedr and others on the
new specification and look forward to it being widely adopted.
Peter Dickman spends his days herding cats for the Search Infrastructure group in
Zurich. He divides his spare time between helping government bodies understand cloud computing
and systematically evaluating the products of Switzerland’s chocolatiers.
Posted by Scott Knaster, Editor