pubsub is a generic name for a protocol, system, or service that routes messages between producers and consumers. It usually refers to one of three things:
- a system that notifies readers that a publisher has published new content, e.g. PubSubHubbub.
- a system that sends push notifications to users, e.g. iOS's APN and Android's GCM.
- a category of plumbing, e.g. Google Cloud PubSub, Amazon SQS and other queue systems.
Subscribing to updates to a URL (ie the first definition above) is an extremely useful building-block for the indieweb. The rest of this page discusses that context.
- Subscriber -> Publisher: Please let me know when you update http://example.org/
- Publisher -> Subscriber: Okay, now I’ve verified you’re not spam I’ll let you know
Later, after updating http://example.org/
- Publisher -> Subscriber: I updated http://example.org, here’s its new content
Problems with this flow
- Lots of load on the publisher: Subscription management, sending updates. Too much complexity, too low a value/pain ratio for most creators to implement
- Solution: Hubs as the middle man between subscribers and publishers
- PubSubHubbub is a decentralised, hub-based approach
- Widely implemented and suitably easy for publishers to use, but difficult to test
- Lots of complexity for subscribers
- Reliance on ATOM format, semantics and DRY-violating duplication
- RSS Cloud
Ideal solution: a content-agnostic, pure-HTTP hub-based approach.
Publishers create content on the web, and let hub(s) know that it exists/when it’s updated via a POST request. Content is published with a Last-modified header.
Hubs periodically poll the content for changes (HEAD request and Last-modified header inspection) just in case they missed any updates/to enable pubsub of content published by people who for whatever reason can’t notify the hub.
Publishers publicly declare which hub manages notifications for their content via a Link header or HTML/XML element with rel=hub.
When a subscriber wants to subscribe to changes to a URL, they discover the hub as described above, and send a POST request to some endpoint (/subscribe?) with the URL they’re subscribing to and the URL they want notifications sent to.
- We need some sort of auth at this point to weed out spammers and enable the authenticity of future notifications.
- TODO: spec this stage out in more detail, take inspiration from current PuSH work
Then, when a publisher updates their content and either a) sends a notification to the hub or b) the hub polls the URL and sees the Last-modified header change, the hub sends POST requests out to all the subscribers for that URL.
- Possibly with the URL content as the POST body to prevent the Thundering Herd problem
Benefits over current solutions
- Eliminates DRY-violating duplication
- Content-agnostic, not tied to ATOM feeds but can be used with *any* content served over HTTP
- Extremely low barrier to entry for publishers — occasional hub polling of content eliminates the need to even send notifications to the hub to begin with