posse-post-discovery


POSSE-Post-Discovery is a way to find the canonical version of a post when a syndicated copy does not contain a link to the original.

Motivation

Some silos make it difficult or unpleasant to include a permalink or citation in POSSE'd copies.

  • Instagram lacks a posting API, so content must be copied via PESOS (e.g., with OwnYourGram). Users can manually go back and add a caption containing the permalink, but this cannot be automated. [1]
  • Twitter:
    • Links/citations eat into the length limit.
    • Difficult to indicate that some links are for shared content and some are original-post links. Note: Permashortcitations mitigate this.
    • Some users prefer to leave them off for aesthetic reasons (e.g., User:werd.io seems to have discontinued them?)

Primary use case: allow Bridgy to backfeed comments from POSSE and PESOS copies. Bridgy will have the syndicated URL and the author's root domain (from a silo profile page).

This would also support original-post-discovery on older silo content that was backfilled to a personal domain.

Algorithm

Discovery algorithm:

  • Fetch the author's url (e.g., from their silo profile)
  • Look for <link rel="feed" type="text/html"> (h-feed#rel_feed). If found, consider this their unfiltered h-feed. If not, use the current page.
  • For each h-entry on the current page... (Ideally this page would be an h-feed, but if not just use top-level h-entries).
    • Follow its u-url permalink to find its post permalink page.
    • Look for rel-syndication and u-syndication links on that permalink page. If the syndicated post is among them, then this is the original.
    • Note: this is HTTP-request-intensive. To save time on the next lookup, store other discovered relationships for future lookup.
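
The steps above can be sketched roughly as follows. This is a minimal illustration, not Bridgy's implementation: the names discover_original, fetch, and cache are hypothetical, and the parser is a simplification that treats every u-url on the feed page as an h-entry permalink instead of running a full microformats2 parser. The fetch argument is any callable that returns the HTML for a URL, which keeps the sketch testable without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collects rel=feed, u-url, and syndication links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.feeds = []         # <link rel="feed" href=...>
        self.permalinks = []    # hrefs of elements with class u-url
        self.syndications = []  # rel=syndication or class u-syndication hrefs

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        href = a.get("href")
        if not href:
            return
        rels = (a.get("rel") or "").split()
        classes = (a.get("class") or "").split()
        # Assumption: a missing type attribute is treated as text/html.
        if tag == "link" and "feed" in rels and a.get("type", "text/html") == "text/html":
            self.feeds.append(href)
        if "u-url" in classes:
            self.permalinks.append(href)
        if "syndication" in rels or "u-syndication" in classes:
            self.syndications.append(href)


def _parse(html, base_url):
    """Return (feeds, permalinks, syndications) as absolute URLs."""
    collector = _LinkCollector()
    collector.feed(html)
    absolute = lambda hrefs: [urljoin(base_url, h) for h in hrefs]
    return (absolute(collector.feeds),
            absolute(collector.permalinks),
            absolute(collector.syndications))


def discover_original(author_url, syndicated_url, fetch, cache=None):
    """Return the permalink whose page lists syndicated_url, or None."""
    cache = {} if cache is None else cache
    if syndicated_url in cache:
        return cache[syndicated_url]

    # Steps 1-2: fetch the author's URL and prefer a rel=feed page if present.
    feeds, permalinks, _ = _parse(fetch(author_url), author_url)
    if feeds:
        _, permalinks, _ = _parse(fetch(feeds[0]), feeds[0])

    # Step 3: visit each permalink page and check its syndication links.
    for permalink in permalinks:
        _, _, syndications = _parse(fetch(permalink), permalink)
        for link in syndications:
            cache.setdefault(link, permalink)  # remember for later lookups
        if syndicated_url in syndications:
            return permalink
    return None
```

In practice fetch might wrap an HTTP client, and cache would be persisted between runs so that relationships found while searching for one post do not need to be fetched again.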

Bridgy

As of 2014-04-26, Bridgy uses posse-post-discovery to backfeed replies/reposts/likes of link-less syndicated content.

In the example below, @kyle_wm shared an article from @dangillmor.

kwm-linkless-tweet.png

Bridgy would typically consider the linked article at wp.me to be the original, but after searching kylewm.com, it also found the original repost on kylewm.com, and sent back likes, replies, and retweets.

kwm-original-post-with-ppd-comments.png

Set up your site

If you publish rel-syndication links to syndicated content, your site may already support posse-post-discovery (Yay, standards!).

Publishers need to do the following things:

  • Make sure your silo profile URL points to your indieweb site.
  • Make sure your indieweb site has h-entry markup on your posts
  • Optionally wrap an h-feed around the entries you want posse-post-discovery to see
    • If this is not the same page as your profile URL, include a <link rel="feed"> that points to the page with your primary h-feed.
  • Include rel-syndication and/or u-syndication links on post permalink pages.

Tradeoffs

The community is currently split on whether/how often to include permashortlinks/permashortcitations in POSSE posts. We haven't counted explicitly, but anecdotally it seems like maybe half of us do regularly, and half of us don't.

Pro:

  • Backlinks can improve readers' experience
    • Author's site will usually provide richer content than the silo copy (better formatting, indieweb replies, replies from other silos)
    • Particularly on social media sites like Facebook that allow backlinks to be included unobtrusively.
  • Backlinks/citations make original post discovery much easier. Indieweb replies to the syndicated content will have a better chance of finding the original and replying directly.

Con:

  • Often confusing or irritating to silo users who see the POSSE post.
  • Distracts from actual content.
  • Individual personal preference and aesthetic choices.

People who include backlinks:

  • Tantek Çelik
  • Kyle Mahan
    • FB: always
    • Twitter: if the content is truncated
  • Aaron Parecki
    • POSSE to FB: always include backlinks as a "See Original" link that appears next to "Like", not in the content of the post
    • POSSE to Twitter: includes permashortlink if content is truncated, but often crafts shorter versions of POSSE tweets to avoid abrupt truncating

People who do not include backlinks:

Open Questions

  • What's the best way to discover unfiltered h-feed from an author's homepage? Current solution of following <link rel="feed"> is pretty ad hoc. Need more examples in the wild.
    • Tantek suggested rel=alternate as an ... alternate.

Historical Alternatives

These were other proposals for supporting original-post-discovery. For the most part they were simpler from the discoverer's point of view, but required new protocols or hacked existing protocols.

Query endpoint

Create a new link-rel type and a new endpoint, perhaps <link rel="original-post-discovery" href="..."> where domains could provide a query endpoint. The query endpoint would take one GET parameter ?syndication=[URL] and redirect to the original post (if found). See Proof of concept implementation.

  • Pros
    • Simple implementation from Bridgy's point-of-view.
    • Endpoint can be cached per domain, and from then on, Bridgy can construct the query URL as a webmention target without additional HTTP requests.
    • Allows sites to "claim" their POSSEd content even when it doesn't have a permalink
  • Cons
    • Requires an addition to existing standards
    • Requires every user to provide this new endpoint individually
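
As a rough sketch of how such an endpoint might behave, the following shows both sides of the exchange. Everything here is an assumption for illustration: the function names, the lookup table, the /opd path used in the test, and the choice of 302, 404, and 400 status codes (the proposal only says the endpoint redirects to the original post if found).

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical table a site might keep: syndicated URL -> original permalink.
SYNDICATED_TO_ORIGINAL = {
    "https://twitter.com/example/status/1": "https://example.com/2014/a",
}


def handle_query(request_target, table=SYNDICATED_TO_ORIGINAL):
    """Return (status, headers) for a GET request to the query endpoint."""
    params = parse_qs(urlsplit(request_target).query)
    values = params.get("syndication", [])
    if len(values) != 1:
        return 400, {}  # assumed: exactly one syndication parameter is required
    original = table.get(values[0])
    if original is None:
        return 404, {}  # assumed: unknown syndicated URLs are "not found"
    return 302, {"Location": original}


def build_query_url(endpoint, syndicated_url):
    """Build the URL a consumer could request once it knows the endpoint."""
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"syndication": syndicated_url})
```

Because the query URL is built purely from the cached endpoint and the syndicated URL, a consumer can construct it without any further HTTP requests, which is the caching benefit listed above.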

Send webmentions with the silo url as target

Bridgy could deliver webmentions to the root domain if no permalink is found, filling in target= with the syndicated, rather than original, url.

Sites could reject these mentions, or choose to look up the post permalink URL based on the syndicated URL. This is not entirely unlike receiving a mention to a shortened link.

  • Pros
    • Similarly simple implementation in Bridgy
    • Abuses existing standard instead of creating a new one (or is this a con?)
  • Cons
    • Requires support on the receiving site
    • Less explicit than the query endpoint discovery mechanism
    • Bridgy would be sending a lot of doomed webmentions to domains that don't support receiving target=silo-url
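
On the receiving side, the lookup described above might look something like this. It is only a sketch under assumptions: resolve_webmention_target and its arguments are hypothetical names, and the site is assumed to keep its own mapping from syndicated URLs to permalinks.

```python
from urllib.parse import urlsplit


def resolve_webmention_target(target, own_host, syndicated_to_original):
    """Map a webmention target to a local permalink, or None to reject it."""
    if urlsplit(target).hostname == own_host:
        return target  # ordinary webmention aimed at one of our own pages
    # Otherwise treat the target as a silo URL and look up the original post.
    return syndicated_to_original.get(target)
```

A site that does not recognize the silo URL would get None back and could reject the mention, which is the "doomed webmention" case listed in the cons.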

Mechanism for sites to inform Bridgy

When creating new entries, sites would POST to a well-known Bridgy endpoint with parameters ?original=...&syndication=... Bridgy would store these relationships so it could query them later.

  • Like "query endpoint", this is also a new protocol.
  • Unlike "query endpoint", would require Bridgy to store more data
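
A minimal sketch of the storage this would imply, assuming the two parameters arrive as a form-encoded POST body. The class and function names are hypothetical, the in-memory dict stands in for whatever datastore would actually be used, and the status codes are assumptions.

```python
from urllib.parse import parse_qs, urlencode


class RelationshipStore:
    """In-memory stand-in for the storage such an endpoint would need."""

    def __init__(self):
        self._original_by_syndication = {}

    def handle_post(self, form_body):
        """Record one original/syndication pair from a form-encoded body."""
        params = parse_qs(form_body)
        originals = params.get("original", [])
        syndications = params.get("syndication", [])
        if len(originals) != 1 or len(syndications) != 1:
            return 400  # assumed: reject requests missing either parameter
        self._original_by_syndication[syndications[0]] = originals[0]
        return 200

    def lookup(self, syndicated_url):
        """Return the stored original for a syndicated URL, or None."""
        return self._original_by_syndication.get(syndicated_url)


def build_notification(original, syndication):
    """Form-encoded body a site might send after publishing and syndicating."""
    return urlencode({"original": original, "syndication": syndication})
```

The store grows by one entry per syndicated copy, which illustrates the extra data that the second point above refers to.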

"Hidden" permalinks

Facebook provides for a custom action that can be included with each post. This has been used to discreetly include a POSSE_to_Facebook#See_Original link at the bottom of POSSEd posts.

Is it possible to include this metadata in other silos? Some suggestions:

  • Encode the permalink in the Twitter "location" (lat/long).
  • Embed the permalink in Instagram photos as a watermark, barcode, or steganographically encoded image.
    • Bridgy is unlikely to OCR images.
    • Would require a custom Instagram wrapper application to add the watermark (although at that point, the wrapper could just include the post permalink in the caption).
