Discovery is a variety of methods for finding content, websites, communities, or people to follow on the web including search, directories, recommendation engines, tags, or other serendipitous methods.
or alternately a more developer-centric definition:
- 1 User-centric definition
- 2 Developer-centric definition
- 2.1 Profile Info
- 2.2 Separate Contact Page
- 2.3 Updates
- 2.4 Post Information
- 2.5 POSSE copies
- 2.6 Original post
- 2.7 Real Time Updates
- 2.8 Public Keys
- 2.9 Help or About
- 2.10 Legacy Discovery
- 2.11 See Also
- Share Your OPML - A service by Dave Winer for sharing your OPML file and aggregating against others to help uncover popular websites. See also: http://scripting.com/2016/10/12/areYouReadyToShareYourOpml.html The resultant combined data set can be found at http://feedbase.io/
- Micro.blog offers a few interesting (mostly hand-curated) methods of discovery:
- Discover timeline
- Emoji in Discover: using emoji (aka tagmoji) to collect related content for discovery purposes.
- Curating the Micro.blog Discover Timeline
- Where Discover Doesn't Help
- MicroMonday was launched on 2018-01-08 as a means of helping people on the service discover interesting people to follow.
- MicroMonday Podcast is a microcast geared toward short introductions of community members by way of interviews.
- Indieweb.xyz is a Webmention-based directory that provides the ability to aggregate content based on a variety of stubs or tags.
- Microcast.club is a webring-based directory of microcasts, or short-form podcasts, created to help you discover new and interesting microcasts!
Curated Lists for Discovery
- Kicks Condor has a hand-curated list of interesting websites that he updates (roughly) monthly on his site at https://www.kickscondor.com/hrefhunt/.
- Warren Ellis published a list of recommended sites to follow via RSS on 2019-10-14: "Blog Diet: A Starter List For Your RSS Reader".
- Jeremy Felt published "Five RSS Feeds I Followed Today" on 2019-11-21, writing:
I followed several new to me feeds today and then decided—why not share? There may be no other way to rediscover the social network that is blogging.
- Chris Aldrich has a following page of people he's subscribed to (including names, avatars, and short descriptions), along with subscribable OPML files that let others quickly follow or sample those people too.
- Colin Walker has a directory on his site of all the people who have Webmentioned his content (including an OPML file)
- Additional details here: https://colinwalker.blog/improving-the-webmentions-directory/
Finding new blogs is a big problem so, if you like this blog you may also appreciate the following people who have all interacted with the site via webmentions
- see also blogroll
- Pocket - best of feature
- WordPress.com/discover - A daily selection of the best content published on WordPress, collected for you by humans who love to read.
- https://belong.io - a website by Andy Baio that surfaces relevant links from his Twitter community
- Twitter has a widely used #FollowFriday or #FF hashtag that people use to Tweet about friends, colleagues, and others that they find interesting to follow, generally with a short statement about why.
- Serendeputy is a small, independently run discovery service similar to Nuzzel. It describes itself as "a personal newsfeed engine. It reads the open web and then organizes and scores it for you. It learns what you like and helps you find something interesting to read. But, here’s what’s different. Unlike your favorite search engines and social networks, Serendeputy is entirely transparent, putting you in control." Paid accounts are available to help support the developer.
- Feedly provides an AI named Leo for filtering and discovering content in one's feed. While not completely granular, the system allows modifying inputs to improve finding content without some of the issues that may come with only having access to an algorithmic feed.
- Inoreader began providing a sort by magic and popularity indicators in their feed reader in early 2020.
IndieWebCamp Related Sessions
- Related to definition one:
- who to follow
- recommendation engine
- related reading
- indie map
- "I love finding personal blogs I've never seen before and learning from terrific blog posts by terrific people. I wish it was easier to find such blogs on a daily basis 💁♀️" @ambrwlsn90 February 8, 2019
- related reading
On the IndieWeb, you are your URL and your URL is you. With a well-designed IndieWeb site, as humans we can discover all the information you choose to share about yourself.
This page is for documenting and brainstorming ways to discover that same information automatically by following a combination of semantic markup and algorithms.
Typical IndieWeb sites have profile and often contact information right there on the home page.
Best publishing practice: mark it all up with hCard, especially your hyperlinks to other profiles with class="u-url url" and rel="me". This will create a representative hCard which can then be used by other sites. Use this tool to check how your hCard gets parsed.
Examples in the wild:
Best parsing practice: when given an indieweb URL that represents a person, parse it for a representative hCard and use that hCard for information about that person. Details:
Separate Contact Page
Some IndieWeb sites have contact information on a separate page from the home page, but linked from the home page. While such links may be obvious to humans, e.g. with text labels like "Contact" or "About", it's not at all obvious to parsers where to find more information.
Examples of sites with separate contact/about pages:
- http://adactio.com/ - contact information at: http://adactio.com/contact/
- http://voxpelli.com/ - about information at: http://voxpelli.com/about (linked to with rel-me from front page)
- https://kartikprabhu.com/ - about information at: https://kartikprabhu.com/about (linked with rel-me from homepage)
- https://jacky.wtf/ - contact information at https://jacky.wtf/contact/
- With a new 'rel' value, e.g. rel="contact-info", parsers could discover such links, follow them, and then parse the destination for a representative hCard.
Most IndieWeb sites have updates on their home page, a stream of updates as it were, for example:
Best parsing practice: parse the given URL for hAtom (and preferably microformats2 h-entry).
Others have a simple introduction/contact page as their home page, and provide updates at another URL, for example:
- ... old adactio.com
- ... any current examples? Maybe folks that have only an IndieWeb contact home page and then link to a separate page for their blog?
- With a new 'rel' value, e.g. rel="updates", parsers could discover links to separate "journal", "updates", "notes", "stream" pages and then parse those for hAtom and h-entry.
To discover information about a post on a post permalink page:
- Parse the page for h-entry and retrieve post name, summary, URL, contents, and author accordingly.
Summary: (see main authorship page for full algorithm)
- If the author is also an h-card, parse that for more information about the author such as name, photo, logo, URL, etc.
- If there is no p-author, then look for a rel-author link.
- Follow it, and retrieve the representative h-card from the destination for author information
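The summary above can be sketched in code. This is a minimal illustration, not the full authorship algorithm: it assumes the page has already been run through a microformats2 parser that emits the standard parsed JSON shape (`items` and `rels`), and the function name `discover_author` and the returned keys (including `follow_url`) are hypothetical names chosen for this example.

```python
from typing import Any, Dict, List, Optional


def _first(values: Optional[List[Any]]) -> Any:
    """Return the first value of a microformats2 property list, or None."""
    return values[0] if values else None


def discover_author(parsed: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Sketch of the authorship summary, given parsed microformats2 JSON.

    Returns a dict with name/photo/url, or with follow_url set when the
    caller still has to fetch a rel-author page and read its h-card.
    """
    # Find the first h-entry on the permalink page.
    entry = next(
        (item for item in parsed.get("items", []) if "h-entry" in item.get("type", [])),
        None,
    )
    if entry is None:
        return None

    result = {"name": None, "photo": None, "url": None, "follow_url": None}
    author = _first(entry.get("properties", {}).get("author"))

    # Case 1: the author property is itself an embedded h-card.
    if isinstance(author, dict) and "h-card" in author.get("type", []):
        props = author.get("properties", {})
        result["name"] = _first(props.get("name"))
        result["photo"] = _first(props.get("photo"))
        result["url"] = _first(props.get("url"))
        return result

    # Case 2: the author property is plain text; keep it as the name.
    if isinstance(author, str) and author.strip():
        result["name"] = author.strip()
        return result

    # Case 3: no author property; fall back to a rel-author link to follow.
    result["follow_url"] = _first(parsed.get("rels", {}).get("author"))
    return result if result["follow_url"] else None
```

When `follow_url` is set, the caller would fetch that URL and look for the representative h-card there; that step is left out of the sketch.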
Assuming a post permalink page, see:
- If the h-entry has a dt-published property, use that.
- This is the only thing that's been implemented. Here are some fallbacks:
- else the following sources may be used to infer it (brainstorming)
- Check the URL path. If the start (? or somewhere near the start?) of the URL path of the permalink is:
/YYYY/MM/DD/ = publication date in ISO format YYYY-MM-DD
- Sites using this: aaronparecki.com
/YYYY/DDD/ = publication date in ISO ordinal format YYYY-DDD
- Sites using this: tantek.com
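The URL path fallback above could look something like the following. This is only a brainstorming-level sketch: it assumes the date appears at the very start of the path (the text leaves open whether "somewhere near the start" should also count), and `infer_published_date` is a hypothetical name.

```python
import re
from datetime import date, timedelta
from typing import Optional
from urllib.parse import urlparse

# /YYYY/MM/DD/ at the start of the path (calendar date).
_CALENDAR = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})(?:/|$)")
# /YYYY/DDD/ at the start of the path (ordinal date: day of the year).
_ORDINAL = re.compile(r"^/(\d{4})/(\d{3})(?:/|$)")


def infer_published_date(permalink: str) -> Optional[date]:
    """Guess a publication date from a permalink path, or return None."""
    path = urlparse(permalink).path
    try:
        match = _CALENDAR.match(path)
        if match:
            year, month, day = (int(part) for part in match.groups())
            return date(year, month, day)  # raises ValueError if impossible

        match = _ORDINAL.match(path)
        if match:
            year, day_of_year = int(match.group(1)), int(match.group(2))
            if day_of_year < 1:
                return None
            candidate = date(year, 1, 1) + timedelta(days=day_of_year - 1)
            # Reject day numbers that spill over into the following year.
            return candidate if candidate.year == year else None
    except ValueError:
        return None
    return None
```

A result from this function is an inference only, so a parser would presumably still prefer an explicit dt-published value whenever one exists.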
To find the POSSE copies of an original post, on that original post's permalink page:
- Look for rel=syndication links
- Treat their destinations as POSSE'd copies of the original post
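As a rough sketch of those two steps, the following uses only the Python standard library to collect rel=syndication links from a permalink page's HTML. The names `SyndicationLinkFinder` and `find_posse_copies` are made up for this example, and fetching the page is left to the caller.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class SyndicationLinkFinder(HTMLParser):
    """Collects href values of <a> and <link> elements with rel=syndication."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated list of values, compared case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "syndication" in rels and href:
            # Resolve relative URLs against the permalink.
            self.links.append(urljoin(self.base_url, href))


def find_posse_copies(html: str, permalink: str) -> List[str]:
    """Return the URLs that the original post claims as its POSSE copies."""
    finder = SyndicationLinkFinder(permalink)
    finder.feed(html)
    finder.close()
    return finder.links
```

For example, a page containing `<a rel="syndication" href="https://twitter.com/example/status/1">` would yield that one Twitter URL.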
To find an original post from a POSSE'd copy, on that POSSE'd copy's permalink page:
- Follow original-post-discovery
Real Time Updates
... rel=hub ... on link to PuSH hub for your updates.
For an improved Salmon key-discovery flow (that is, not using DRY-violating XRD files and web-breaking email-like identifiers), we need to expose public keys somehow.
- http://microformats.org/wiki/existing-rel-values#HTML5_link_type_extensions (specifically the
- Examples of how people are exposing their public key in the wild
Public key exchange was also discussed at the IndieWeb dinner on November 1st, 2013, in the context of doing the exchange in person using QR codes. Potentially the QR code could simply be a URL pointing to the user's website, where a public key could be extracted from microformats markup.
Help or About
HTML has the rel="help" value, but it's not clear that it conveys the "about" kind of resources you link to.
- Tom Morris has documented the IndieWeb support on his site on a separate page.
There are a bunch of legacy discovery methods that are worth documenting, in case you want to interoperate with legacy systems that depend on them.
... rel=alternate ... type=...+xml ...
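For interoperating with such legacy systems, a consumer might look for rel=alternate links whose type ends in +xml, roughly as below. This is a sketch under assumptions: the names `LegacyFeedFinder` and `find_legacy_feeds` are hypothetical, and the choice to accept any +xml type (for instance application/atom+xml or application/rss+xml) is an interpretation of the elided pattern above rather than something this page specifies.

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urljoin


class LegacyFeedFinder(HTMLParser):
    """Collects (media type, URL) pairs for rel=alternate links typed as +xml."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.feeds: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        # Drop any parameters such as "; charset=utf-8" before comparing.
        media_type = (attr.get("type") or "").split(";")[0].strip().lower()
        href = attr.get("href")
        if "alternate" in rels and media_type.endswith("+xml") and href:
            self.feeds.append((media_type, urljoin(self.base_url, href)))


def find_legacy_feeds(html: str, page_url: str) -> List[Tuple[str, str]]:
    """Return the separate feeds that a page advertises, in document order."""
    finder = LegacyFeedFinder(page_url)
    finder.feed(html)
    finder.close()
    return finder.feeds
```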
We should deprecate separate feeds as:
- Separate feeds are not technically necessary (hAtom or h-entry on visible HTML works just fine).
- Separate feeds violate the DRY principle
- Related to the developer-centric definition
- Here's a thread of good personal IndieWeb sites and commentary on them too! Good way to seed an IndieWeb search engine (then crawl links from there!) https://twitter.com/zachleat/status/1245781964214480900
- "Step 1. Reply to this tweet with a link to your personal website.
Step 2. Find a reply to this tweet and say something nice about their website.
Let’s get some positivity going, y’all ❤️" @zachleat April 2, 2020