indie-stats

indie-stats is an open source Python project that will gather mf2 (microformats2) data for IndieWeb domains and generate stats from it.

It generates a domains.json file for each domain, holding metadata for the site and its status. This is needed because quite a few of the domains return a 404 or time out.
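As a rough illustration of the kind of record such a file might hold, the sketch below builds a status entry for one domain. The function name, the field names (domain, status, status_code, checked_at) and the status labels are assumptions made for this example, not the project's actual schema.

```python
import json
from datetime import datetime, timezone


def build_domain_metadata(domain, status_code=None, timed_out=False, checked_at=None):
    """Build a hypothetical metadata record for one crawled domain."""
    if checked_at is None:
        checked_at = datetime.now(timezone.utc)

    # Status labels are illustrative only.
    if timed_out:
        status = "timeout"
    elif status_code is None:
        status = "unknown"
    elif 200 <= status_code < 300:
        status = "ok"
    else:
        status = "http_error"

    return {
        "domain": domain,
        "status": status,
        "status_code": status_code,
        "checked_at": checked_at.isoformat(),
    }


# Example: a domain that answered with a 404.
record = build_domain_metadata("example.com", status_code=404)
print(json.dumps(record, indent=2))
```

Keeping the status next to the last check time would let a later crawl skip or retry domains that keep failing.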

Each domain is stored as flat files:

Routes

Features

  • Domain owners can log in and claim their domains and/or exclude them from being processed
  • Crawl IndieWeb domains and store
    • mf2 data
    • html content
    • request and response headers
  • Maintain metadata for domains showing their current status
  • Domain list is seeded from chat-names

Working On

Storing request and response headers

Generate stats

For each domain crawled, the domain, timestamp and data will be passed to a master "cruncher" that loops through a list of stat-generating apps. The JSON blob returned by each generating app will be added, along with its namespace and the timestamp, to the stat history for the domain. Stat items to calculate, i.e. which domains:

  • have a header for auth
  • use indieauth as their auth item
  • have a header for webmention
  • have a header for micropub
  • have an h-card
  • have an h-entry
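The cruncher loop described above could be sketched roughly as follows. Everything here is an assumption for illustration: the names crunch and STAT_GENERATORS, the shape of the data dict (response_headers plus parsed mf2 JSON), and the two example generators. The webmention check is a plain substring match on the Link header and the h-card check only looks at top-level mf2 items, so both are simplifications.

```python
def has_webmention_header(data):
    """Hypothetical generator: is a webmention rel present in the Link header?"""
    link = data.get("response_headers", {}).get("Link", "")
    return {"webmention_header": 'rel="webmention"' in link}


def has_h_card(data):
    """Hypothetical generator: does any top-level mf2 item have type h-card?"""
    items = data.get("mf2", {}).get("items", [])
    return {"h_card": any("h-card" in item.get("type", []) for item in items)}


# Namespace -> stat-generating callable (names are illustrative).
STAT_GENERATORS = {
    "webmention": has_webmention_header,
    "hcard": has_h_card,
}


def crunch(domain, timestamp, data, history):
    """Run every generator and append its JSON blob to the domain's history."""
    for namespace, generate in STAT_GENERATORS.items():
        history.setdefault(domain, []).append({
            "namespace": namespace,
            "timestamp": timestamp,
            "stats": generate(data),
        })
    return history


# Example crawl result for one domain.
crawl = {
    "response_headers": {"Link": '<https://example.com/wm>; rel="webmention"'},
    "mf2": {"items": [{"type": ["h-card"], "properties": {}}]},
}
history = crunch("example.com", "2015-01-01T00:00:00", crawl, {})
```

Registering generators under a namespace means a new stat can be added without touching the loop itself.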

Stat retrieval

Add an endpoint that accepts a domain and a date range and responds with the JSON blob of stats for that range.
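The lookup behind such an endpoint might look something like the sketch below. It assumes a stat history shaped as a dict of domain to a list of entries with ISO 8601 timestamps (without a timezone suffix); the function name stats_for_range and that layout are hypothetical, and the HTTP routing itself is left out.

```python
import json
from datetime import datetime


def stats_for_range(history, domain, start, end):
    """Return the domain's stat entries with start <= timestamp <= end."""
    return [
        entry
        for entry in history.get(domain, [])
        if start <= datetime.fromisoformat(entry["timestamp"]) <= end
    ]


# Example history with two entries for one domain (illustrative data only).
sample_history = {
    "example.com": [
        {"namespace": "hcard", "timestamp": "2015-01-01T00:00:00", "stats": {"h_card": True}},
        {"namespace": "hcard", "timestamp": "2015-03-01T00:00:00", "stats": {"h_card": False}},
    ]
}

january = stats_for_range(
    sample_history, "example.com", datetime(2015, 1, 1), datetime(2015, 1, 31)
)
response_body = json.dumps(january)  # what the endpoint would send back
```

An unknown domain simply yields an empty list here; a real endpoint might prefer to answer with a 404 instead.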

See Also