indie-stats

indie-stats is an open source Python project that gathers mf2 (microformats2) data for IndieWeb domains and generates stats from it.

Generates a domains.json file for each domain with metadata for the site and its status. This is needed because quite a few of the domains return a 404 or time out.

Each domain is stored as flat files:

bear.im
- bear.im.json -- metadata for the domain
- 20140921T001159_bear.im.json -- data for the domain at the time it was polled, one per poll
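
As a rough illustration of the naming scheme above, the two file names could be built as follows. This is a sketch, not the project's actual code: the helper names and the `root` argument are hypothetical, and it assumes each domain gets its own directory named after the domain.

```python
from datetime import datetime
from pathlib import Path


def metadata_path(root: Path, domain: str) -> Path:
    """Path of the per-domain metadata file, e.g. bear.im/bear.im.json.

    Assumption: every domain has its own directory under `root`.
    """
    return root / domain / f"{domain}.json"


def poll_path(root: Path, domain: str, polled_at: datetime) -> Path:
    """Path of one poll snapshot, e.g. bear.im/20140921T001159_bear.im.json."""
    # Timestamp layout copied from the example file name: YYYYMMDDTHHMMSS
    stamp = polled_at.strftime("%Y%m%dT%H%M%S")
    return root / domain / f"{stamp}_{domain}.json"
```

Because the timestamp sorts lexicographically, listing a domain's directory in name order would also list its polls in chronological order.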

Routes

https://indie-stats.com/domain to claim your domain or mark it as excluded
https://indie-stats.com/domain?domain=bear.im to view domain details
https://indie-stats.com/login
https://indie-stats.com/logout
https://indie-stats.com/api/v1/domains
- https://indie-stats.com/api/v1/domains/<domain>
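
For callers of the API routes listed above, the URLs could be assembled along these lines. The helper names are hypothetical, and the response format is not described here, so this sketch only builds the URLs and does not fetch or parse anything.

```python
from urllib.parse import quote

# Base URL taken from the routes listed above.
API_BASE = "https://indie-stats.com/api/v1"


def domains_url() -> str:
    """URL of the domain list route."""
    return f"{API_BASE}/domains"


def domain_url(domain: str) -> str:
    """URL of the route for a single domain, with the domain percent-encoded."""
    return f"{API_BASE}/domains/{quote(domain, safe='')}"
```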

Features

Domain owners can log in to claim their domains and/or exclude them from being processed
Crawl IndieWeb domains and store
- mf2 data
- html content
- request and response headers
Maintain metadata for domains showing their current status
Domain list is seeded from chat-names

Working On

Storing request and response headers

Generate stats

For each crawled domain, the domain, timestamp, and data will be passed to a master "cruncher", which will loop through a list of stat-generating apps. The JSON blob returned by each app will be added, along with its namespace and timestamp, to the stat history for the domain. Stat items to calculate:

have a header for auth
use indieauth as their auth item
have a header for webmention
have a header for micropub
have an h-card
have an h-entry
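
The cruncher loop described in this section might look roughly like the sketch below. Everything here is an assumption made for illustration: the function names, the shape of the history entries, and the idea that the crawled data holds the parsed mf2 under an "mf2" key in the usual parsed form (a dict with an "items" list whose entries carry a "type" list). Only the h-card and h-entry checks are shown; the header-based stats would follow the same pattern.

```python
from datetime import datetime
from typing import Any, Callable, Dict, List, Tuple

# A stat generator takes (domain, timestamp, data) and returns a JSON-able dict.
StatGenerator = Callable[[str, datetime, Dict[str, Any]], Dict[str, Any]]


def _has_type(data: Dict[str, Any], mf2_type: str) -> bool:
    # Only top-level mf2 items are checked; nested items are ignored here.
    items = data.get("mf2", {}).get("items", [])
    return any(mf2_type in item.get("type", []) for item in items)


def hcard_stat(domain: str, timestamp: datetime, data: Dict[str, Any]) -> Dict[str, Any]:
    return {"has_hcard": _has_type(data, "h-card")}


def hentry_stat(domain: str, timestamp: datetime, data: Dict[str, Any]) -> Dict[str, Any]:
    return {"has_hentry": _has_type(data, "h-entry")}


# Hypothetical registry of (namespace, generator) pairs the cruncher loops over.
GENERATORS: List[Tuple[str, StatGenerator]] = [
    ("hcard", hcard_stat),
    ("hentry", hentry_stat),
]


def crunch(domain, timestamp, data, history, generators=GENERATORS):
    """Run every generator and append its blob to the domain's stat history."""
    for namespace, generate in generators:
        history.append({
            "namespace": namespace,
            "timestamp": timestamp.isoformat(),
            "stats": generate(domain, timestamp, data),
        })
    return history
```

Keeping each generator as a plain function with the same signature means a new stat can be added by appending one entry to the registry, without touching the loop.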

Stat retrieval

Add an endpoint that accepts a domain and a date range and returns the JSON blob of stats for that domain within the range.
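
The core of such an endpoint is a date-range filter over a domain's stat history. A minimal sketch, assuming (hypothetically) that the history is a list of dicts, each carrying an ISO 8601 "timestamp" string, and that both ends of the range are inclusive:

```python
from datetime import datetime
from typing import Any, Dict, List


def stats_in_range(
    history: List[Dict[str, Any]], start: datetime, end: datetime
) -> List[Dict[str, Any]]:
    """Return the history entries whose timestamp lies in [start, end]."""
    selected = []
    for entry in history:
        # Assumption: timestamps are stored as ISO 8601 strings.
        when = datetime.fromisoformat(entry["timestamp"])
        if start <= when <= end:
            selected.append(entry)
    return selected
```

For example, given entries stamped in September and October 2014, a range covering only September would return just the September entries, in their original order.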