content addressing

From IndieWeb


content addressing is a way of looking up pages or files by a hash of their contents rather than by a URL on their origin server.
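As a rough illustration of the idea, here is a minimal sketch of a content-addressed store. The `ContentStore` class and its method names are hypothetical, and SHA-256 with a hex-encoded key is just one possible choice of hash and encoding.

```python
import hashlib


class ContentStore:
    """Toy in-memory store keyed by the SHA-256 hex digest of each blob (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content alone, never from its location.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Re-hash on read so a corrupted or substituted blob is detected.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored content does not match its address")
        return data


store = ContentStore()
address = store.put(b"<h1>hello</h1>")
page = store.get(address)
```

Identical content always gets the same address, so anyone holding the hash can verify a copy regardless of which server supplied it.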

Use in existing web standards

CSP (Content Security Policy) allows hashes to be specified as sources, e.g.:

 Content-Security-Policy: default-src 'self';
                         script-src 'self' https://example.com 'sha256-<base64-encoded hash>'
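A minimal sketch of how such a hash source could be computed, assuming the hash covers the exact text of an inline script encoded as UTF-8. The function name `csp_hash_source` and the sample script are hypothetical.

```python
import base64
import hashlib


def csp_hash_source(script_text: str, algorithm: str = "sha256") -> str:
    """Return a quoted CSP hash source such as 'sha256-...' for an inline script."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("unsupported algorithm: " + algorithm)
    digest = hashlib.new(algorithm, script_text.encode("utf-8")).digest()
    return "'{}-{}'".format(algorithm, base64.b64encode(digest).decode("ascii"))


# Hypothetical inline script; whitespace counts, so hash the exact text served.
source = csp_hash_source("console.log('hi');")
header = (
    "Content-Security-Policy: default-src 'self'; "
    "script-src 'self' https://example.com " + source
)
```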

Subresource Integrity (SRI) does too. From the specification:

 Conformant user agents must support the SHA-256, SHA-384 and SHA-512 
 cryptographic hash functions for use as part of a request’s integrity 
 metadata and may support additional hash functions.
 User agents should refuse to support known-weak hashing functions
 like MD5 or SHA-1 and should restrict supported hashing functions 
 to those known to be collision-resistant.
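The following is a simplified sketch of checking a fetched body against integrity metadata. It is not the full algorithm from the specification. The function name `matches_integrity` is hypothetical, only the three required hash functions are accepted, and the strongest algorithm present is the only one compared.

```python
import base64
import hashlib

# Required algorithms, ranked weakest to strongest; md5 and sha1 are deliberately absent.
STRENGTH = {"sha256": 1, "sha384": 2, "sha512": 3}


def matches_integrity(body: bytes, integrity: str) -> bool:
    """Return True if body matches a hash of the strongest supported algorithm listed."""
    by_algorithm = {}
    for token in integrity.split():
        algorithm, sep, rest = token.partition("-")
        if sep and algorithm in STRENGTH:
            # Drop any "?options" suffix after the base64 value.
            by_algorithm.setdefault(algorithm, []).append(rest.split("?", 1)[0])
    if not by_algorithm:
        # This sketch's choice: treat "no usable hash" as a failed check.
        return False
    strongest = max(by_algorithm, key=STRENGTH.get)
    actual = base64.b64encode(hashlib.new(strongest, body).digest()).decode("ascii")
    return actual in by_algorithm[strongest]


body = b"alert(1);"
metadata = "sha384-" + base64.b64encode(hashlib.sha384(body).digest()).decode("ascii")
ok = matches_integrity(body, metadata)
```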

Requests handled by service workers expose the integrity value:

 Request.integrity Read only
   Contains the subresource integrity value of the request 
   (e.g., sha256-BpfBw7ivV8q2jLiT13fxDYAe2tJllusRSZ273h2nFSE=).

Proposed new standards

Some discussion at IETF - see this presentation and drafts:

Note the use of this header:

  x-object-meta-sha1base36: 1d91dx0894wjewukeyxu56os5uhx4ph
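The page does not define how this value is encoded. As a sketch, assuming (from the header name and the 31-character sample) that it is the SHA-1 digest written as a base-36 number and left-padded with zeros to 31 characters, it could be computed like this. The function name `sha1_base36` is hypothetical.

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1_base36(data: bytes, width: int = 31) -> str:
    """Encode the SHA-1 digest of data in base 36, zero-padded to `width` characters.

    Assumption: this mirrors the format of the header value above; the source
    does not confirm the exact encoding.
    """
    number = int.from_bytes(hashlib.sha1(data).digest(), "big")
    digits = ""
    while number:
        number, remainder = divmod(number, 36)
        digits = ALPHABET[remainder] + digits
    # 36**31 exceeds 2**160, so 31 characters always hold a full SHA-1 digest.
    return digits.rjust(width, "0")


header_value = sha1_base36(b"example file contents")
```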

Strange de facto standards

HTTP Extensions for a Content-Addressable Web (2001): no longer on the web except as a mailing list archive.

Magnet URIs: the Wikipedia article is more current than the site or the draft.

Possible integrations

If servers used a prefixed hash in one of the formats above (such as the sha256- form used by CSP and SRI) as an ETag, clients could start relying on content hashes incrementally.
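A minimal sketch of that idea, assuming the ETag is the SRI-style `sha256-` value wrapped in the double quotes that ETags require. The function names `content_etag` and `respond` are hypothetical, and a real server would do this inside its HTTP framework.

```python
import base64
import hashlib
from typing import Dict, Optional, Tuple


def content_etag(body: bytes) -> str:
    """Build a strong ETag whose opaque value is a prefixed content hash."""
    digest = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    return '"sha256-{}"'.format(digest)


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body), answering 304 when the client's hash matches."""
    etag = content_etag(body)
    candidates = []
    for tag in (if_none_match or "").split(","):
        tag = tag.strip()
        if tag.startswith("W/"):
            tag = tag[2:]  # If-None-Match uses weak comparison
        candidates.append(tag)
    if etag in candidates or "*" in candidates:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


status, headers, _ = respond(b"<p>hello</p>", None)
revalidated, _, _ = respond(b"<p>hello</p>", headers["ETag"])
```

A client that ignores the prefix still gets ordinary ETag revalidation, while one that understands it can also verify the body against the hash.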

See Also