tidy is a library for cleaning up malformed HTML before parsing it for microformats, a typical IndieWeb use case. It is available in most PHP setups, but you may have to explicitly enable it in your php.ini or phprc file, depending on your web host.
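As a rough sketch of that workflow, the snippet below checks whether the tidy extension is loaded and, if so, repairs a fragment with PHP's `tidy_repair_string()` before it would be handed to a microformats parser. The helper name `repair_html_for_mf2`, the sample markup, and the option values are assumptions chosen for illustration; only `extension_loaded()` and `tidy_repair_string()` are standard PHP functions.

```php
<?php
// Hypothetical helper (name is an assumption): clean up markup with tidy
// when the extension is available, otherwise return the input unchanged.
function repair_html_for_mf2(string $html): string
{
    if (!extension_loaded('tidy')) {
        // tidy is not enabled on this host (check php.ini or phprc).
        return $html;
    }

    // Illustrative option choices, not required values.
    $config = [
        'output-html' => true, // emit HTML rather than XHTML/XML
        'wrap'        => 0,    // do not hard-wrap long lines
    ];

    $repaired = tidy_repair_string($html, $config, 'utf8');

    // Fall back to the original markup if tidy could not produce a string.
    return is_string($repaired) ? $repaired : $html;
}

// Sample input with an unclosed <p>; the class names are example microformats.
$clean = repair_html_for_mf2('<div class="h-card"><p class="p-name">Example Person</div>');
```

The repaired string in `$clean` can then be passed to whichever microformats parser you use. Falling back to the raw input keeps the code working on hosts where tidy is not enabled, at the cost of parsing the markup as-is.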

