Content extraction

Content extraction is the set of techniques and tools for getting the main or structured content out of web pages.
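As a rough illustration of the idea, here is a minimal sketch in Python using only the standard library. It is a toy heuristic, not the algorithm used by readability.js or any of the tools listed on this page: it keeps paragraphs that are reasonably long and not dominated by link text, and ignores anything inside typical page-chrome elements. The names `ParagraphCollector` and `extract_main_text`, the `SKIP_TAGS` set, and both thresholds are assumptions made up for this example.

```python
from html.parser import HTMLParser

# Assumption of this sketch: text inside these elements is page chrome, not content.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}


class ParagraphCollector(HTMLParser):
    """Collect the text of each <p> that is not nested inside a SKIP_TAGS element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0        # > 0 while inside any SKIP_TAGS element
        self.in_paragraph = False
        self.in_link = False
        self.text = []             # text chunks of the current paragraph
        self.link_chars = 0        # characters of link text in the current paragraph
        self.paragraphs = []       # finished (text, link_chars) pairs

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self.flush()           # an unclosed previous <p> ends here
            self.in_paragraph = True
        elif tag == "a":
            self.in_link = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "p":
            self.flush()
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self.text.append(data)
            if self.in_link:
                self.link_chars += len(data)

    def flush(self):
        """Store the current paragraph (if any) and reset the per-paragraph state."""
        text = " ".join("".join(self.text).split())  # collapse whitespace
        if text:
            self.paragraphs.append((text, self.link_chars))
        self.text, self.link_chars, self.in_paragraph = [], 0, False


def extract_main_text(html, min_chars=80, max_link_ratio=0.5):
    """Return paragraphs that are long enough and not mostly link text."""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    collector.flush()
    kept = [
        text
        for text, link_chars in collector.paragraphs
        if len(text) >= min_chars and link_chars / len(text) <= max_link_ratio
    ]
    return "\n\n".join(kept)
```

Calling `extract_main_text(html_string)` returns the kept paragraphs joined by blank lines. Real extractors typically score whole DOM subtrees rather than individual paragraphs, so this sketch only conveys the flavor of the heuristics involved.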

tools
If you have experience with any of these tools, please add your experience.
 * https://github.com/n1k0/readable-proxy – based on readability.js, the basis of
 * readability-lxml
   * limited
 * breadability
   * limited
 * newspaper3k
   * memory issues?
 * https://mercury.postlight.com/web-parser/

indieweb/silo specific tools

 * XRay
 * granary