html5rdf is a pure-Python library for parsing HTML to DOMFragment objects for
use in RDFLib. html5rdf is a fork of html5lib-modern.
It is designed to conform to the WHATWG HTML specification, as implemented by
all major web browsers.
html5lib-modern is designed as a drop-in replacement for html5lib that exposes a
new html5lib module without Python 2 support and without the legacy dependencies
on six and webencodings.
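
To illustrate the HTML-to-DOM-fragment parsing described above, here is a minimal sketch. It assumes html5rdf keeps the `HTMLParser` and `treebuilders.getTreeBuilder` interface inherited from html5lib. The helper name `parse_html_fragment` and the sample markup are hypothetical and not part of the library.

```python
import importlib.util


def parse_html_fragment(markup):
    """Parse an HTML snippet into a DOM fragment.

    Hypothetical helper; assumes html5rdf exposes the html5lib-style
    HTMLParser and treebuilders.getTreeBuilder API.
    """
    import html5rdf  # imported lazily so this module loads without html5rdf

    # The "dom" tree builder produces xml.dom.minidom nodes.
    parser = html5rdf.HTMLParser(tree=html5rdf.treebuilders.getTreeBuilder("dom"))
    return parser.parseFragment(markup)


if importlib.util.find_spec("html5rdf") is not None:
    fragment = parse_html_fragment("<p>Hello <b>RDF</b></p>")
    # Serialize each top-level node of the fragment individually.
    for node in fragment.childNodes:
        print(node.toxml())
else:
    print("html5rdf is not installed; skipping the example")
```

The import is deferred into the helper so that the snippet degrades gracefully when html5rdf is not installed.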