Diffstat (limited to '')
-rw-r--r-- | textproc/py-html-text/pkg-descr | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/textproc/py-html-text/pkg-descr b/textproc/py-html-text/pkg-descr
new file mode 100644
index 000000000000..3ded2dd0baf6
--- /dev/null
+++ b/textproc/py-html-text/pkg-descr
@@ -0,0 +1,7 @@
+Extract text from HTML.
+
+html_text is a library for extracting text from HTML, with a few handy
+features:
+- It removes leading and trailing whitespace
+- It handles HTML entities
+- It uses lxml for parsing