Diffstat (limited to 'textproc/py-tokenizer/pkg-descr')
-rw-r--r-- textproc/py-tokenizer/pkg-descr | 5
1 file changed, 5 insertions, 0 deletions
diff --git a/textproc/py-tokenizer/pkg-descr b/textproc/py-tokenizer/pkg-descr
new file mode 100644
index 000000000000..c1f700edffe5
--- /dev/null
+++ b/textproc/py-tokenizer/pkg-descr
@@ -0,0 +1,5 @@
+Tokenizer: A tokenizer for Icelandic text
+
+Tokenization is a necessary first step in many natural language processing
+tasks, such as word counting, parsing, spell checking, corpus generation, and
+statistical analysis of text.