Signal-FTS5-Extension is a C ABI library that exposes an FTS5 tokenizer function named signal_tokenizer, which:

- Segments UTF-8 strings into words according to the Unicode standard
- Normalizes words and removes their diacritics
- Converts words to lower case

When used as a custom FTS5 tokenizer, it enables applications to support CJK symbols in full-text search.
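To make the three steps concrete, here is a minimal Python sketch of a comparable pipeline. It is not the library's implementation: the names `fold_word` and `sketch_tokenize` are hypothetical, the regular expression `\w+` is only a rough stand-in for Unicode word segmentation, and the use of NFD decomposition to strip diacritics is an assumption about one reasonable way to do it.

```python
import re
import unicodedata


def fold_word(word: str) -> str:
    """Strip diacritics from a single word and lower-case it (illustrative helper)."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop every character with a non-zero canonical combining class.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def sketch_tokenize(text: str) -> list[str]:
    """Hypothetical stand-in for the segment -> normalize -> lower-case pipeline."""
    # ASSUMPTION: \w+ approximates word boundaries for space-separated scripts
    # only; it does not implement the Unicode segmentation rules.
    words = re.findall(r"\w+", text)
    return [fold_word(w) for w in words]


print(sketch_tokenize("Crème Brûlée ÉCOLE"))  # ['creme', 'brulee', 'ecole']
```

Because of this folding, a query for "creme" would match text containing "Crème". The sketch treats an unbroken run of CJK characters as a single token, so it does not reproduce the word-level segmentation that makes CJK search work; that part relies on the Unicode segmentation step performed by the real tokenizer.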