process_unicode.py

Module for processing Unicode-based corpora for autosuggestion and autocompletion generation.

scribe_data.unicode.process_unicode.gen_emoji_lexicon(language: str, emojis_per_keyword: int)

Generate a dictionary that maps keywords (keys) to the Unicode emoji associated with them (values).

Parameters:
language : str (default=None)

The language keywords are being generated for.

emojis_per_keyword : int (default=None)

The maximum number of emojis that should be generated per keyword.

Returns:
dict

A dictionary mapping keywords to their emoji Unicode values, which is saved locally or uploaded to Scribe apps.
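
The general shape of a keyword-to-emoji dictionary with a per-keyword cap can be sketched as follows. This is not the implementation of gen_emoji_lexicon: the helper name build_keyword_lexicon, the sample_annotations input, and the plain list-of-emoji value format are all assumptions made for illustration, and the real function may read its data from a different source and return richer values.

```python
from collections import defaultdict


def build_keyword_lexicon(annotations, emojis_per_keyword):
    """Hypothetical helper: invert an emoji -> keywords mapping.

    Returns a dict of keyword -> list of emojis, keeping at most
    `emojis_per_keyword` emojis for each keyword.
    """
    lexicon = defaultdict(list)
    for emoji, keywords in annotations.items():
        for keyword in keywords:
            key = keyword.lower()
            # Skip duplicates and stop adding once the cap is reached.
            if emoji not in lexicon[key] and len(lexicon[key]) < emojis_per_keyword:
                lexicon[key].append(emoji)
    return dict(lexicon)


# Made-up sample data for illustration only (not taken from any real corpus).
sample_annotations = {
    "\U0001F600": ["face", "grin"],            # grinning face
    "\U0001F603": ["face", "smile"],           # grinning face with big eyes
    "\U0001F604": ["face", "smile", "happy"],  # grinning face with smiling eyes
}

lexicon = build_keyword_lexicon(sample_annotations, emojis_per_keyword=2)
```

With the cap set to 2, the keyword "face" keeps only the first two emojis encountered, while "happy" keeps its single match. A real lexicon would likely rank candidates before truncating rather than relying on input order.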