process_unicode.py

Module for processing Unicode-based corpora for autocompletion generation.

scribe_data.unicode.process_unicode.gen_emoji_lexicon(language: str, emojis_per_keyword: int) -> dict[str, list]

Generate a dictionary that maps keywords (keys) to the emoji Unicode values associated with them (values).

Parameters:
language : str

The language for which keywords are being generated.

emojis_per_keyword : int

The maximum number of emojis that should be generated per keyword.

Returns:
dict

A dictionary mapping keywords to emoji Unicode values, which is saved locally or uploaded to Scribe apps.
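
To illustrate the shape of the returned dictionary and the effect of the per-keyword limit, here is a minimal sketch. It does not call scribe_data and is not the module's implementation: the helper name limit_emojis_per_keyword, the sample keywords, and the emoji candidates are all hypothetical, chosen only to show a keyword-to-emoji mapping being capped at emojis_per_keyword entries.

```python
def limit_emojis_per_keyword(
    candidates: dict[str, list[str]], emojis_per_keyword: int
) -> dict[str, list[str]]:
    """Keep at most `emojis_per_keyword` emojis for each keyword.

    Hypothetical helper for illustration only; the real function may
    rank or select emojis differently.
    """
    return {
        keyword: emojis[:emojis_per_keyword]
        for keyword, emojis in candidates.items()
    }


# Hypothetical sample data: each keyword maps to candidate emoji characters.
candidates = {
    "smile": ["\U0001F600", "\U0001F604", "\U0001F642"],
    "heart": ["\u2764\uFE0F"],
}

# Cap every keyword at two emojis.
lexicon = limit_emojis_per_keyword(candidates, emojis_per_keyword=2)
```

With a limit of 2, the "smile" entry keeps only its first two candidates, while "heart" is unchanged because it already has fewer entries than the limit.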