normalize_forms.py


Order the QIDs from missing_unique_forms according to lexeme_form_metadata.yaml.

scribe_data.check.check_missing_forms.normalize_forms.sort_qids_in_list(qids_lists: list[list[str]]) -> list[list[str]]

Sort QIDs within each sublist based on their predefined positions.

This function sorts the QIDs in each sublist of qids_lists according to the positions defined in lexeme_form_metadata. QIDs not found in the metadata are placed at the end.

Parameters:
qids_lists : List[List[str]]

A list of lists, where each sublist contains QIDs.

Returns:
list

A new list of lists, with QIDs in each sublist sorted by position.
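The behavior described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the `QID_POSITIONS` mapping, its values, and the function name `sort_qids_in_list_sketch` are assumptions made for this example, whereas the real function reads positions from lexeme_form_metadata.

```python
# Hypothetical QID -> position mapping (assumption for illustration only;
# the real positions are defined in lexeme_form_metadata).
QID_POSITIONS = {"Q110786": 0, "Q146786": 1, "Q131105": 2}


def sort_qids_in_list_sketch(qids_lists):
    """Return a new list of lists with each sublist sorted by position."""
    return [
        # QIDs missing from the mapping get an infinite position,
        # so they end up at the end of their sublist.
        sorted(sublist, key=lambda qid: QID_POSITIONS.get(qid, float("inf")))
        for sublist in qids_lists
    ]


example = [["Q131105", "Q999999", "Q110786"], ["Q146786", "Q110786"]]
sorted_example = sort_qids_in_list_sketch(example)
print(sorted_example)
```

Because `sorted` is stable, QIDs that are absent from the mapping keep their original relative order at the end of each sublist, and the input list is left unmodified.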

scribe_data.check.check_missing_forms.normalize_forms.sort_qids_by_position(nested_qids: list[list[str]]) -> list[list[str]]

Sort lists of QIDs based on their predefined positions and sublist length.

This function sorts the sublists within nested_qids based on two criteria:

1. The length of the sublist (shorter lists come first).
2. The positions of the QIDs within each sublist, as defined in lexeme_form_metadata.

Parameters:
nested_qids : List[List[str]]

A list of lists, where each sublist contains QIDs.

Returns:
list

A new list of lists, sorted according to the defined criteria.
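One way to express the two criteria is a tuple sort key, sketched below. This is an illustration under stated assumptions rather than the library's code: `QID_POSITIONS`, its values, and `sort_qids_by_position_sketch` are hypothetical names, and placing unknown QIDs last is an assumption carried over from the description of sort_qids_in_list.

```python
# Hypothetical QID -> position mapping (assumption for illustration only;
# the real positions are defined in lexeme_form_metadata).
QID_POSITIONS = {"Q110786": 0, "Q146786": 1, "Q131105": 2}


def sort_qids_by_position_sketch(nested_qids):
    """Return a new list of sublists ordered by length, then by QID positions."""

    def sort_key(sublist):
        # Assumption: unknown QIDs are treated as having the largest position.
        positions = [QID_POSITIONS.get(qid, float("inf")) for qid in sublist]
        # Tuples compare element by element: length first, then positions.
        return (len(sublist), positions)

    return sorted(nested_qids, key=sort_key)


nested = [
    ["Q146786", "Q110786"],
    ["Q131105"],
    ["Q110786", "Q131105"],
    ["Q110786"],
]
ordered = sort_qids_by_position_sketch(nested)
print(ordered)
```

In this sketch the sublists themselves are not reordered internally; only their order within the outer list changes.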