check.py

Functions for checking data exports against their contracts.

scribe_data.cli.contracts.check.check_contracts(output_dir: str | None = None) -> None

Check data contracts in the specified or default output directory to ensure data completeness.

Parameters:
output_dir : Optional[str], optional

Directory containing exported contract data. If None, DEFAULT_JSON_EXPORT_DIR is used.
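
The fallback described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `resolve_export_dir` and the placeholder value assigned to `DEFAULT_JSON_EXPORT_DIR` are assumptions made for this example.

```python
from pathlib import Path
from typing import Optional

# Placeholder value for illustration only; the real constant is defined by scribe_data.
DEFAULT_JSON_EXPORT_DIR = Path("example_json_export")


def resolve_export_dir(output_dir: Optional[str] = None) -> Path:
    """Return the directory to check, falling back to the default when None."""
    if output_dir is None:
        return DEFAULT_JSON_EXPORT_DIR
    return Path(output_dir)
```

Calling `resolve_export_dir()` with no argument yields the default directory, while `resolve_export_dir("my_dir")` yields `Path("my_dir")`.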

scribe_data.cli.contracts.check.check_contract_data_completeness(export_dir: Path, language: str | None = None) -> dict[str, dict[str, list[str]]]

Validate exported data contracts against their metadata requirements.

This function checks whether the exported data for a given language (or for all languages) contains every required form specified in the data contracts.

Parameters:
export_dir : Path

Directory containing exported contract data.

language : Optional[str], optional

Specific language to check. If None, checks all languages in the directory.

Returns:
Dict[str, Dict[str, List[str]]]

A nested dictionary containing missing forms by language and data type.

The expected structure is:

    {
        'Language Name': {
            'nouns': ['missing_noun_form1', 'missing_noun_form2'],
            'verbs': ['missing_verb_form1']
        }
    }
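
As a rough sketch of how a nested dictionary of this shape could be produced, the example below compares a set of required forms against exported forms. The function `find_missing_forms`, the sample data, and the choice to leave out languages with nothing missing are all assumptions for illustration; the source does not say how `check_contract_data_completeness` reads contracts or handles complete languages.

```python
from typing import Dict, List

# Hypothetical contract: required forms per data type.
required_forms: Dict[str, List[str]] = {
    "nouns": ["singular", "plural"],
    "verbs": ["infinitive"],
}

# Hypothetical exported forms per language and data type.
exported_forms: Dict[str, Dict[str, List[str]]] = {
    "English": {"nouns": ["singular", "plural"], "verbs": ["infinitive"]},
    "German": {"nouns": ["singular"], "verbs": []},
}


def find_missing_forms(
    required: Dict[str, List[str]],
    exported: Dict[str, Dict[str, List[str]]],
) -> Dict[str, Dict[str, List[str]]]:
    """Collect required forms that are absent, keyed by language and data type."""
    missing: Dict[str, Dict[str, List[str]]] = {}
    for language, by_type in exported.items():
        language_missing: Dict[str, List[str]] = {}
        for data_type, forms in required.items():
            present = set(by_type.get(data_type, []))
            absent = [form for form in forms if form not in present]
            if absent:
                language_missing[data_type] = absent
        # Assumption: languages with no missing forms are omitted from the result.
        if language_missing:
            missing[language] = language_missing
    return missing


result = find_missing_forms(required_forms, exported_forms)
```

With the sample data above, `result` contains an entry for German only, listing `plural` under `nouns` and `infinitive` under `verbs`.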

scribe_data.cli.contracts.check.print_missing_forms(missing_forms: dict[str, dict[str, list[str]]]) -> None

Print missing forms from data contracts.

Parameters:
missing_forms : Dict[str, Dict[str, List[str]]]

A dictionary of missing forms, structured as returned by check_contract_data_completeness().
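
To illustrate how a dictionary of this shape might be turned into readable output, here is a small sketch. The helper `format_missing_forms` and its line layout are hypothetical; the actual output format of `print_missing_forms` is not documented here and may differ.

```python
from typing import Dict, List


def format_missing_forms(missing_forms: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Build one header line per language and one indented line per data type."""
    lines: List[str] = []
    for language, by_type in sorted(missing_forms.items()):
        lines.append(f"{language}:")
        for data_type, forms in sorted(by_type.items()):
            lines.append(f"  {data_type}: {', '.join(forms)}")
    return lines


# Hypothetical input shaped like the documented structure.
report = format_missing_forms(
    {"German": {"nouns": ["plural"], "verbs": ["infinitive"]}}
)
for line in report:
    print(line)
```

Returning the lines as a list before printing keeps the formatting step easy to test separately from the console output.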