get_forms.py
Get forms from Wikidata SPARQL query files.
- scribe_data.check.check_missing_forms.get_forms.parse_sparql_files() → dict
Read and parse all SPARQL query files to extract form information.
- Returns:
- dict
Accumulated forms for each language and lexical category. Format: {language: {lexical_category: [forms]}}.
Notes
Recursively searches through WIKIDATA_QUERIES_ALL_DATA_DIR directory for .sparql files and accumulates all form information.
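The recursive search and accumulation described in the note can be sketched as follows. This is a minimal illustration and not the module's actual code: the names accumulate_forms and parse_query are hypothetical, the directory is passed in as an argument rather than read from WIKIDATA_QUERIES_ALL_DATA_DIR, and skipping duplicate forms is an assumption about how results are merged.

```python
from collections import defaultdict
from pathlib import Path


def accumulate_forms(queries_dir, parse_query):
    """Merge per-file results into {language: {lexical_category: [forms]}}.

    Hypothetical helper: parse_query is any callable that takes the text
    of one query file and returns a dict in the same nested format.
    """
    all_forms = defaultdict(lambda: defaultdict(list))

    # rglob descends into every subdirectory; sorting keeps the order stable.
    for sparql_file in sorted(Path(queries_dir).rglob("*.sparql")):
        parsed = parse_query(sparql_file.read_text(encoding="utf-8"))
        for language, categories in parsed.items():
            for category, forms in categories.items():
                for form in forms:
                    # Assumption: a form already seen is not added again.
                    if form not in all_forms[language][category]:
                        all_forms[language][category].append(form)

    # Convert the nested defaultdicts back to plain dicts for the caller.
    return {language: dict(cats) for language, cats in all_forms.items()}
```

Passing the parser as an argument keeps the directory walk independent of how a single query is interpreted, which makes the sketch easy to exercise with a stand-in parser.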
- scribe_data.check.check_missing_forms.get_forms.parse_sparql_query(query_text: str) → dict
Parse a SPARQL query to extract lexical categories and features.
- Parameters:
- query_text : str
Content of the SPARQL query file.
- Returns:
- dict
Dictionary containing parsed information. Format: {language: {lexical_category: [forms]}}.
Notes
Extracts:
- Language QID
- Lexical category QID
- Grammatical features from OPTIONAL blocks
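The three extraction steps listed above can be sketched with regular expressions. This is a rough illustration under stated assumptions, not the function's actual implementation: the name extract_query_info is hypothetical, the query is assumed to use the dct:language, wikibase:lexicalCategory and wd: prefixes, every wd: QID inside an OPTIONAL block is assumed to be a grammatical feature, and OPTIONAL blocks are assumed not to contain nested braces.

```python
import re


def extract_query_info(query_text):
    """Return {language_qid: {category_qid: [[feature QIDs per block]]}}.

    Hypothetical helper; the output shape is an assumption made for
    this sketch.
    """
    language = re.search(r"dct:language\s+wd:(Q\d+)", query_text)
    category = re.search(r"wikibase:lexicalCategory\s+wd:(Q\d+)", query_text)
    if not (language and category):
        return {}

    forms = []
    # Each OPTIONAL { ... } block is assumed to describe one form.
    for block in re.findall(r"OPTIONAL\s*\{([^}]*)\}", query_text):
        features = re.findall(r"wd:(Q\d+)", block)
        if features:
            forms.append(features)

    return {language.group(1): {category.group(1): forms}}


example_query = """
SELECT ?lexeme ?plural WHERE {
  ?lexeme dct:language wd:Q1860 ;
    wikibase:lexicalCategory wd:Q1084 .
  OPTIONAL {
    ?lexeme ontolex:lexicalForm ?pluralForm .
    ?pluralForm ontolex:representation ?plural ;
      wikibase:grammaticalFeature wd:Q146786 .
  }
}
"""

print(extract_query_info(example_query))
# {'Q1860': {'Q1084': [['Q146786']]}}
```

A regex approach like this only works while the query files follow a consistent layout; a full SPARQL parser would be needed for arbitrary queries.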