check_query_forms.py

Check the queries within Scribe-Data to make sure the accessed forms are correct.

Examples

>>> python3 src/scribe_data/check/check_query_forms.py
scribe_data.check.check_query_forms.extract_forms_from_sparql(file_path: Path) -> str

Extract the form labels and their QIDs from a SPARQL query file based on the provided pattern.

Parameters:
file_path : Path

The path to the SPARQL query file from which to extract forms.

Returns:
dict

The file path with form labels of the query and their respective QIDs.

Raises:
FileNotFoundError

If the specified file does not exist.

scribe_data.check.check_query_forms.extract_form_rep_label(form_text: str)

Extract the representation label from an optional query form.

Parameters:
form_text : str

The text that defines the form within the query.

Returns:
str

The label of the form representation.
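
As a rough illustration of what extracting a representation label can involve, the sketch below pulls the variable bound by `ontolex:representation` out of an OPTIONAL block. The sample form text, the regular expression and the helper name `sketch_rep_label` are assumptions made for this example and are not taken from the module's implementation.

```python
import re

# Hypothetical OPTIONAL block, written only for this example.
SAMPLE_FORM = """
OPTIONAL {
    ?lexeme ontolex:lexicalForm ?nominativeSingularForm .
    ?nominativeSingularForm ontolex:representation ?nominativeSingular ;
        wikibase:grammaticalFeature wd:Q131105, wd:Q110786 .
}
"""


def sketch_rep_label(form_text: str):
    """Return the variable name bound by ontolex:representation, or None."""
    match = re.search(r"ontolex:representation\s+\?(\w+)", form_text)
    return match.group(1) if match else None


print(sketch_rep_label(SAMPLE_FORM))
```

A form text without an `ontolex:representation` triple yields `None` in this sketch; the real function may handle that case differently.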

scribe_data.check.check_query_forms.decompose_label_features(label)

Decompose a concatenated grammatical label into a list of individual features.

Parameters:
label : str

The concatenated label string composed of several grammatical features.

Returns:
list

A list of grammatical features extracted from the label in their original order.
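
One simple way to picture the decomposition is to split a camelCase label at its capital letters. The sketch below does only that; it assumes labels are plain camelCase strings, and the helper name `sketch_decompose` is hypothetical. The module itself may rely on its form metadata rather than on letter case.

```python
import re


def sketch_decompose(label: str) -> list:
    """Split a camelCase label into its parts, keeping the original order."""
    # A part is either a leading lowercase run or a capital followed by lowercase letters.
    return re.findall(r"[a-z]+|[A-Z][a-z]*", label)


print(sketch_decompose("nominativeSingular"))
print(sketch_decompose("firstPersonSingular"))
```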

scribe_data.check.check_query_forms.extract_form_qids(form_text: str)

Extract all QIDs from an optional query form.

Parameters:
form_text : str

The text that defines the form within the query.

Returns:
list[str]

All QIDs that make up the form.
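
Collecting the QIDs of a form can be sketched with a single regular expression over the `wd:` references in the block. The sample text and the helper name `sketch_form_qids` are assumptions for illustration, and the pattern used by the module may differ.

```python
import re

# Hypothetical OPTIONAL block, written only for this example.
SAMPLE_FORM = """
OPTIONAL {
    ?lexeme ontolex:lexicalForm ?nominativeSingularForm .
    ?nominativeSingularForm ontolex:representation ?nominativeSingular ;
        wikibase:grammaticalFeature wd:Q131105, wd:Q110786 .
}
"""


def sketch_form_qids(form_text: str) -> list:
    """Return every QID referenced with the wd: prefix, in order of appearance."""
    return re.findall(r"wd:(Q\d+)", form_text)


print(sketch_form_qids(SAMPLE_FORM))
```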

scribe_data.check.check_query_forms.check_form_label(form_text: str)

Check that the label of the form matches the representation label.

Parameters:
form_text : str

The text that defines the form within the query.

Returns:
bool

Whether the form and its current representation label match (repForm and rep).
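
The "repForm and rep" convention can be illustrated as follows: the form variable is expected to be the representation variable with a `Form` suffix. This is a minimal sketch under that reading; the regular expressions, sample texts and the name `sketch_check_form_label` are assumptions, not the module's code.

```python
import re

# Hypothetical blocks: the first follows the naming convention, the second does not.
GOOD_FORM = """
OPTIONAL {
    ?lexeme ontolex:lexicalForm ?pluralForm .
    ?pluralForm ontolex:representation ?plural ;
        wikibase:grammaticalFeature wd:Q146786 .
}
"""
BAD_FORM = GOOD_FORM.replace("?plural ;", "?singular ;")


def sketch_check_form_label(form_text: str) -> bool:
    """Return True when the form variable equals the representation variable plus 'Form'."""
    form = re.search(r"ontolex:lexicalForm\s+\?(\w+)", form_text)
    rep = re.search(r"ontolex:representation\s+\?(\w+)", form_text)
    if form is None or rep is None:
        return False
    return form.group(1) == rep.group(1) + "Form"


print(sketch_check_form_label(GOOD_FORM), sketch_check_form_label(BAD_FORM))
```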

scribe_data.check.check_query_forms.check_query_formatting(form_text: str)

Check the formatting of the given SPARQL query text for common formatting issues.

Parameters:
form_text : str

The SPARQL query text to check.

Returns:
bool

Whether there are formatting errors with the query.

scribe_data.check.check_query_forms.return_correct_form_label(qids: list)

Return the correct label for a lexeme form representation given the QIDs that compose it.

Parameters:
qids : list[str]

All QIDs that make up the form.

Returns:
str

The label for the representation given the QIDs.

scribe_data.check.check_query_forms.validate_forms(query_text: str) -> str

Validate a SPARQL query by checking the variable order in the SELECT and WHERE clauses, the presence and correct definition of forms, the form labels and representations, and the query formatting.

Parameters:
query_text : str

The SPARQL query text as a string.

Returns:
str

Error message if there are any issues with the order of variables or forms, otherwise an empty string.

scribe_data.check.check_query_forms.check_docstring(query_text: str) -> bool

Check the docstring of a SPARQL query text to ensure it follows the standard format.

Parameters:
query_text : str

The SPARQL query’s text to be checked.

Returns:
bool

True if the docstring is correctly formatted.

scribe_data.check.check_query_forms.check_forms_order(query_text)

Parse and order variable names from a SPARQL query text based on lexeme_form_metadata.json.

Parameters:
query_text : str

The SPARQL query text containing the SELECT statement with variables.

Returns:
list or bool

A sorted list of variables if the ordering differs from the original, otherwise a boolean indicating that the order matches.
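
The "list or bool" return shape can be sketched like this: read the variables from the SELECT clause, sort them against an expected order, and return the sorted list only when it differs. Here a plain list stands in for the ordering derived from lexeme_form_metadata.json; that stand-in, the sample queries and the name `sketch_check_order` are assumptions for this example.

```python
import re


def sketch_check_order(query_text: str, expected_order: list):
    """Return True if SELECT variables are in the expected order, else the sorted list."""
    select = re.search(r"SELECT\s+(.*?)\s+WHERE", query_text, flags=re.DOTALL)
    if select is None:
        return True  # nothing to compare in this sketch
    variables = re.findall(r"\?(\w+)", select.group(1))
    rank = {name: i for i, name in enumerate(expected_order)}
    # Unknown variables sort last; sorted() is stable, so they keep their relative order.
    ordered = sorted(variables, key=lambda v: rank.get(v, len(rank)))
    return True if ordered == variables else ordered


# Hypothetical expected order standing in for the metadata-derived ordering.
EXPECTED = ["lexemeID", "singular", "plural"]

print(sketch_check_order("SELECT ?lexemeID ?singular ?plural WHERE { }", EXPECTED))
print(sketch_check_order("SELECT ?lexemeID ?plural ?singular WHERE { }", EXPECTED))
```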

scribe_data.check.check_query_forms.check_optional_qid_order(query_file: str) -> str

Check the order of QIDs in optional statements within a SPARQL query file to ensure they align with the expected sequence based on label features.

Parameters:
query_file : str

The path to the SPARQL query file to be checked.

Returns:
str

A formatted string with details on any order mismatches in the QIDs, or an empty string if all QIDs are correctly ordered.

scribe_data.check.check_query_forms.check_query_forms() -> None

Validate SPARQL queries in the language data directory to check for correct form QIDs and formatting.