
Wikidata and Wiktionary language data extraction

Installation

Scribe-Data is available for installation via uv or pip:

# Using uv (recommended - fast, Rust-based installer):
uv pip install scribe-data

# Or using pip:
pip install scribe-data

The latest development version can also be installed from the source code on GitHub:

# With uv (recommended):
uv sync --all-extras  # install all dependencies
source .venv/bin/activate  # activate venv (macOS/Linux)
# .venv\Scripts\activate  # activate venv (Windows)

# Or with pip:
python -m venv .venv  # create virtual environment
source .venv/bin/activate  # activate venv (macOS/Linux)
# .venv\Scripts\activate  # activate venv (Windows)
pip install -e .
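
After either installation route, it can be useful to confirm that the package is visible to the active Python environment. The following is a minimal sketch using only the standard library. It assumes the distribution is registered under the name `scribe-data` (the name used in the install commands above); the helper `installed_version` is a hypothetical name introduced here for illustration and is not part of Scribe-Data.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "scribe-data") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    The default name "scribe-data" is an assumption taken from the
    `pip install scribe-data` command above.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in the active environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("scribe-data is not installed in this environment")
    else:
        print(f"scribe-data {found} is installed")
```

If the check reports that the package is missing, the most likely cause is that the virtual environment was not activated before installing or before running the check.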

To use the Scribe-Data CLI, run variations of the following commands in your terminal:

scribe-data -h  # view the CLI options
scribe-data [command] [arguments]

Available Commands

list (l): Enumerate available languages, data types and their combinations.
get (g): Retrieve data from Wikidata for specified languages and data types.
total (t): Display the total available data for given languages and data types.
convert (c): Transform data returned by Scribe-Data into different file formats.
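
The commands listed above can also be explored from a script. The sketch below builds a help invocation for each command and runs it only if the `scribe-data` executable is on the `PATH`. It assumes that each subcommand accepts `-h` in the same way the top-level `scribe-data -h` does, which is not confirmed here; the helper names `help_args` and `show_help` are hypothetical and not part of Scribe-Data.

```python
import shutil
import subprocess
from typing import List, Optional

# Commands and their short aliases, as listed above.
COMMANDS = {"list": "l", "get": "g", "total": "t", "convert": "c"}


def help_args(command: str) -> List[str]:
    """Build the argument list that asks one command for its help text.

    Assumption: each subcommand accepts -h like the top-level CLI does.
    """
    if command not in COMMANDS and command not in COMMANDS.values():
        raise ValueError(f"unknown Scribe-Data command: {command!r}")
    return ["scribe-data", command, "-h"]


def show_help(command: str) -> Optional[str]:
    """Return the command's help output, or None if scribe-data is not on PATH."""
    if shutil.which("scribe-data") is None:
        return None
    result = subprocess.run(
        help_args(command), capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    for name in COMMANDS:
        output = show_help(name)
        if output is None:
            print(f"scribe-data not found on PATH; skipped '{name}'")
        else:
            print(output)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems and keeps the invocation identical across macOS, Linux and Windows.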

Contents

Scribe-Data
- cli/
- utils.py
- check/
- load/
- unicode/
- wikidata/
