wiktionary_dump.py


Functions for downloading Wiktionary dumps.

scribe_data.cli.download.wiktionary_dump.download_wiktionary_dumps(output_dir: Path = PosixPath('scribe_data_wiktionary_dumps_export'), language_isos: list[str] = ['en'], dump_snapshot: str | None = 'latest') -> Path | None

Download the Wiktionary pages-articles dump (the latest snapshot by default) for the given language ISO codes.

Parameters:
output_dir : Path, optional, default=DEFAULT_WIKTIONARY_DUMP_EXPORT_DIR

Directory to save the dump. Defaults to DEFAULT_WIKTIONARY_DUMP_EXPORT_DIR.

language_isos : List[str], optional, default=['en']

A list of two-letter ISO language codes (e.g. 'en') for the desired Wiktionary dumps.

dump_snapshot : str, optional, default='latest'

The Wiktionary dump snapshot to be downloaded.

Returns:
Path | None

Path to the downloaded file, or None if aborted/failed.
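
Example:
A minimal usage sketch, assuming the package is installed and the function is importable from the module path shown in the signature above. The wrapper name fetch_wiktionary_dump and the directory name wiktionary_dumps are hypothetical and only illustrate how the documented parameters and the Path | None return value might be handled; the printed messages are not output produced by the library.

```python
from __future__ import annotations

from pathlib import Path


def fetch_wiktionary_dump(
    language_isos: list[str] | None = None,
    output_dir: Path = Path("wiktionary_dumps"),  # hypothetical directory name
    dump_snapshot: str = "latest",
) -> Path | None:
    """Hypothetical wrapper around download_wiktionary_dumps."""
    # Imported lazily so this sketch can be loaded without scribe_data installed.
    from scribe_data.cli.download.wiktionary_dump import download_wiktionary_dumps

    result = download_wiktionary_dumps(
        output_dir=output_dir,
        language_isos=language_isos or ["en"],
        dump_snapshot=dump_snapshot,
    )

    # The documented return value is None when the download is aborted or fails.
    if result is None:
        print("Download was aborted or failed.")
    else:
        print(f"Dump saved to {result}")
    return result
```

Calling fetch_wiktionary_dump(["en"]) would start a real download, so the sketch only defines the wrapper and does not invoke it.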