wiktionary_dump.py
Functions for downloading Wiktionary dumps.
- scribe_data.cli.download.wiktionary_dump.download_wiktionary_dumps(output_dir: Path = PosixPath('scribe_data_wiktionary_dumps_export'), language_isos: list[str] = ['en'], dump_snapshot: str | None = 'latest') -> Path | None
Download a Wiktionary pages-articles dump (the latest snapshot by default) for the passed language ISO codes.
- Parameters:
- output_dir : Path, optional, default=DEFAULT_WIKTIONARY_DUMP_EXPORT_DIR
Directory to save the dump. Defaults to DEFAULT_WIKTIONARY_DUMP_EXPORT_DIR.
- language_isos : List[str], optional, default=['en']
A list of two-letter ISO language codes (ISO-2, for example 'en') for the desired Wiktionary dumps.
- dump_snapshot : str, optional, default='latest'
The Wiktionary dump snapshot to be downloaded.
- Returns:
- Path | None
Path to the downloaded file, or None if aborted/failed.
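A minimal usage sketch, assuming Scribe-Data is installed and the function behaves as documented above. The keyword arguments and their defaults are taken from the signature; `report_download` and `main` are hypothetical helpers introduced here only to show how the `Path | None` return value might be handled.

```python
from pathlib import Path
from typing import Optional


def report_download(result: Optional[Path]) -> str:
    """Turn the documented return value (Path or None) into a status message.

    Hypothetical helper for illustration; not part of Scribe-Data.
    """
    if result is None:
        # The documentation states None is returned when the download
        # is aborted or fails.
        return "Download aborted or failed."
    return f"Dump saved to {result}"


def main() -> None:
    # Imported inside the function so this sketch can be loaded even
    # when Scribe-Data is not installed (assumption: the import path
    # matches the module path shown in the signature above).
    from scribe_data.cli.download.wiktionary_dump import (
        download_wiktionary_dumps,
    )

    result = download_wiktionary_dumps(
        output_dir=Path("scribe_data_wiktionary_dumps_export"),
        language_isos=["en"],
        dump_snapshot="latest",
    )
    print(report_download(result))
```

Calling `main()` would start the download. Checking for `None` before using the returned path avoids errors when the download is aborted or fails.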