FAQ
Arie Joe edited this page Jun 3, 2026 · 1 revision
PyPI distribution:
wayback-machine-downloader
Import package:
wayback_downloader
CLI command:
wayback-machine-downloader
Under:
./websites/<backup-name>/
That is the default cleanup behavior. Use --keep if you want to preserve
.cdx.json and .downloaded.txt.
Yes:
python -m wayback_downloader --all-timestamps https://example.com
Yes:
python -m wayback_downloader --snapshot-at 20130101000000 https://example.com
This builds a best-effort composite snapshot from captures at or before that timestamp.
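One way to picture "at or before that timestamp" is the selection rule below. It is a minimal sketch, not the downloader's actual code: the function name `composite_snapshot` and the `(timestamp, url)` tuple shape are assumptions made for illustration. It relies on Wayback timestamps being fixed-width 14-digit strings, so plain string comparison orders them chronologically.

```python
from typing import Dict, Iterable, Tuple


def composite_snapshot(
    captures: Iterable[Tuple[str, str]], cutoff: str
) -> Dict[str, str]:
    """Map each URL to its newest capture timestamp at or before `cutoff`.

    Hypothetical helper: `captures` yields (timestamp, url) pairs with
    14-digit timestamps such as "20130101000000".
    """
    chosen: Dict[str, str] = {}
    for timestamp, url in captures:
        if timestamp > cutoff:
            continue  # captured after the requested point in time
        if url not in chosen or timestamp > chosen[url]:
            chosen[url] = timestamp  # keep the latest eligible capture
    return chosen


captures = [
    ("20120601000000", "https://example.com/"),
    ("20121215000000", "https://example.com/"),
    ("20140101000000", "https://example.com/"),
    ("20110101000000", "https://example.com/style.css"),
]
snapshot = composite_snapshot(captures, "20130101000000")
```

A URL whose only captures are later than the cutoff is simply absent from the result, which is why the snapshot is best-effort rather than complete.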
No runtime dependencies are currently declared.
No. The current unit suite is intentionally offline and uses fake transports.
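For readers unfamiliar with the pattern, an offline test with a fake transport might look roughly like this. Everything here is hypothetical: `FakeTransport`, `list_captures`, and the `cdx://` key are illustrative names, not part of the project's test suite.

```python
import json


class FakeTransport:
    """Hypothetical stand-in for a network client: serves canned bytes."""

    def __init__(self, responses):
        self.responses = responses  # maps URL -> response body (bytes)
        self.requested = []         # records every URL that was asked for

    def get(self, url):
        self.requested.append(url)
        return self.responses[url]


def list_captures(transport, cdx_url):
    """Hypothetical code under test: parse a JSON capture listing."""
    return json.loads(transport.get(cdx_url).decode("utf-8"))


fake = FakeTransport(
    {"cdx://example": b'[["20130101000000", "https://example.com/"]]'}
)
rows = list_captures(fake, "cdx://example")
```

Because the transport is injected, the test never opens a socket and can also assert on exactly which URLs were requested.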
Not by itself. --local rewrites saved files. Pair it with --page-requisites
to fetch additional page assets.
--recursive-subdomains:
- mirrors first-party subdomains of the base domain
--cross-host:
- allows page-requisite asset discovery to queue arbitrary other hosts
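The difference between the two flags can be sketched as a host-scope check. This is an assumption-laden illustration, not the tool's implementation: the function `host_in_scope` and its parameters, including `is_asset`, are hypothetical names chosen to mirror the flag descriptions above.

```python
from urllib.parse import urlsplit


def host_in_scope(url, base_domain, recursive_subdomains=False,
                  cross_host=False, is_asset=False):
    """Hypothetical scope rule mirroring the two flags."""
    host = (urlsplit(url).hostname or "").lower()
    base = base_domain.lower()
    if not host:
        return False  # no usable host in the URL
    if host == base:
        return True   # the base domain itself is always in scope
    if recursive_subdomains and host.endswith("." + base):
        return True   # first-party subdomain, e.g. blog.example.com
    # Any other host is allowed only for page-requisite assets,
    # and only when cross-host discovery is enabled.
    return cross_host and is_asset
```

Matching on `"." + base` rather than a bare suffix keeps look-alike hosts such as `notexample.com` from being treated as subdomains of `example.com`.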
Because archived URLs can contain:
- query strings
- invalid Windows characters
- repeated percent encoding
- directory-like paths
The downloader normalizes them into stable local files that can be resumed and rewritten consistently.
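As a rough illustration of what such normalization involves, the sketch below handles each of the four cases in turn. It is not the downloader's real mapping: `local_path`, the `__<hash>` suffix for query strings, and the `index.html` fallback are all assumptions made for this example.

```python
import hashlib
import re
from urllib.parse import unquote, urlsplit

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID = re.compile(r'[<>:"\\|?*\x00-\x1f]')


def local_path(url: str) -> str:
    """Hypothetical mapping from an archived URL to a relative file path."""
    parts = urlsplit(url)
    path = parts.path
    # Repeated percent encoding: decode until stable (bounded), so that
    # "%2520" and "%20" end up as the same local name.
    for _ in range(5):
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded
    is_dir = path == "" or path.endswith("/")
    # Drop empty and dot segments so the result stays inside the backup.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    segments = [_INVALID.sub("_", s) for s in segments]
    # Directory-like paths need a concrete file name.
    if is_dir or not segments:
        segments.append("index.html")
    # Query strings: fold them into the name as a short, stable hash.
    if parts.query:
        digest = hashlib.sha1(parts.query.encode("utf-8")).hexdigest()[:8]
        segments[-1] = f"{segments[-1]}__{digest}"
    return "/".join(segments)
```

The property that matters for resuming is determinism: the same URL always yields the same path, so a second run can tell what is already on disk, and link rewriting can compute the target without a lookup table.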
Yes. The repository already includes GitHub Actions for:
- CI
- TestPyPI
- PyPI release publishing