FAQ

What is the package name?

PyPI distribution:

wayback-machine-downloader

Import package:

wayback_downloader

CLI command:

wayback-machine-downloader

Where are files written by default?

Under:

./websites/<backup-name>/

Why do state files disappear after a successful run?

That is the default cleanup behavior. Use `--keep` if you want to preserve `.cdx.json` and `.downloaded.txt`.

Can it download every archived version of a file?

Yes:

python -m wayback_downloader --all-timestamps https://example.com

Can it reconstruct a site at one specific date?

Yes:

python -m wayback_downloader --snapshot-at 20130101000000 https://example.com

This builds a best-effort composite snapshot from captures at or before that timestamp.
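
The "at or before" rule can be illustrated with a minimal sketch. This is not the package's implementation: the function name `pick_composite` and the `(url, timestamp)` input shape are assumptions made for illustration. For each URL, it keeps the newest capture whose 14-digit timestamp does not exceed the requested one.

```python
def pick_composite(captures, snapshot_at):
    """Return {url: timestamp}, keeping the newest capture at or before snapshot_at.

    captures: iterable of (url, timestamp) pairs, where timestamp is a
    14-digit string (YYYYMMDDhhmmss). This input shape is hypothetical.
    """
    chosen = {}
    for url, ts in captures:
        if ts > snapshot_at:
            continue  # captured after the requested date, so skip it
        # Fixed-width digit strings order correctly under string comparison.
        if url not in chosen or ts > chosen[url]:
            chosen[url] = ts
    return chosen


captures = [
    ("https://example.com/", "20120601000000"),
    ("https://example.com/", "20121215120000"),
    ("https://example.com/", "20130301000000"),
    ("https://example.com/style.css", "20110101000000"),
    ("https://example.com/new.js", "20140101000000"),
]
composite = pick_composite(captures, "20130101000000")
```

In this example the home page resolves to its December 2012 capture, `style.css` to its only (2011) capture, and `new.js` is left out because it was first captured after the target date. That last case is why the result is best-effort rather than a guaranteed complete site.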

Does it require third-party runtime dependencies?

No. The package currently declares no runtime dependencies.

Does the test suite hit `web.archive.org`?

No. The current unit suite is intentionally offline and uses fake transports.

Does `--local` download missing assets?

Not by itself. `--local` rewrites saved files. Pair it with `--page-requisites` to fetch additional page assets.

What is the difference between `--recursive-subdomains` and `--cross-host`?

`--recursive-subdomains` mirrors first-party subdomains of the base domain.

`--cross-host` allows page-requisite asset discovery to queue arbitrary other hosts.
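
One way to read the two flags is as separate host-scoping rules. The sketch below is only an interpretation of the descriptions above, not the downloader's actual logic: `may_queue`, `is_subdomain`, and the `is_asset` parameter are hypothetical names introduced for illustration.

```python
from urllib.parse import urlsplit


def is_subdomain(host, base_domain):
    """True if host is a strict subdomain of base_domain."""
    return host.endswith("." + base_domain)


def may_queue(url, base_domain, recursive_subdomains=False,
              cross_host=False, is_asset=False):
    """Decide whether a discovered URL may be queued (assumed policy)."""
    host = (urlsplit(url).hostname or "").lower()
    base = base_domain.lower()
    if host == base:
        return True  # the base host itself is always in scope
    if cross_host and is_asset:
        return True  # page requisites may come from any host
    if is_subdomain(host, base):
        return recursive_subdomains  # first-party subdomains are opt-in
    return False
```

Under this reading, `--recursive-subdomains` widens what counts as the site being mirrored, while `--cross-host` only widens where page assets may be fetched from. A page on an unrelated host stays out of scope either way.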

Why are some filenames sanitized or hashed?

Because archived URLs can contain:

query strings
invalid Windows characters
repeated percent encoding
directory-like paths

The downloader normalizes them into stable local files that can be resumed and rewritten consistently.
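
A minimal sketch of this kind of normalization follows. It is not the downloader's real scheme: the function name `local_path`, the SHA-1 digest truncated to 8 characters, the `index.html` fallback, and the `_` replacement character are all assumptions chosen for illustration.

```python
import hashlib
import re
from urllib.parse import unquote, urlsplit

# Characters Windows rejects in file names, plus ASCII control characters.
WINDOWS_INVALID = re.compile(r'[<>:"\\|?*\x00-\x1f]')


def local_path(url):
    """Map an archived URL to a stable relative file path (illustrative)."""
    parts = urlsplit(url)
    path = parts.path
    # Undo repeated percent-encoding (%2520 -> %20 -> space), with a bound.
    for _ in range(5):
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded
    # Drop empty and dot segments, then replace characters Windows rejects.
    segments = [WINDOWS_INVALID.sub("_", s)
                for s in path.split("/") if s not in ("", ".", "..")]
    # Directory-like paths need a file name to hold the page body.
    if not segments or path.endswith("/"):
        segments.append("index.html")
    if parts.query:
        # Hash the query so distinct query strings map to distinct files.
        digest = hashlib.sha1(parts.query.encode("utf-8")).hexdigest()[:8]
        stem, dot, ext = segments[-1].rpartition(".")
        if dot and stem:
            segments[-1] = f"{stem}.{digest}.{ext}"
        else:
            segments[-1] = f"{segments[-1]}.{digest}"
    return "/".join(segments)
```

Because the mapping depends only on the URL, a resumed run computes the same path for the same capture, and link rewriting can compute the target path without looking at the disk.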

Can I publish releases automatically?

Yes. The repository already includes GitHub Actions for:

CI
TestPyPI
PyPI release publishing

See Automation and Release.
