
Add 'wl create --stream <url>' to torrentify a remote origin #10

Merged: iksnerd merged 1 commit into main from feat/create-stream on Jun 13, 2026


@iksnerd (Owner) commented Jun 13, 2026

The breadth track from the dataset strategy (follow-up to #9): register a remote artifact — a Hugging Face / Kaggle / any http(s) URL — without downloading it to disk. The body is streamed once to compute the hybrid v1+v2 hashes, and the origin URL is carried as a BEP 19 web seed so clients fetch from it. This is what lets us register many artifacts cheaply (no storage per item, no persistent seeder).

Changes

  • torrent: refactored hashing so the core loop runs from an io.Reader + known size (hashReaderHybrid); split out populateSingleFileInfo and finalize so the on-disk and streaming paths share assembly. New CreateStream(opts, r, size, name). No behavior change to Create — golden vectors still pass.
  • wl create: --stream <url> (mutually exclusive with a path arg). Fetches the URL, requires a known Content-Length, streams to CreateStream, auto-adds the origin as a web seed. Name derived from the URL path or --name.

Verification

  • TestCreateStream asserts that the streamed hashes equal the Transmission-verified golden vectors (proving streaming-hash == file-hash) and that the origin is recorded as a web seed. It also checks that a bad size or name is rejected.
  • End-to-end against a local HTTP server: wl create --stream http://…/data.bin produced v1 cc1614c7… / v2 db4ca38f… with no local copy, and transmission-show lists the web seed.
  • go build / go vet / gofmt clean; full go test ./... green.

Constraints / follow-up

  • Requires the origin to report Content-Length (chunked responses are rejected rather than buffered).
  • Single-file only for now (one URL → one file). Multi-file remote sets are a later follow-up.
  • wl get consumer-side web-seed download remains the other tracked follow-up.

🤖 Generated with Claude Code

The breadth track from the dataset strategy: register a remote artifact
(Hugging Face / Kaggle / any http(s) URL) without downloading it to disk. The
body is streamed once to compute the hybrid v1+v2 hashes, and the origin URL is
carried as a BEP 19 web seed so clients can fetch from it.

- torrent: refactor hashing so the core loop works from an io.Reader + known
  size (hashReaderHybrid); split out populateSingleFileInfo and finalize so the
  on-disk and streaming paths share assembly. New CreateStream(opts, r, size,
  name). No behavior change to Create — golden vectors still pass.
- wl create: --stream <url> (mutually exclusive with a path arg). Fetches the
  URL, requires a known Content-Length, streams to CreateStream, auto-adds the
  origin as a web seed. Name derived from the URL path or --name.

Tests: TestCreateStream asserts the streamed hashes equal the Transmission-
verified golden vectors (streaming-hash == file-hash) and that the origin is
recorded as a web seed; bad size/name rejected. Verified end-to-end against a
local HTTP server — wl create --stream produced v1 cc1614c7/v2 db4ca38f with no
local copy, and transmission-show lists the web seed.
@iksnerd merged commit 34905a7 into main on Jun 13, 2026
1 check passed
@iksnerd deleted the feat/create-stream branch on June 13, 2026, 13:23
iksnerd added a commit that referenced this pull request on Jun 21, 2026: Add 'wl create --stream <url>' to torrentify a remote origin