Add 'wl create --stream <url>' to torrentify a remote origin #10
Merged
The breadth track from the dataset strategy: register a remote artifact (Hugging Face / Kaggle / any http(s) URL) without downloading it to disk. The body is streamed once to compute the hybrid v1+v2 hashes, and the origin URL is carried as a BEP 19 web seed so clients can fetch from it.

- torrent: refactor hashing so the core loop works from an io.Reader + known size (hashReaderHybrid); split out populateSingleFileInfo and finalize so the on-disk and streaming paths share assembly. New CreateStream(opts, r, size, name). No behavior change to Create; golden vectors still pass.
- wl create: --stream <url> (mutually exclusive with a path arg). Fetches the URL, requires a known Content-Length, streams to CreateStream, auto-adds the origin as a web seed. Name derived from the URL path or --name.

Tests: TestCreateStream asserts the streamed hashes equal the Transmission-verified golden vectors (streaming-hash == file-hash) and that the origin is recorded as a web seed; bad size/name rejected.

Verified end-to-end against a local HTTP server: wl create --stream produced v1 cc1614c7 / v2 db4ca38f with no local copy, and transmission-show lists the web seed.
iksnerd added a commit that referenced this pull request on Jun 21, 2026: Add 'wl create --stream <url>' to torrentify a remote origin
The breadth track from the dataset strategy (follow-up to #9): register a remote artifact (a Hugging Face / Kaggle / any `http(s)` URL) without downloading it to disk. The body is streamed once to compute the hybrid v1+v2 hashes, and the origin URL is carried as a BEP 19 web seed so clients fetch from it. This is what lets us register many artifacts cheaply (no storage per item, no persistent seeder).

Changes
- torrent: refactor hashing so the core loop works from an `io.Reader` + known size (`hashReaderHybrid`); split out `populateSingleFileInfo` and `finalize` so the on-disk and streaming paths share assembly. New `CreateStream(opts, r, size, name)`. No behavior change to `Create`; golden vectors still pass.
- wl create: `--stream <url>` (mutually exclusive with a path arg). Fetches the URL, requires a known `Content-Length`, streams to `CreateStream`, auto-adds the origin as a web seed. Name derived from the URL path or `--name`.

Verification
- `TestCreateStream` asserts the streamed hashes equal the Transmission-verified golden vectors (proving streaming-hash == file-hash) and that the origin is recorded as a web seed; rejects bad size/name.
- End-to-end against a local HTTP server: `wl create --stream http://…/data.bin` produced v1 `cc1614c7…` / v2 `db4ca38f…` with no local copy, and `transmission-show` lists the web seed.
- `go build` / `go vet` / `gofmt` clean; full `go test ./...` green.

Constraints / follow-up
- Requires a known `Content-Length` (chunked responses are rejected rather than buffered).
- `wl get` consumer-side web-seed download remains the other tracked follow-up.

🤖 Generated with Claude Code