wasm-runner: add parallel_map/4 for concurrent collection processing #13

Open
yevbar wants to merge 8 commits into master from sleepy/parallel-map

yevbar commented Feb 28, 2026

By Sleepy

Adds WasmRunner.parallel_map/4 and parallel_map!/4, concurrent versions of map/4 that distribute work across multiple WASM instances using Task.async_stream.

Key features:

  • Compiles the WASM module once, shares across N instances
  • Preserves input order in output
  • Concurrency auto-capped to collection size
  • Defaults to System.schedulers_online() parallel instances
  • Works with precompiled modules for zero-compilation overhead
  • Extracts start_parallel_instances/3 helper (reusable by run_concurrent/4)

When to use:

  • map/4 → small collections or fast functions (avoids instance overhead)
  • parallel_map/4 → large collections with expensive per-element work (crypto, sorting, recursive algorithms)

Tests: 13 tests covering ordering, edge cases, precompiled modules, and consistency with sequential map/4.
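For readers unfamiliar with Task.async_stream, the ordering and concurrency-cap behaviour listed under "Key features" could be sketched roughly as follows. This is an illustration only, not the code in this PR: the `ParallelMapSketch` module, its `call/2` stand-in for a WASM export, and the `:concurrency` option name are all assumptions.

```elixir
defmodule ParallelMapSketch do
  # Hypothetical stand-in for invoking a WASM export on one instance.
  defp call(:square, [n]), do: n * n

  # Illustrative only: the option name :concurrency is an assumption.
  def parallel_map(collection, fun_name, opts \\ []) do
    items = Enum.to_list(collection)
    requested = Keyword.get(opts, :concurrency, System.schedulers_online())

    # Cap concurrency at the collection size; max_concurrency must be >= 1.
    concurrency = requested |> min(length(items)) |> max(1)

    items
    |> Task.async_stream(fn item -> call(fun_name, [item]) end,
      max_concurrency: concurrency,
      ordered: true
    )
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

IO.inspect(ParallelMapSketch.parallel_map(1..5, :square))
# prints [1, 4, 9, 16, 25]
```

With `ordered: true` (the default for Task.async_stream), results come back in input order even when later elements finish first, which is the property the ordering tests would be checking.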

HDR Agent and others added 8 commits February 28, 2026 20:58
Adds the missing WasmRunner.run/2 and run!/2 functions for ergonomic
one-shot WASM execution with a keyword-list calling convention:

  {:ok, 8} = WasmRunner.run("math.wasm", add: [5, 3])
  {:ok, [8, 55]} = WasmRunner.run("math.wasm", add: [5, 3], fibonacci: [10])
  8 = WasmRunner.run!("math.wasm", add: [5, 3])

Features:
- Single call returns unwrapped value: {:ok, 8} not {:ok, [8]}
- Multiple calls return list: {:ok, [8, 55]}
- Options (wasi, cache) mixed into keyword list
- Works with precompiled modules for 53x speedup
- Auto WASI detection for Go modules
- Instance always cleaned up (even on error)

Performance characteristics (from benchmark):
- Cold run/2: ~2ms (dominated by WASM compilation)
- Precompiled run/2: ~40μs (53x faster)
- Pre-loaded call_single: ~15μs (call overhead only)
- WASM fibonacci(30) is 5x faster than pure Elixir

Includes:
- 22 tests covering single/multi calls, errors, cleanup, precompiled
- Benchmark: bench/wasm_runner_run.exs
- Performance docs in docs/PERFORMANCE_GUIDE.md

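The keyword-list convention described in this commit (options mixed in with calls, a single result unwrapped) might be implemented along these lines. This is a sketch under assumptions: `RunSketch` and its `call/2` stand-ins are hypothetical, and the option keys are taken only from the "Options (wasi, cache)" bullet.

```elixir
defmodule RunSketch do
  # Assumed option keys, taken from the "Options (wasi, cache)" bullet.
  @known_opts [:wasi, :cache]

  # Hypothetical stand-ins for exported WASM functions.
  defp call(:add, [a, b]), do: a + b
  defp call(:double, [n]), do: n * 2

  def run(keyword) do
    # Keyword.split/2 separates option entries from function calls.
    {_opts, calls} = Keyword.split(keyword, @known_opts)
    results = Enum.map(calls, fn {name, args} -> call(name, args) end)

    # A single call is unwrapped; several calls return a list.
    case results do
      [single] -> {:ok, single}
      many -> {:ok, many}
    end
  end
end

IO.inspect(RunSketch.run(add: [5, 3]))
# prints {:ok, 8}
IO.inspect(RunSketch.run(add: [5, 3], double: [4], cache: true))
# prints {:ok, [8, 8]}
```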
Adds Pool.call_many/2, call_many!/2, and call_many_unwrapped/2 for
executing multiple WASM function calls in a single checkout/checkin
cycle. This eliminates N-1 GenServer round-trips when batching calls.

Performance (fixtures/math.wasm, pool size 4):
- 5 calls:   1.27x faster than repeated Pool.call
- 20 calls:  1.18x faster
- 100 calls: 1.12x faster

Includes:
- 16 tests covering correctness, ordering, errors, concurrency
- Benchmark comparing call_many vs repeated call vs native Elixir
- Performance documentation in docs/POOL_CALL_MANY.md

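To make the "single checkout/checkin cycle" idea concrete, here is a toy pool that is not the Pool module from this commit. `PoolSketch`, its integer "instances", and the `invoke/3` stand-in are assumptions, and it deliberately ignores the empty-pool case.

```elixir
defmodule PoolSketch do
  use GenServer

  def start_link(size), do: GenServer.start_link(__MODULE__, size)

  # Repeated single calls: one checkout/checkin cycle per call.
  def call(pool, name, args) do
    with_instance(pool, fn inst -> invoke(inst, name, args) end)
  end

  # Batched calls: one checkout/checkin cycle for the whole list.
  def call_many(pool, calls) do
    with_instance(pool, fn inst ->
      Enum.map(calls, fn {name, args} -> invoke(inst, name, args) end)
    end)
  end

  defp with_instance(pool, fun) do
    inst = GenServer.call(pool, :checkout)

    try do
      {:ok, fun.(inst)}
    after
      # Return the instance even if a call raises.
      GenServer.cast(pool, {:checkin, inst})
    end
  end

  # Hypothetical stand-in for a WASM export on a checked-out instance.
  defp invoke(_inst, :add, [a, b]), do: a + b

  @impl true
  def init(size), do: {:ok, Enum.to_list(1..size)}

  # Simplification: crashes if the pool is empty instead of queueing.
  @impl true
  def handle_call(:checkout, _from, [inst | rest]), do: {:reply, inst, rest}

  @impl true
  def handle_cast({:checkin, inst}, free), do: {:noreply, [inst | free]}
end

{:ok, pool} = PoolSketch.start_link(4)
IO.inspect(PoolSketch.call_many(pool, [{:add, [5, 3]}, {:add, [1, 2]}]))
# prints {:ok, [8, 3]}
```

In this toy, N calls through `call/3` cost N checkouts, while `call_many/2` costs one. That is the saving the benchmark numbers above are measuring, and it shrinks relative to total time as the batch grows.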
Add WasmRunner.pipe/3 and pipe!/3 that chain WASM function calls,
feeding the output of each call as input to the next. The module is
loaded once, all calls execute on a single instance, and cleanup is
automatic.

Features:
- Output of each call becomes first arg of the next by default
- :pipe placeholder for explicit argument positioning
- Works with precompiled modules, WASI auto-detection
- Includes 18 tests covering chaining, placeholders, errors, cleanup
- Benchmark script comparing pipe vs separate run calls

Example:
  {:ok, 42} = WasmRunner.pipe("math.wasm", [
    {:add, [5, 3]},           # => 8
    {:fibonacci, []},          # => fibonacci(8) = 21
    {:multiply, [:pipe, 2]}    # => multiply(21, 2) = 42
  ])
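The chaining rule (previous result becomes the first argument unless a :pipe placeholder says otherwise) can be sketched with Enum.reduce_while. Everything here is illustrative: `PipeSketch` and its `call/2` stand-ins are hypothetical, and replacing every :pipe occurrence is an assumption about the placeholder semantics.

```elixir
defmodule PipeSketch do
  # Hypothetical stand-ins for WASM exports.
  defp call(:add, [a, b]), do: {:ok, a + b}
  defp call(:multiply, [a, b]), do: {:ok, a * b}
  defp call(name, args), do: {:error, {:unknown_call, name, args}}

  def pipe([{name, args} | rest]) do
    # The first step runs with its own arguments; nothing is piped in yet.
    with {:ok, first} <- call(name, args) do
      Enum.reduce_while(rest, {:ok, first}, fn {name, args}, {:ok, prev} ->
        case call(name, inject(args, prev)) do
          {:ok, value} -> {:cont, {:ok, value}}
          error -> {:halt, error}
        end
      end)
    end
  end

  # Explicit placeholder: replace each :pipe with the previous result.
  # Otherwise prepend the previous result as the first argument.
  defp inject(args, prev) do
    if :pipe in args do
      Enum.map(args, fn
        :pipe -> prev
        other -> other
      end)
    else
      [prev | args]
    end
  end
end

IO.inspect(PipeSketch.pipe([{:add, [5, 3]}, {:multiply, [:pipe, 2]}, {:add, [1]}]))
# prints {:ok, 17}
```

Halting on the first error keeps later steps from running with a missing input.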
- Add Firebird.call/4 accepting opts (timeout) forwarded to Runtime.call
- Add call_single/4 and call_single!/4 with optional timeout kwarg
- Thread :timeout through run/2, pipe/3, run_batch/3, run_concurrent/4
- Thread :timeout through benchmark/4 and benchmark_compare/5
- Add extract_call_opts/1 to cleanly separate call opts from start opts
- Add :timeout to @known_opts so run/2 keyword API recognizes it
- Add 12 tests covering timeout threading for all WasmRunner functions
- Register @wasm attribute with accumulate: true in wasm_modules
- Remove unused generate_expr/3 in wat_gen.ex (use /5 directly)
- Prefix unused module_name param in firebird.compile.ex

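Separating per-call options from instance start options, as extract_call_opts/1 is described as doing, is a natural fit for Keyword.split/2. The sketch below is a guess at the shape, not the function from this commit: the module name, the return tuple order, and the assumption that :timeout is the only call option are all mine.

```elixir
defmodule OptsSketch do
  # Assumption: :timeout is the only per-call option.
  @call_opts [:timeout]

  # Returns {call_opts, start_opts}; the tuple order is an assumption.
  def extract_call_opts(opts) do
    Keyword.split(opts, @call_opts)
  end
end

{call_opts, start_opts} =
  OptsSketch.extract_call_opts(wasi: true, timeout: 5_000, cache: false)

IO.inspect(call_opts)
# prints [timeout: 5000]
IO.inspect(start_opts)
# prints [wasi: true, cache: false]
```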
Add functional collection primitives to WasmRunner that load a WASM module
once, apply a function over an enumerable, and auto-cleanup the instance:

- map/4, map!/4: apply a WASM function to each element
- reduce/5, reduce!/5: fold a collection through a WASM binary op
- filter/4, filter!/4: keep elements where WASM predicate returns non-zero
- each/4: apply for side-effects only

Input normalization handles integers, tuples, and lists automatically.
Works with file paths, raw bytes, and precompiled modules.

Includes 34 tests and a Benchee benchmark (bench/wasm_collection.exs).

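The input normalization and the non-zero predicate rule for filter might look something like the following. This is a sketch, assuming each element is turned into an argument list before the call; `CollectionSketch` and its `call/2` stand-ins are hypothetical.

```elixir
defmodule CollectionSketch do
  # Turn each element into an argument list for the WASM call.
  def normalize(n) when is_integer(n), do: [n]
  def normalize(t) when is_tuple(t), do: Tuple.to_list(t)
  def normalize(l) when is_list(l), do: l

  # Hypothetical stand-ins for WASM exports; the predicate returns
  # 0 or 1, as a WASM i32 result would.
  defp call(:add, [a, b]), do: a + b
  defp call(:is_even, [n]), do: if(rem(n, 2) == 0, do: 1, else: 0)

  def map(items, name) do
    Enum.map(items, fn item -> call(name, normalize(item)) end)
  end

  # Keep elements for which the predicate returns non-zero.
  def filter(items, name) do
    Enum.filter(items, fn item -> call(name, normalize(item)) != 0 end)
  end
end

IO.inspect(CollectionSketch.map([{1, 2}, [3, 4]], :add))
# prints [3, 7]
IO.inspect(CollectionSketch.filter([1, 2, 3, 4], :is_even))
# prints [2, 4]
```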
Add WasmRunner.parallel_map/4 and parallel_map!/4 that distribute work
across multiple WASM instances using Task.async_stream. The module is
compiled once and shared across all instances for efficiency.

Key features:
- Preserves input order in output
- Concurrency auto-capped to collection size
- Defaults to System.schedulers_online() instances
- Works with precompiled modules for zero-compilation overhead
- Extracts start_parallel_instances/3 helper (shared with run_concurrent)

Includes 13 tests covering ordering, edge cases, precompiled modules,
and consistency with sequential map/4.