release: To Prod #1145

Merged
eskp merged 24 commits into prod from staging
May 6, 2026

suisuss commented May 6, 2026

Summary

Promote the following merged PRs from staging to prod:

Post-deploy verification

  • deploy-keeperhub workflow finishes green
  • curl -fsS https://app.keeperhub.com/api/health returns 200
  • Smoke-test the surfaces affected by the merged PRs above
  • Watch Sentry / logs for ~10 minutes after the rollout

Philotheephilix and others added 20 commits May 3, 2026 18:53

Adds Superfluid as a first-class KeeperHub protocol via the existing
defineProtocol() DSL. Exposes 15 declarative actions (11 writes + 4 reads)
covering streaming payments (CFA), pool-based distributions (GDA), and
SuperToken wrap/unwrap. No new plugin folder, no per-action step files,
no SDK dependency, no contracts deployed.

Surfaces:
- Constant-Flow Agreement: create-flow, update-flow, delete-flow,
  get-flow, get-net-flow
- General Distribution Agreement: create-pool, update-member-units,
  distribute, distribute-flow, connect-pool
- SuperToken: wrap, unwrap, grant-flow-operator,
  get-super-token-balance, get-underlying-token

Implementation:
- protocols/superfluid.ts -- declarative protocol with three contracts
  (cfaForwarder + gdaForwarder constant-address across all six chains
  via sameOnAllChains() helper; superToken with userSpecifiedAddress).
  Inline ABI fragments. SUPERFLUID_CHAIN_IDS, CFA_FORWARDER_ADDRESS,
  GDA_FORWARDER_ADDRESS exported as the single source of truth so
  adding/removing a chain is a one-line change.
- tests/unit/superfluid-protocol.test.ts -- 34 schema and integrity
  assertions: ABI parses, addresses match the regex, action slugs are
  unique kebab-case, every action.contract resolves, every
  action.function exists in the referenced contract's ABI.
- scripts/verify-superfluid-addresses.ts -- one-shot CLI that calls
  eth_getCode against each forwarder on each chain. Chain set is
  driven from SUPERFLUID_CHAIN_IDS joined with a local RPC metadata
  map; surfaces "Unknown chain" entries when the metadata is out of
  sync with the protocol declaration.
- scripts/e2e-superfluid-sepolia.ts -- live Sepolia lifecycle test
  (no mocks, no test framework). Walks the full 13-step flow:
  pre-flight -> approve -> wrap -> create-flow -> update-flow ->
  get-net-flow -> delete-flow -> create-pool -> update-member-units ->
  connect-pool -> distribute-flow -> read-net-flow x2 -> cleanup.
  Env-var driven; receiver-signed steps gracefully skip when no
  SUPERFLUID_E2E_RECEIVER_KEY is provided.
- specs/superfluid-protocol-plugin.md, specs/superfluid-protocol-plugin-plan.md
  capture the internal design and TDD implementation plan.
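The integrity assertions described for tests/unit/superfluid-protocol.test.ts can be pictured roughly as below. This is a minimal sketch, not the actual test file: the `ProtocolSketch` shape, the sample data, and `findIntegrityErrors` are hypothetical stand-ins, since the real defineProtocol() output is not shown in this PR.

```typescript
// Hypothetical, simplified shapes. The real defineProtocol() types may differ.
interface ContractSketch {
  abi: ReadonlyArray<{ type: string; name?: string }>;
}
interface ActionSketch {
  slug: string;
  contract: string;
  function: string;
}
interface ProtocolSketch {
  contracts: Record<string, ContractSketch>;
  actions: ActionSketch[];
}

const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Returns one message per violated invariant; an empty array means the
// declaration is internally consistent.
function findIntegrityErrors(protocol: ProtocolSketch): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  for (const action of protocol.actions) {
    if (!KEBAB_CASE.test(action.slug)) {
      errors.push(`not kebab-case: ${action.slug}`);
    }
    if (seen.has(action.slug)) {
      errors.push(`duplicate slug: ${action.slug}`);
    }
    seen.add(action.slug);

    const contract = protocol.contracts[action.contract];
    if (!contract) {
      errors.push(`unknown contract: ${action.contract}`);
      continue;
    }
    const hasFunction = contract.abi.some(
      (entry) => entry.type === "function" && entry.name === action.function,
    );
    if (!hasFunction) {
      errors.push(`${action.contract} has no function ${action.function}`);
    }
  }
  return errors;
}

// Illustrative data only; function names here are placeholders.
const sample: ProtocolSketch = {
  contracts: {
    cfaForwarder: { abi: [{ type: "function", name: "createFlow" }] },
  },
  actions: [
    { slug: "create-flow", contract: "cfaForwarder", function: "createFlow" },
    { slug: "Get_Flow", contract: "cfaForwarder", function: "missingFn" },
  ],
};
const sampleErrors = findIntegrityErrors(sample);
```

Collecting every violation instead of failing on the first one keeps the report useful when a rename breaks several actions at once.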

Verification:
- 34/34 schema tests pass; full unit suite (3454 tests across 185
  files) green with zero regressions.
- 13/13 live Sepolia lifecycle steps PASS via the E2E script against
  real CFA + GDA forwarders, no mocks.
- 8/12 (chain x forwarder) pairs directly verified via eth_getCode
  (Optimism, Base, Arbitrum One, Sepolia). The remaining 4 (Ethereum
  mainnet + Polygon) require keyed RPC in this environment; the
  constant address claim is supported by the 8 verified pairs and
  Superfluid's published deployment registry.
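The (chain x forwarder) bytecode check boils down to calling eth_getCode per pair and treating an empty result as "no contract here". The sketch below assumes an injected `getCode` transport so that no RPC URLs or client library are implied; `GetCode`, `PairResult`, `hasBytecode`, and `verifyPairs` are hypothetical names, not the script's actual API.

```typescript
// Hypothetical transport: performs eth_getCode(address, "latest") on a chain.
type GetCode = (chainId: number, address: string) => Promise<string>;

interface PairResult {
  chainId: number;
  address: string;
  deployed: boolean;
  error?: string;
}

// eth_getCode returns "0x" for an address that holds no contract code.
function hasBytecode(code: string): boolean {
  return /^0x[0-9a-fA-F]+$/.test(code);
}

// Checks every (chain, address) pair and records RPC failures per pair
// instead of aborting, so unreachable chains show up as unverified.
async function verifyPairs(
  chainIds: readonly number[],
  addresses: readonly string[],
  getCode: GetCode,
): Promise<PairResult[]> {
  const results: PairResult[] = [];
  for (const chainId of chainIds) {
    for (const address of addresses) {
      try {
        const code = await getCode(chainId, address);
        results.push({ chainId, address, deployed: hasBytecode(code) });
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        results.push({ chainId, address, deployed: false, error: message });
      }
    }
  }
  return results;
}
```

Keeping failed lookups in the result set mirrors the distinction drawn above between pairs that were verified and pairs that could not be reached.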

Why this matters for KeeperHub: every existing protocol in this
library is a discrete state-change primitive (lending, swapping,
staking, savings). A large category of high-value workflow automation
is fundamentally time-based -- payroll, vesting, royalty splits,
usage-based subscriptions, compute-coalition payments, insurance
premiums, DAO contributor pay. Superfluid moves time accounting
on-chain so a workflow can open a stream once and let value flow
continuously instead of waking up every N hours to issue transfers.
The workflow's job becomes deciding when to update the rate, which is
exactly what KeeperHub's triggers, retry, gas optimization, multi-step
composition, and per-org Para wallets are built for. Patterns this
unlocks (one-shot trigger -> infinite-period payment, event-driven
rate adjustment, pool-based pro-rata distribution) are not expressible
with any existing KeeperHub protocol.

Tested live on https://computepool.vercel.app where this plugin is the
streaming-payments backbone for a compute coalition that streams
fUSDCx-equivalent payments to GPU providers continuously while
inference workloads run, with member units adjusting in real time as
worker capacity changes.

Takeover of external contributor PR #1106 (feat: add Superfluid protocol).
Original commits preserved with author attribution. Follow-up fixes (KEEP-415)
will be pushed on top of this merge.

Original PR: #1106
Tracking: KEEP-415
…imeout

Follow-up to KEEP-418. KEEP-418 closed the unhandled-rejection path by
probing eth_subscribe before any block listener attaches; if the probe
fails, the workflow listener is logged-and-skipped instead of crashing
the pod. Two gaps remained:

1. defaultFallbackWss is read into NetworkConfig but never used. The
   provider manager creates one provider from defaultPrimaryWss and
   loops on the same URL forever. Chains with a permanently-bad primary
   (wrong host, dead service, no eth_subscribe support) never reach the
   configured fallback, even though chain-config repo populates one.

2. probeSubscriptionSupport awaits provider.send(eth_subscribe) with no
   externally-controllable timeout. An upstream that completes the WS
   handshake but never answers the JSON-RPC frame would block
   createProvider for the life of the socket.

This commit:

- Plumbs fallbackWssUrl through workflow-mapper -> WorkflowRegistration
  -> EventListener -> SubscribeOptions -> ChainEntry. workflow-mapper
  validates the same scheme rules as the primary; an invalid fallback
  is logged and dropped (the listener still runs on primary alone).
  configHash includes the fallback so a fallback swap restarts the
  listener.
- Refactors createProvider/reconnect to share a single openProvider
  helper that walks [primary, fallback?] in order, returning the first
  url whose factory + ready + probe all succeed. Failed providers are
  destroy()'d before moving on so sockets don't leak across the
  failover. Reconnects always start from primary so a recovered
  primary is preferred.
- Wraps the eth_subscribe probe in Promise.race with a 10s timeout
  (matches the existing heartbeat timeout).
- Adds activeWssUrl to ChainEntry and exposes both wssUrl (active) and
  fallbackWssUrl (configured) on ChainHealth so /healthz operators can
  see when failover is active.
- Tightens ensureEntry's identity check to require both URLs to match
  for a reused chain entry.
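The primary-then-fallback walk and the probe timeout can be sketched as follows. This is an illustration under stated assumptions, not the provider manager's code: `ProviderLike`, `ProviderFactory`, and the function signatures are hypothetical, and the timeout wrapper is written by hand (equivalent in effect to racing the probe against a timer with Promise.race) so that the timer is always cleared.

```typescript
// Hypothetical minimal provider surface for the sketch.
interface ProviderLike {
  ready(): Promise<void>;
  probe(): Promise<unknown>; // stands in for provider.send("eth_subscribe", ...)
  destroy(): void;
}
type ProviderFactory = (url: string) => ProviderLike;

// Rejects if `work` has not settled within `ms`; clears the timer either way.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Walks the URLs in order and returns the first provider whose
// factory + ready + probe all succeed. Failed providers are destroyed
// before moving on so their sockets do not leak.
async function openProvider(
  urls: readonly string[],
  factory: ProviderFactory,
  probeTimeoutMs: number = 10000,
): Promise<{ provider: ProviderLike; activeUrl: string }> {
  let lastError: unknown = new Error("no WSS URLs configured");
  for (const url of urls) {
    let provider: ProviderLike | undefined;
    try {
      provider = factory(url);
      await provider.ready();
      await withTimeout(provider.probe(), probeTimeoutMs, `probe ${url}`);
      return { provider, activeUrl: url };
    } catch (err) {
      lastError = err;
      if (provider) provider.destroy();
    }
  }
  throw lastError;
}
```

Calling `openProvider([primary, fallback], factory)` on every reconnect, rather than remembering the last working URL, is what makes a recovered primary win again.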

Tests: 128/128 pass (5 new fallback tests + 4 new fallback-validation
tests, plus updated health assertion). Typecheck and biome clean.

Adds two reconnect-cycle tests that lock in behaviour the production
code already implements but had no test for: that reconnect uses the
same primary-then-fallback walk as createProvider, and that running
on the fallback is not sticky once the primary recovers.

Also tightens the ChainHealth.wssUrl JSDoc to reflect that
activeWssUrl resets to null mid-reconnect, so the health surface
shows the configured primary during a reconnect window rather than
the previously-active fallback.

Live verification against chain-config/staging.json found that a
misconfigured WSS primary (DNS NXDOMAIN, ECONNREFUSED, non-WS server)
crashed the event-tracker pod via process.uncaughtException before
openProvider's try/catch could fire. The fallback URL was never tried
because the loop body's catch block was bypassed. The pod would
crashloop indefinitely on the bad URL even with a healthy fallback
configured.

Root cause: the underlying ws library emits an error event on the
WebSocket as soon as the connection attempt fails. Between
new ethers.WebSocketProvider(url) returning and ethers' _start() running
to assign onerror, there is a window with no listener attached to the
ws socket. Node's EventEmitter throws when an error event is emitted
with no listener attached; that throw surfaces as an uncaughtException
on process, which index.ts treats as fatal.

Fix: switch defaultFactory to the WebSocketCreator overload of
ethers.WebSocketProvider so we own ws.WebSocket construction. Attach a
no-op error listener synchronously inside the creator, before returning
to ethers. The listener satisfies EventEmitter's "must have a listener"
rule. Failures still reject provider.ready via ethers' onerror once
that gets assigned, and that rejection lands in openProvider's existing
catch which walks to the fallback.
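The EventEmitter rule the fix relies on can be reproduced in isolation. The sketch below uses a plain Node EventEmitter as a stand-in for the ws socket and does not involve ethers or ws at all; `emitConnectionError` is a hypothetical helper. In the real failure the throw happens inside an I/O callback where nothing catches it, which is why it reaches the process-level handler; here it is caught only so both outcomes can be compared.

```typescript
import { EventEmitter } from "events";

// Emits an "error" event the way a failed connection attempt would and
// reports whether the emit itself threw.
function emitConnectionError(socket: EventEmitter): "survived" | "threw" {
  try {
    socket.emit("error", new Error("connect ECONNREFUSED 127.0.0.1:1"));
    return "survived";
  } catch (_err) {
    return "threw";
  }
}

// No listener attached: emit("error") throws the error.
const unguarded = new EventEmitter();

// A no-op listener attached synchronously at construction time satisfies
// the "must have a listener" rule, so the same emit is harmless.
const guarded = new EventEmitter();
guarded.on("error", () => {});

const unguardedOutcome = emitConnectionError(unguarded);
const guardedOutcome = emitConnectionError(guarded);
```

The no-op listener only prevents the throw; it does not swallow the failure, because other listeners attached later still receive the same event.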

Test: tests/integration/provider-manager-bad-url.test.ts uses the real
defaultFactory against ws://127.0.0.1:1 (ECONNREFUSED is deterministic
on Linux, no DNS or remote network) and asserts that
getOrCreateProvider rejects through the awaited path AND that no
process.uncaughtException leaks during the test. Reverting the fix
makes both cases fail with the captured ECONNREFUSED errors.

- create-pool: flatten the (bool,bool) PoolConfig tuple into two top-level
  bool inputs. The DSL only special-cases tuple[] in buildInputField, so a
  bare tuple fell through to a freeform JSON text box. reshapeArgsForAbi
  rebuilds the tuple from the flat args before encoding.

- net-flow: rename the existing CFA-only action to get-cfa-net-flow and add
  a new get-net-flow backed by gdaForwarder.getNetFlow, which combines CFA
  streams and GDA pool flows. The e2e script already proved the combined
  reading is what users want for mixed CFA/GDA workflows.

- flow-rate decimals: drop decimals: 18 from int96 flowRate outputs. Flow
  rates are wei/sec rates, not token amounts. Kept on uint256
  deposit/owedDeposit/balance fields. output.decimals has no consumer in
  the codebase today (outputToAbiParameter strips it; codegen only
  annotates input decimals), but the wrong annotation is misleading.
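The create-pool flattening in the first bullet can be illustrated with a small reshape step. This is a sketch under assumptions, not reshapeArgsForAbi itself: the flat field names (`transferabilityForUnitsOwner`, `distributionFromAnyAddress`) and the helper `reshapeCreatePoolArgs` are hypothetical and may not match what the DSL actually exposes.

```typescript
// Flat inputs as a form-driven UI would collect them (names are assumptions).
interface CreatePoolFlatArgs {
  token: string;
  admin: string;
  transferabilityForUnitsOwner: boolean;
  distributionFromAnyAddress: boolean;
}

// Positional arguments in ABI order: (address, address, (bool,bool)).
type CreatePoolAbiArgs = [string, string, [boolean, boolean]];

// Rebuilds the (bool,bool) tuple from the two top-level booleans so the
// encoder receives the nested shape the ABI declares.
function reshapeCreatePoolArgs(flat: CreatePoolFlatArgs): CreatePoolAbiArgs {
  return [
    flat.token,
    flat.admin,
    [flat.transferabilityForUnitsOwner, flat.distributionFromAnyAddress],
  ];
}

const reshaped = reshapeCreatePoolArgs({
  token: "0x0000000000000000000000000000000000000001", // placeholder address
  admin: "0x0000000000000000000000000000000000000002", // placeholder address
  transferabilityForUnitsOwner: false,
  distributionFromAnyAddress: true,
});
```

Keeping the reshape as the last step before encoding means the user-facing inputs stay flat while the calldata shape stays exactly what the contract expects.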
…lity unit test

Two follow-on tests for the bad-URL crash fix in the previous commit:

1. Integration: getOrCreateProvider against wss://does-not-exist-keep434.invalid/
   exercises the dns.lookup ENOTFOUND error path. That was the original
   failure mode encountered during manual verification - the error came
   from node:dns rather than ws.ClientRequest, but the same
   EventEmitter-throw rule crashed the pod. Confirms the fix covers
   both error origins.

2. Unit: locks in the ethers invariant the fix relies on.
   ethers.WebSocketProvider(creator) does not call removeAllListeners
   on the underlying socket and does not strip EventEmitter-style
   listeners. If a future ethers upgrade changes that, this test breaks
   loudly so we do not silently lose the defensive listener and
   reintroduce the crash-on-bad-URL bug.

Verified against a revert of the fix: the integration case fails with a
leaked uncaughtException, while the unit case continues to pass because
it tests ethers' own behavior, not defaultFactory's choice.

Mirrors tests/integration/protocol-wrapped-onchain.test.ts. Gated on
INTEGRATION_TEST_RPC_URL so it skips in CI without a live RPC.

Covers four assertions, focused on what the unit tests can't see:

- get-flow / get-cfa-net-flow: eth_call simulates against the real CFA
  forwarder on Sepolia and decodes the result, proving selector dispatch
  and ABI shape are correct.
- get-net-flow: same pattern against the GDA forwarder. This is the new
  action introduced earlier in this branch; the test confirms the
  contract -> function wiring on-chain.
- create-pool: builds calldata from the flat (token, admin, bool, bool)
  inputs through reshape + coerce, then estimateGas against the GDA
  forwarder. Asserts the failure mode (if any) is a business revert,
  not an encoding error -- proving the flattened tuple shape produces
  calldata the contract accepts.

Uses fUSDCx (0xb598...443B) as the SuperToken; the forwarders validate
the token argument against the host registry and revert for unknown
addresses, so a real Sepolia SuperToken is required for the read calls
to decode.

Every declared action now has at least one dispatch test that runs
against live Sepolia when INTEGRATION_TEST_RPC_URL is set:

Reads (5) -- eth_call + decode:
  get-flow, get-cfa-net-flow, get-net-flow,
  get-super-token-balance, get-underlying-token

Writes (11) -- estimateGas + assert the failure mode (if any) is a
business revert, not an ABI/encoding error:
  create-flow, update-flow, delete-flow,
  create-pool, update-member-units, distribute, distribute-flow,
  connect-pool, wrap, unwrap, grant-flow-operator

Stronger assertions:
- Read tests verify the action lands at the correct forwarder
  (CFA_FORWARDER_ADDRESS / GDA_FORWARDER_ADDRESS).
- get-underlying-token decodes the result and asserts it equals the
  expected fUSDC address, proving we read the right slot.
- Final test cross-checks the slug list against the protocol
  definition so adding a new action without a dispatch test fails CI.

buildCalldata gains an optional contractAddressOverride argument to
support userSpecifiedAddress contracts (the SuperToken family).
The encoding-error checker now runs against a single regex
(ENCODING_ERROR_RE) inside each test rather than inline assertions
in a helper, satisfying the noMisplacedAssertion lint rule.

protocols/superfluid.ts already references "/protocols/superfluid.png"
but the file was never added. Source: official Superfluid Finance
GitHub org avatar (github.com/superfluid-finance.png?size=256). Same
256x256 RGBA PNG format as the other protocol icons in this directory.
…ment (KEEP-432)

Priced workflow listings whose chain identifies a data chain (Ethereum,
Arbitrum, Polygon, BNB, Avalanche, 0G, Plasma) used to 403 with
CHAIN_MISMATCH on every paid invocation because the binding required
wf.chain to normalise to "base" or "tempo". The chain field on a listing
is overloaded — for Base-data workflows it doubles as the payment-chain
pin, but for Ethereum-data workflows it identifies where the contracts
live, not which chain payment must arrive on.

Replace normaliseChainTag (BindingChain | null) with classifyChainTag
returning {kind: "payment" | "data" | "unrecognised"}. Data-chain
listings whitelist Ethereum, Arbitrum, Avalanche, BNB, Polygon, 0G, and
Plasma (mainnet ids only, mirroring lib/rpc/rpc-config.ts) and accept
either Base x402 or Tempo MPP payment. Payment-chain pinning, the
fix-pack-3 N-1 cross-chain-proof defence, and the unrecognised-tag
defensive reject are all preserved.

Server-derived payTo and amount equality on the Base path still fire on
data-chain listings — verified by new tests. Tempo daily-spend deduction
still binds to the correct workflow price. KNOWN_DATA_CHAIN_IDS is
exported so the test suite iterates the production set directly,
eliminating the manual-sync drift the previous in-test array required.
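A three-way classifier of the kind described might look like the sketch below. It is an illustration only: the accepted payment tags (the literal strings "base" and "tempo"), the ID subset, and the names `ChainClass`, `DATA_CHAIN_IDS_SKETCH`, and `classifyChainTagSketch` are assumptions. The production set is KNOWN_DATA_CHAIN_IDS, and the 0G and Plasma IDs are deliberately left out here rather than guessed.

```typescript
type ChainClass =
  | { kind: "payment"; chain: "base" | "tempo" }
  | { kind: "data"; chainId: number }
  | { kind: "unrecognised" };

// Illustrative subset of mainnet IDs: Ethereum, BNB, Polygon, Arbitrum One,
// Avalanche C-Chain. Not the production list.
const DATA_CHAIN_IDS_SKETCH: ReadonlySet<number> = new Set([
  1, 56, 137, 42161, 43114,
]);

// Canonical decimal only: no whitespace, leading zeros, hex, or floats.
const CANONICAL_DECIMAL = /^[1-9][0-9]*$/;

function classifyChainTagSketch(tag: string): ChainClass {
  if (tag === "base" || tag === "tempo") {
    return { kind: "payment", chain: tag };
  }
  if (CANONICAL_DECIMAL.test(tag)) {
    const chainId = Number(tag);
    if (DATA_CHAIN_IDS_SKETCH.has(chainId)) {
      return { kind: "data", chainId };
    }
  }
  // Anything else, including testnet IDs, is rejected by the caller.
  return { kind: "unrecognised" };
}
```

Returning a tagged union instead of `BindingChain | null` lets the caller tell "this is a data chain, accept either payment rail" apart from "this tag is unknown, reject".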

Test coverage adds 11 new cases:
- 4 data-chain happy paths (Base + Tempo for Ethereum + parameterised loop)
- 2 security-equality assertions (PAYTO_MISMATCH, AMOUNT_MISMATCH)
- 3 data-chain edge cases (non-integer amount, case-insensitive payTo,
  WORKFLOW_NOT_PAYABLE)
- 5 strictness assertions (whitespace, leading-zero, hex, float,
  testnet-id rejection)
- 3 symmetric tempo-pin negative tests

35/35 tests pass. Lint and type-check clean.

Replaces the textured GitHub-avatar version with the Superfluid square
logo from EthGlobal's CDN -- dark-navy background with the white
Superfluid wordmark. Stylistically closer to the other protocol icons
(clean mark on solid background) than the prior pattern-fill version.

Source: https://ethglobal.b-cdn.net/organizations/x59d1/square-logo/default.png
Dimensions: 400x400 RGB.

The four JSON files in tests/integration/fixtures/superfluid-workflows/
are the canonical workflow definitions used to verify Superfluid actions
end-to-end against a deployed PR environment (k8s pod runtime, real
signer wallet, real Sepolia RPC):

- get-net-flow.json -- read demo, returned "0" against fUSDCx
- create-pool.json -- write demo, mined tx, emitted PoolCreated event
- wrap.json -- multi-step (web3.approve-token + superfluid.wrap)
- grant-flow-operator-quirky.json -- known-quirky write reference

protocol-superfluid-workflow-fixtures.test.ts validates each fixture
against the live protocol registry: asserts every Superfluid action
slug exists in protocols/superfluid.ts, _protocolMeta agrees with
action.contract/function/type, network is in SUPERFLUID_CHAIN_IDS, and
all required inputs are present in config. Pure metadata, no RPC, runs
in CI unconditionally. If anyone removes/renames an action or drops a
chain, these fixtures fail loudly instead of silently rotting.

21 assertions across 4 fixtures; all pass against the current registry.

tests/scripts/run-fixture.ts -- Loads a workflow fixture JSON, POSTs to
a target deploy's /api/workflows/create, /execute, polls, and prints
the per-step trace plus final output. Auth is browser-cookie-based
because PR hosts sit behind Cloudflare Access (the kh CLI's API key
bypasses better-auth but not CF Access). Supports INTEGRATION_ID
override for replaying fixtures captured in another org.

tests/scripts/fund-test-wallet.ts -- Reads the team funder PK from
TechOps/.secrets/WEB3.txt (or FUNDER_PK_PATH override), sends SepETH
and optionally mints fUSDC via the permissive Sepolia mint() function
on the fake-USDC contract. Used to bootstrap the keeperhub-managed
signer wallet before running write fixtures.

Both scripts are self-contained, run via `pnpm tsx tests/scripts/<name>.ts`,
and follow the existing scripts/ convention (header doc, env-var
configuration, no implicit defaults for credentials). They make the
existing live-test procedure reproducible without ad-hoc curl pipes.
…sts/scripts/

The previous commit registered tests/scripts/e2e-superfluid-sepolia.ts
and tests/scripts/verify-superfluid-addresses.ts but didn't remove the
originals at scripts/, leaving the same blob tracked at two paths.
Removes the originals so the move is complete.

Also updates the doc comment in protocols/superfluid.ts that referenced
the old scripts/ path.
…ment

fix(agentic-wallet): allow data-chain workflow listings to accept payment (KEEP-432)
…e path

fund-test-wallet.ts previously read the funder key by parsing
"PK = <hex>" out of a file at FUNDER_PK_PATH (defaulting to
TechOps/.secrets/WEB3.txt). That coupled the script to a specific
filesystem layout and made it harder to use from CI or a different
mega-repo position.

Now reads FUNDER_PK directly from the env, accepts both 0x-prefixed
and bare hex, validates the format up front, and drops the fs/path
imports along with the regex parse. Same env-only convention the
other tests/scripts/ helpers use.

Source the value from a secrets manager, a gitignored .envrc, or
your shell environment -- never check it in.
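Up-front validation of an env-supplied key could be sketched like this. The helper name `normalizePrivateKey`, the trimming of surrounding whitespace, and the lowercase normalisation are assumptions for illustration; only the acceptance of both 0x-prefixed and bare hex comes from the description above.

```typescript
// Accepts a 32-byte private key as 64 hex characters, with or without a
// 0x prefix, and returns it 0x-prefixed. Error messages never echo the
// input so the key cannot leak into logs.
function normalizePrivateKey(raw: string | undefined): string {
  if (raw === undefined || raw.trim() === "") {
    throw new Error("FUNDER_PK is not set");
  }
  const trimmed = raw.trim();
  const hex =
    trimmed.startsWith("0x") || trimmed.startsWith("0X")
      ? trimmed.slice(2)
      : trimmed;
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) {
    throw new Error(
      "FUNDER_PK must be 64 hex characters, with or without a 0x prefix",
    );
  }
  return `0x${hex.toLowerCase()}`;
}

// Intended call site (not executed here): normalizePrivateKey(process.env.FUNDER_PK)
```

Failing before any RPC call is made keeps a malformed key from surfacing later as an opaque signing error.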
… scripts

The chain seed file (scripts/seed/seed-chains.ts) already imports its
RPC URLs from lib/rpc/rpc-config.ts, but three test scripts duplicated
URLs inline:

- tests/scripts/verify-superfluid-addresses.ts had its own per-chain
  CHAIN_RPC map with hand-picked URLs (and chose different primaries
  than the lib for ETH/Polygon).
- tests/scripts/fund-test-wallet.ts hardcoded the Sepolia URL.
- tests/scripts/e2e-superfluid-sepolia.ts hardcoded the Sepolia URL.

Fix: every test script now imports PUBLIC_RPCS from lib/rpc/rpc-config.ts
and references the same constants the seed file does. Updating an RPC
URL is a one-line change in the lib that benefits every caller.

lib/rpc/rpc-config.ts gains an OPTIMISM_MAINNET entry in PUBLIC_RPCS
(the verify script needs it; Superfluid runs on Optimism). No
CHAIN_CONFIG entry yet because no keeperhub-supported feature uses
Optimism -- add one when the chain becomes a registered choice in
the workflow builder.

Removes tests/scripts/ from this PR -- the four scripts (run-fixture,
fund-test-wallet, e2e-superfluid-sepolia, verify-superfluid-addresses)
land in TechOps/scripts/ instead. They drove the manual PR-deploy
verification but are tooling, not protocol or test code, so keeping
them in the keeperhub PR was bloat.

Side effects of the move:

- lib/rpc/rpc-config.ts loses its OPTIMISM_MAINNET entry. It was added
  for the verify script which is leaving; no keeperhub-supported feature
  uses Optimism yet, so the lib stays focused on registered chains.

- protocols/superfluid.ts docstring updated to point at the new TechOps
  location for the bytecode-check script (and to flag the manual sync
  requirement -- the script keeps inline copies of the forwarder
  addresses and chain-id list now, so adding a chain means updating
  both files).

- Fixes the typecheck failure on the previous commit: fund-test-wallet
  used a `0n` BigInt literal which is unavailable at the project's
  ES2017 target. The script now lives in TechOps where it's not bound
  by keeperhub's tsconfig, but the pattern is fixed there too
  (BigInt(0)).

Coverage remaining in keeperhub:

- tests/integration/protocol-superfluid-onchain.test.ts (17 tests)
- tests/integration/protocol-superfluid-workflow-fixtures.test.ts
- tests/integration/fixtures/superfluid-workflows/ (4 JSON fixtures)
- tests/unit/superfluid-protocol.test.ts (39 unit tests)
…ollowup

feat: KEEP-434 use defaultFallbackWss when primary fails, add probe timeout
…ocol

feat: KEEP-415 add Superfluid protocol (takeover of #1106)
…EEP-442)

HTTP Request `endpoint` (and `httpHeaders`/`httpBody`) string fields
already pass through the workflow template substitution layer like
every other config string -- but `{{@prep:Prep.url}}` was resolving to
the empty string when `prep` was a `code/run-code` returning
`{ url: "..." }`. The bug shows up at HTTP request validation as
`URL is required`.

Root cause: `resolveFromOutputData` only unwraps the HTTP-style
`{ data: ... }` wrapper. `runCodeStep` returns `{ success, result, logs }`,
so when a downstream template references `Prep.url` (no explicit
`result.` prefix) the resolver finds neither `data.url` nor `data.data.url`
and falls back to the "missing path" branch, which substitutes "".

Fix: extend `resolveFromOutputData` with a `.result` fallback that
mirrors the existing `.data` fallback. Backward-compatible (`Prep.result.url`
still resolves directly via the top-level path) and also fixes any other
string field that references a code/run-code output's inner field
(protocol-action arg fields, downstream `code` strings, etc.).

Unblocks the dynamic Bridge Route Optimizer / MEV-Aware Swap Quote
workflows on the catalog roadmap.

- Add `hasNestedResultShape` helper alongside `hasNestedDataShape`
- Walk `.data` then `.result` so HTTP responses still take precedence
  when both wrappers are present
- Export `processTemplates` and `resolveFromOutputData` so the new
  unit test can drive the executor's substitution layer directly
- Add `tests/unit/http-request-template-substitution.test.ts` covering
  the bug repro plus header/body templating and ordering precedence
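The lookup order described above (top-level path first, then the `.data` wrapper, then the `.result` wrapper) can be sketched as follows. This is a simplified illustration, not the executor's `resolveFromOutputData`: `isObject`, `walk`, and `resolveFieldSketch` are hypothetical names, and the sketch returns `undefined` for a missing path where the real resolver substitutes the empty string.

```typescript
function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Follows `path` key by key; returns undefined as soon as a key is missing
// or the current value is not an object.
function walk(root: unknown, path: readonly string[]): unknown {
  let current: unknown = root;
  for (const key of path) {
    if (!isObject(current) || !(key in current)) return undefined;
    current = current[key];
  }
  return current;
}

// Tries the path at the top level, then inside `.data`, then inside
// `.result`. A null or primitive wrapper is skipped, not unwrapped.
function resolveFieldSketch(output: unknown, path: readonly string[]): unknown {
  const direct = walk(output, path);
  if (direct !== undefined) return direct;
  if (isObject(output)) {
    for (const wrapper of ["data", "result"]) {
      const inner = output[wrapper];
      if (isObject(inner)) {
        const nested = walk(inner, path);
        if (nested !== undefined) return nested;
      }
    }
  }
  return undefined;
}

// Shape returned by a code/run-code step, per the root-cause note above.
const runCodeOutput = {
  success: true,
  result: { url: "https://example.invalid/quote" },
  logs: [],
};
const resolvedUrl = resolveFieldSketch(runCodeOutput, ["url"]);
```

Because the top-level path is tried first, an explicit `Prep.result.url` reference keeps resolving exactly as before, and the wrapper fallbacks only apply when the short form is used.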

Verified end-to-end on local dev: trigger -> prep (code/run-code
returning {url}) -> across (HTTP GET with endpoint={{@prep:Prep.url}})
returned a real Across API response.

eskp and others added 2 commits May 6, 2026 17:04

Address review feedback on PR #1147:

- Tighten `hasNestedDataShape` to also reject `data === null`, matching
  `hasNestedResultShape` -- removes the asymmetry the reviewer flagged
  and keeps the type guard honest (the runtime was already null-safe via
  resolveConfigFieldPath, but the predicate now matches behavior).
- Pin the intentional `.data` -> `.result` fall-through with an explicit
  test, so future readers know the behavior on outputs that carry both
  wrappers is by design.
- Document the existing limitation: a primitive `.result` (e.g. a
  code/run-code that returned a bare string) is not unwrapped because
  the fallback can only walk into objects. Whole-output references
  still resolve via top-level.
- Cover null-wrapper guards for both `.data` and `.result`.

Tests: 11 -> 15 cases, all pass. No code path changes beyond the
hasNestedDataShape predicate; existing 3696 unit tests still pass.
…-template-substitution

fix(executor): resolve {{@}} templates referencing run-code output (KEEP-442)

eskp merged commit a7385e5 into prod May 6, 2026
32 checks passed