feat(example): Week 2 agent flow - build/typecheck unblock + deploy-pending UX + graceful errors #25

Merged
mike-diamond merged 11 commits into main from feat/arbitrum-week2-agent-flow
Jun 4, 2026

@mike-diamond mike-diamond commented Jun 2, 2026

Contributor

What

Week 2 of the Arbitrum London Buildathon agent demo (scenario A - Pendle yield swap on Arbitrum Sepolia). Reviews the agent loop, clears the blockers that stopped the demo building/running end to end, trims API spend, adds the reproducible real-tx-hash path the AI Agentic category requires, and applies a hackathon code-review polish pass.

Changes

  • Unblock typecheck/build on TypeScript 6 - the 5->6 bump (chore(deps-dev): bump typescript from 5.9.3 to 6.0.3 #13) regressed this example (CI builds packages/** only). Added globals.d.ts ambient declarations (TS2882) and switched the tool defs to satisfies Anthropic.Tool (TS2769).
  • Deploy-pending banner - DeployPendingBanner reads the checkIs*Deployed predicates and shows a preview-mode banner on /flow-a while deployed.json is placeholder. Renders nothing once contracts are live.
  • Agent route - model claude-opus-4-5 -> claude-haiku-4-5 (simple structured extraction, ~1/5 cost, no quality loss); intent stated before the single-shot tool call; cost guard skips the model call entirely in preview mode (spends nothing pre-deploy).
  • Real on-chain tx hashes - SmokeExecuteEnvelope.s.sol reproduces the dApp's executeEnvelope (agent-sign + execute through the gate) as a reproducible forge broadcast, chain-agnostic so one command lands a real tx on Arbitrum Sepolia and on Robinhood (sponsor bonus - the only way to get a Robinhood tx while scenario C has no UI). DEPLOY.md gains the Robinhood router deploy + allow-list and a step-7 capture flow; a new demo README.md carries the judge-facing "Live on-chain" tables. Rehearsed end to end on a local anvil.
  • Graceful env handling - a blank ANTHROPIC_API_KEY= returns the intended 503 "not set" instead of a raw 500 zod dump.
  • Code-review pass (hackathon polish):
    • Decompose PendleAgentChat 348 -> 237 lines: extracted utils/formatters.ts, utils/fetchDecoded.ts, and SignEnvelopeActions.tsx.
    • Accessibility - focus-visible rings on the chat input, copy button, explorer link, and decoded-args summary; aria-live on the thinking status; role=alert on errors; aria-label on the input.
    • Tailwind - 104 verbose bg-[color:var(--color-*)] arbitrary values -> theme shorthands (bg-accent, text-muted, ...) already registered in @theme inline; no config change, identical output.
    • Correctness/rules - explorer link label now matches the chain (Arbiscan vs Robinhood); double ternary split; key={index} -> stable crypto.randomUUID; isDeployed -> checkIsDeployed; removed sprint-diary milestone strings + past-dated "Phase 2 Day 10" text; blank lines before return; removed the void unused-import hack.

Verification

  • forge test: 20/20 · forge build: clean (incl. the smoke script)
  • Local anvil rehearsal: deploy gate + router -> allow-list -> SmokeExecuteEnvelope -> ONCHAIN EXECUTION SUCCESSFUL, real executeEnvelope tx hash produced
  • tsc --noEmit: 0 errors · eslint app src: 0 errors · next build: clean, 9 pages
  • Confirmed the Tailwind shorthands emit real CSS (bg-accent{background-color:var(--tx-color-primary,...)}, bg-card/40 -> color-mix)

Deploy runbook for Mike (needs a funded testnet key you hold)

Full copy-paste in examples/arbitrum-london/DEPLOY.md. After deploy, run the smoke script (step 7) per chain and paste the addresses + tx hashes into README.md. Only the deployer address needs gas (~0.05 ETH per chain on Arbitrum Sepolia + Robinhood); the agent signer signs off-chain (no gas).

Left for Week 3

  • Scenario C (RWA on Robinhood) UI is still a placeholder; the policy-gate execution on Robinhood is proven via the smoke script.
  • Optional polish: a real two-turn tool loop so the agent narrates the prepared envelope.
  • Pre-existing: pnpm lint lints vendored contracts/lib/** (noise) - flagged separately.

Do not merge - review first.

🤖 Generated with Claude Code

TypeScript 6 (merged in #13) regressed the example app typecheck, which CI does not cover (CI builds packages only). Two errors blocked next build:

- TS2882: side-effect imports of @txkit/themes/base, @txkit/themes/dark and globals.css need ambient module declarations. Added globals.d.ts.
- TS2769: the as-const tool definitions typed required as a readonly tuple, not assignable to the Anthropic SDK Tool input_schema required string[]. Switched to satisfies Anthropic.Tool, which keeps literal types without readonly.
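The TS2769 fix above can be sketched as follows. This is a minimal illustration, not the code from this PR: `ToolLike` is a local stand-in for the SDK's `Anthropic.Tool` type (only the fields needed here), and the `prepare_swap` tool name and its properties are hypothetical.

```typescript
// Local stand-in for the SDK tool type (assumption: reduced to the fields used here).
type ToolLike = {
  name: string
  description: string
  input_schema: {
    type: 'object'
    properties: Record<string, unknown>
    required?: string[]
  }
}

// With `as const`, `required` would be inferred as a readonly tuple, which is
// not assignable to the mutable `string[]` the schema type expects.
// `satisfies` checks the literal against ToolLike without freezing it that way.
const prepareSwapTool = {
  name: 'prepare_swap', // hypothetical tool name
  description: 'Prepare a swap envelope from a parsed user intent.',
  input_schema: {
    type: 'object',
    properties: {
      tokenIn: { type: 'string' },
      amount: { type: 'string' },
    },
    required: ['tokenIn', 'amount'],
  },
} satisfies ToolLike

// The checked object can be passed wherever a mutable tool list is expected.
const tools: ToolLike[] = [prepareSwapTool]
```

The design point is that `satisfies` validates the shape at the definition site while leaving `required` as a plain array, so no cast is needed at the call site.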
Before deploy, deployed.json holds placeholder addresses and preparing an envelope returns a 503; the flow only surfaced that reactively after a visitor typed a prompt. DeployPendingBanner reads the checkIsAgentPolicyGateDeployed / checkIsMockPendleRouterDeployed predicates and shows a preview-mode banner up front, rendering nothing once both contracts hold real addresses.
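A rough sketch of the predicate logic behind the banner, under stated assumptions: the predicate names come from the description above, but their signatures, the `DeployedAddresses` shape, and the idea that a placeholder is the zero address are all guesses made for illustration. The real `deployed.json` may use a different placeholder.

```typescript
// Assumption: a placeholder entry is the zero address (or anything malformed).
const ZERO_ADDRESS = '0x0000000000000000000000000000000000000000'

// Hypothetical shape of the relevant part of deployed.json.
type DeployedAddresses = {
  agentPolicyGate: string
  mockPendleRouter: string
}

// True only for a well-formed, non-zero 20-byte hex address.
const checkIsRealAddress = (address: string): boolean =>
  /^0x[0-9a-fA-F]{40}$/.test(address) && address.toLowerCase() !== ZERO_ADDRESS

const checkIsAgentPolicyGateDeployed = (deployed: DeployedAddresses): boolean =>
  checkIsRealAddress(deployed.agentPolicyGate)

const checkIsMockPendleRouterDeployed = (deployed: DeployedAddresses): boolean =>
  checkIsRealAddress(deployed.mockPendleRouter)

// The banner shows until both contracts hold real addresses.
const checkShouldShowBanner = (deployed: DeployedAddresses): boolean =>
  !(checkIsAgentPolicyGateDeployed(deployed) && checkIsMockPendleRouterDeployed(deployed))
```

A component built on this would return `null` whenever `checkShouldShowBanner` is false, which matches the "renders nothing once contracts are live" behaviour described above.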
Bump the agent model from claude-opus-4-5 to claude-opus-4-8 (current default; same request surface, none of the removed params are in use).

The system prompt told the model to summarise after calling the tool, but the route is single-shot - it never feeds a tool result back, so that turn never happens. Reworded to state intent in one sentence before the tool call, which lands in the same response and shows up as the assistant reply.
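To illustrate why stating intent before the tool call works in a single-shot route, here is a small sketch of pulling both parts out of one response. The block types are a simplified local mirror of the text and tool-use content blocks in a Messages API response, and `splitSingleShotResponse` is a hypothetical helper name, not a function from this PR.

```typescript
// Simplified local mirror of response content blocks (assumption: only these two kinds matter here).
type TextBlock = { type: 'text'; text: string }
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown }
type ContentBlock = TextBlock | ToolUseBlock

// In a single-shot flow there is no follow-up turn, so any text the model
// emits alongside the tool call is the only assistant reply available.
function splitSingleShotResponse(content: ContentBlock[]): {
  reply: string
  toolCall: ToolUseBlock | null
} {
  const reply = content
    .filter((block): block is TextBlock => block.type === 'text')
    .map((block) => block.text)
    .join('\n')
    .trim()
  const toolCall = content.find((block): block is ToolUseBlock => block.type === 'tool_use') ?? null

  return { reply, toolCall }
}
```

If the prompt asked for a summary after the tool call instead, `reply` would usually be empty here, because that text would only arrive in a second turn that this route never makes.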
A blank ANTHROPIC_API_KEY= (from clearing a copied .env.example value, or a host that injects empty vars) passed through as an empty string and failed z.string().min(1), surfacing a raw 500 zod dump instead of the route's intended friendly "not set" 503. Coerce blank values to undefined before parsing so optional fields stay optional and defaulted fields fall back to their default.
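The coercion step could look roughly like the sketch below. It is dependency-free on purpose: `coerceBlankToUndefined` is a hypothetical helper name, and treating whitespace-only values as blank is an assumption. In the route, the result would be handed to the existing zod schema in place of the raw environment object.

```typescript
type RawEnv = Record<string, string | undefined>

// Map empty or whitespace-only values to undefined so a schema sees them as
// "absent" (optional stays optional, defaults apply) rather than as an
// empty string that fails a min-length check.
function coerceBlankToUndefined(env: RawEnv): RawEnv {
  const cleaned: RawEnv = {}
  for (const [key, value] of Object.entries(env)) {
    cleaned[key] = value === undefined || value.trim() === '' ? undefined : value
  }

  return cleaned
}

// Example: a cleared .env line such as `ANTHROPIC_API_KEY=` arrives as ''.
const cleanedEnv = coerceBlankToUndefined({
  ANTHROPIC_API_KEY: '',
  NODE_ENV: 'development',
})
```

With the key reported as undefined, the route can take its own "not set" branch and return the 503 instead of letting a schema error bubble up as a 500.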
The post-deploy sync step pointed at packages/tx-decoder/src/registry/data/agent-policy-gate.json, which no longer carries AgentPolicyGate after the alpha.4 cleanup. The address lives in examples/arbitrum-london/decoder-data/agent-policy-gate.json, matching DEPLOY.md, the deploy script comment, and the decode route.

vercel Bot commented Jun 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| txkit-docs | Ready | Preview, Comment | Jun 3, 2026 1:39pm |
| txkit-land | Ready | Preview, Comment | Jun 3, 2026 1:39pm |
| txkit-story | Ready | Preview, Comment | Jun 3, 2026 1:39pm |

Two changes in /api/agent, both reducing Claude cost without touching demo quality:

- Model claude-opus-4-8 -> claude-haiku-4-5. The route does one thing: turn a natural-language intent into a single validated tool call against known token addresses. That is simple structured extraction, not reasoning - Haiku 4.5 handles it at ~1/5 the price ($1/$5 vs $5/$25 per 1M tokens), roughly $0.0015 per interaction.
- Skip the model call entirely when the scenario A contracts are still placeholder addresses. Previously a visitor typing in preview mode would spend a Claude call that then 503s because the envelope cannot be built. Now the deploy check short-circuits to 503 before the API call, so preview mode costs nothing.
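The cost guard in the second bullet amounts to ordering the checks so the deploy predicate runs before any paid call. A minimal sketch, assuming a hypothetical `handleAgentRequest` with the predicate and the model call injected as functions (the real route's shape, error text, and response format are not shown in this PR description):

```typescript
type AgentResult = {
  status: number
  body: { error?: string; reply?: string }
}

// Pure pre-flight check: returns a 503 result in preview mode, or null to proceed.
function resolvePreflight(isDeployed: boolean): AgentResult | null {
  if (!isDeployed) {
    return { status: 503, body: { error: 'Contracts are not deployed yet (preview mode).' } }
  }

  return null
}

// The model call is injected so the ordering is easy to see and to test.
async function handleAgentRequest(
  prompt: string,
  checkIsDeployed: () => boolean,
  callModel: (prompt: string) => Promise<string>,
): Promise<AgentResult> {
  const blocked = resolvePreflight(checkIsDeployed())
  if (blocked !== null) {
    // Short-circuit before the paid call: preview mode spends nothing.
    return blocked
  }
  const reply = await callModel(prompt)

  return { status: 200, body: { reply } }
}
```

Injecting `callModel` is only a convenience for this sketch; the point is that the 503 is produced before the API client is ever touched.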

vercel Bot commented Jun 2, 2026

Deployment failed with the following error:

Resource is limited - try again in 24 hours (more than 100, code: "api-deployments-free-per-day").

Learn More: https://vercel.com/mikediamonds-projects?upgradeToPro=build-rate-limit

…ain tx hashes

The AI Agentic category requires real tx hashes, not preview-only. The UI sign button already lands a real executeEnvelope tx, but Robinhood Chain has no UI flow (scenario C is a placeholder), so a script is the only way to get a verifiable tx there.

SmokeExecuteEnvelope reproduces exactly what the dApp does - build the inner MockPendleRouter swap, agent-sign the EIP-712 envelope, execute through the gate - and is chain-agnostic, so the same command lands a real tx on Arbitrum Sepolia and on Robinhood (sponsor bonus). DEPLOY.md gains the router deploy + allow-list on Robinhood and a step-7 capture flow. The full deploy -> allow-list -> smoke tx path was rehearsed on a local anvil (executeEnvelope succeeded, real tx hash produced).
Judge-facing overview plus a Live on-chain section with placeholder tables for both chains' contract addresses and example executeEnvelope tx hashes, filled in after deploy. States what the demo proves (agent prepares, human verifies, gate enforces) and the honest scope (deterministic mock router; scenario C UI still roadmap).
…actions

Split the 348-line chat component (now 237). Extracted the pure formatters plus a new resolveExplorerLabel to utils/formatters.ts, the /api/decode call and its DecodedCall type to utils/fetchDecoded.ts, and the sign/reject/tx-link block to SignEnvelopeActions.tsx.

Also from review: split the double ternary in the error path into named consts, stable crypto.randomUUID keys instead of array index, aria-live on the thinking status, role=alert on errors, aria-label plus focus-visible on the input, and the explorer link label now matches the chain (Arbiscan vs Robinhood explorer) instead of a hardcoded Arbiscan. Color classes use the Tailwind theme shorthands.
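As an illustration of the explorer-label fix, a chain-aware label helper might look like the sketch below. The function name `resolveExplorerLabel` comes from the commit message above, but its signature here is an assumption. 421614 is the Arbitrum Sepolia chain id; the Robinhood chain id is passed in as a parameter because it is not stated in this PR, and the generic fallback label is also an assumption.

```typescript
const ARBITRUM_SEPOLIA_CHAIN_ID = 421614

// Pick the link label from the chain the tx landed on instead of hardcoding one explorer.
// `robinhoodChainId` is supplied by the caller (hypothetical parameter; the real id
// would come from the project's chain config).
function resolveExplorerLabel(chainId: number, robinhoodChainId: number): string {
  if (chainId === ARBITRUM_SEPOLIA_CHAIN_ID) {
    return 'Arbiscan'
  }
  if (chainId === robinhoodChainId) {
    return 'Robinhood explorer'
  }

  // Assumed fallback so an unknown chain never gets a misleading label.
  return 'Block explorer'
}
```

Keeping this as a pure function in `utils/formatters.ts` style means the component only renders the returned string and carries no chain-specific branching itself.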
… and summary

The copy button, the explorer link, and the decoded-arguments summary had hover styles but no keyboard focus indicator. Added a focus-visible ring matching the project standard so keyboard and switch-access users can see focus.
Replace 104 verbose bg-[color:var(--color-*)] arbitrary values with the Tailwind v4 theme shorthands (bg-accent, text-muted, border-border, ...) already registered in @theme inline - no config change, identical output.

Review cleanup folded in: rename isDeployed -> checkIsDeployed (predicate naming rule), drop the milestone sprint-diary strings and the past-dated Phase 2 Day 10 text from /api/agent responses and the RWA builder, neutralize past-deadline footer/comment text on flow-c, extract alignClass in ChatMessage, single blank line before return where missing, remove the void unused-import hack in envelope-builder, and trim a redundant decoder comment.