fix(kiloclaw) inbound email hook #3123

Open
St0rmz1 wants to merge 3 commits into main from fix/kiloclaw-inbound-email-hook

Conversation

Contributor

@St0rmz1 St0rmz1 commented May 7, 2026

Summary

Every inbound email to a bot has been failing with HTTP 400 since the OpenClaw 2026.4.23 image finished rolling out. OpenClaw refuses any hook mapping that resolves sessionKey from a request body unless hooks.allowRequestSessionKey: true is set. The cloudflare-email-inbound mapping uses sessionKey: '{{payload.sessionKey}}' so the platform worker can compute a stable key like inbound-email:YYYY-MM-DD-<slug> and coalesce one thread into a single agent session. Without the flag, OpenClaw returned {"error":"Hook rejected","hookStatus":400} for every email, the controller faithfully proxied that, and the bot never woke.
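
To make the coalescing idea concrete, here is a minimal sketch of how a stable key in the `inbound-email:YYYY-MM-DD-<slug>` shape could be derived. The function names, the choice of the subject line as the slug source, and the normalization rules are all assumptions for illustration; the platform worker's actual derivation is not shown in this PR.

```typescript
// Hypothetical helper: turns a subject line into a URL-safe slug.
// Stripping "Re:"/"Fwd:" prefixes is an assumption, made so that replies
// in one thread map to the same slug.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/^((re|fwd?):\s*)+/, "") // drop leading reply/forward prefixes
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric
    .replace(/^-+|-+$/g, ""); // trim leading/trailing dashes
}

// Hypothetical helper: builds a key in the shape described above.
// Using the UTC calendar day of the thread start is an assumption.
function inboundEmailSessionKey(threadStart: Date, subject: string): string {
  const day = threadStart.toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
  return `inbound-email:${day}-${slugify(subject)}`;
}
```

Under these assumptions, a reply with subject "Re: Hello, World!" and the original "Hello, World!" both yield `inbound-email:2026-05-07-hello-world` for a thread that started on 2026-05-07 UTC, which is why the key has to travel in the request body and why OpenClaw must be allowed to accept it.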

This change sets the flag in generateBaseConfig and backfills it on existing configs via sanitizeExistingConfigBeforeDoctor, so running instances pick up the fix on the next controller restart with no manual reprovision.

While verifying the fix, two adjacent issues surfaced and are repaired in the same PR:

  1. The kiloclaw-inbound-email worker was missing logpush: true, so its trace events never reached the cloudflare-logpush Axiom dataset even though observability.enabled was set. The other kiloclaw workers already had the flag. Added it and a short README explaining the requirement.
  2. The platform worker captured the upstream rejection body but truncated it to 500 chars and stored it as a single opaque string. Bumped the truncation limit to 2000 chars and promoted the parsed JSON error value to a structured log key, controllerErrorMessage, so the next downstream regression is filterable in Axiom without needing to SSH into a running instance.
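
The logging change in item 2 can be sketched roughly as follows. This is an illustration, not the code in this PR: the function name, the `ControllerFailureLog` type, and the `controllerResponseBody` key are hypothetical; only the 2000-char limit and the `controllerErrorMessage` key come from the description above.

```typescript
const MAX_BODY_CHARS = 2000;

// Hypothetical shape of the structured log fields.
interface ControllerFailureLog {
  status: number;
  controllerResponseBody: string; // assumed key name for the raw body
  controllerErrorMessage?: string; // present only when the body is JSON with a string `error`
}

function describeControllerFailure(status: number, body: string): ControllerFailureLog {
  const fields: ControllerFailureLog = {
    status,
    controllerResponseBody: body.slice(0, MAX_BODY_CHARS),
  };
  try {
    // Parse the full body, not the truncated copy, so long bodies still parse.
    const parsed: unknown = JSON.parse(body);
    if (typeof parsed === "object" && parsed !== null && "error" in parsed) {
      const error = (parsed as { error: unknown }).error;
      if (typeof error === "string") fields.controllerErrorMessage = error;
    }
  } catch {
    // Non-JSON body: keep only the truncated raw string.
  }
  return fields;
}
```

With the rejection body quoted in the summary, `{"error":"Hook rejected","hookStatus":400}`, this sketch would set `controllerErrorMessage` to `Hook rejected`, giving Axiom a field to filter on instead of a substring search over an opaque string.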

Verification

  • Reproduced the regression: a POST to http://127.0.0.1:3001/hooks/email from inside a Fly instance returned HTTP 400 with the exact body "sessionKey is disabled for externally supplied hook payload values", confirming the root cause before any code change was applied.
  • Confirmed blast radius: the same endpoint without payload.sessionKey returned 200 with an automatically generated session id, and unmapped paths like /hooks/test returned 404. The flag only widens acceptance for the existing inbound email mapping.
  • Grep across the repo confirmed cloudflare-email-inbound is the only mapping that templates sessionKey from payload, so other webhook flows are not in scope.
  • All controller test files pass (210 cases). New tests assert the flag is set in the generated config, that the helper overrides an explicit false back to true, and that the bootstrap sanitize step backfills the flag on an existing config that lacked it.
  • pnpm run lint, pnpm run typecheck, and pnpm run format:check all clean.
  • After deploy, send a real email to a live alias, watch the platform worker log line inbound email controller response { status: 200 } in Axiom, and confirm the bot opens an inbound-email:YYYY-MM-DD-<slug> session.
  • After deploy, query Axiom for ScriptName == "kiloclaw-inbound-email" to confirm the worker logs are now reaching the dataset.
  • Drain kiloclaw-inbound-email-dlq once normal delivery is confirmed.

Visual Changes

N/A

Reviewer Notes

ensureInboundEmailHookFlags deliberately overrides any existing value of hooks.allowRequestSessionKey, including an explicit false. The reasoning lives in the function comment and a vitest case: the cloudflare-email-inbound mapping is force installed by generateBaseConfig on every run, so the flag it depends on must converge alongside the mapping. An operator who needs to disable inbound email entirely should clear the kiloclaw_instances.inbound_email_enabled column, not flip this flag.
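
For reviewers skimming without the diff open, the override behavior amounts to something like the sketch below. It is not the implementation in config-writer.ts: the `Sketch` suffix, the config types, and the non-mutating style are assumptions; only the `hooks.allowRequestSessionKey` key and the "always converge to true" rule come from this PR.

```typescript
// Assumed minimal config shape; the real config carries many more fields.
interface HooksConfig {
  allowRequestSessionKey?: boolean;
  [key: string]: unknown;
}

interface OpenClawConfig {
  hooks?: HooksConfig;
  [key: string]: unknown;
}

// Returns a copy of the config with the flag forced to true, regardless of
// whether it was missing, true, or an explicit false.
function ensureInboundEmailHookFlagsSketch(config: OpenClawConfig): OpenClawConfig {
  return {
    ...config,
    hooks: { ...config.hooks, allowRequestSessionKey: true },
  };
}
```

Because the result does not depend on the previous value of the flag, applying the helper to a freshly generated config and to an existing on-disk config gives the same outcome, which matches the convergence behavior described above.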

A separate regression with the same symptom timeline (Apr 22 success, May 7 failure shown on the trigger requests page) was traced during this investigation to the same commit, "feat(kilo-chat): rip out stream chat". That regression sits in services/webhook-agent-ingest: the queue consumer at queue-consumer.ts:175 still POSTs to /api/platform/send-chat-message, a route that commit deleted with no replacement. The kiloclaw worker now returns "Authentication required" for that unmatched path, and captured webhook requests end up in the failed state. The fix for it is intentionally scoped to a separate PR because it requires reimplementing chat delivery on top of the chat plugin and warrants coordination with the original author.

Contributor

kilo-code-bot Bot commented May 7, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (4 files)
  • services/kiloclaw-inbound-email/README.md
  • services/kiloclaw/controller/src/bootstrap.ts
  • services/kiloclaw/controller/src/config-writer.test.ts
  • services/kiloclaw/controller/src/config-writer.ts

Reviewed by gpt-5.5-2026-04-23 · 661,817 tokens

@St0rmz1 St0rmz1 changed the title from "Fix/kiloclaw inbound email hook" to "fix(kiloclaw) inbound email hook" May 7, 2026