feat(kiloclaw): support Northflank tier resize #3115

Merged
pandemicsyn merged 5 commits into main from florian/feat/northflank-resize
May 7, 2026

@pandemicsyn commented May 7, 2026

Summary

  • Adds Northflank support for existing-instance tier upgrades by growing the Northflank volume when needed and patching the deployment service compute plan before persisting the accepted desired tier state.
  • Extends the provider adapter with an optional resizeRuntime hook so provider-specific resize semantics stay inside the provider layer while Fly and docker-local keep their existing behavior.
  • Updates the admin resize flow to skip Fly-style stop/start orchestration for Northflank and show rollout-specific copy.
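The flow described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Tier` and `ProviderAdapter` shapes, the `steps` log, and the adapter objects are hypothetical stand-ins, and only the `resizeRuntime` hook name comes from the summary. It assumes the volume is grown only when the target tier needs more storage.

```typescript
// Hypothetical shapes for illustration only; the real definitions may differ.
interface Tier {
  instanceType: string; // stand-in for a compute plan identifier
  volumeSizeGb: number;
}

interface ProviderAdapter {
  name: string;
  // Optional hook: providers that omit it keep the caller's existing resize path.
  resizeRuntime?: (current: Tier, target: Tier) => Promise<void>;
}

// Records each step so the ordering is observable in this sketch.
const steps: string[] = [];

const northflankLike: ProviderAdapter = {
  name: "northflank",
  async resizeRuntime(current: Tier, target: Tier): Promise<void> {
    // Assumption: grow the volume only when the target tier needs more storage.
    if (target.volumeSizeGb > current.volumeSizeGb) {
      steps.push(`grow-volume:${target.volumeSizeGb}`);
    }
    // Then patch the deployment's compute plan.
    steps.push(`patch-compute-plan:${target.instanceType}`);
  },
};

// An adapter without the hook, standing in for Fly or docker-local.
const flyLike: ProviderAdapter = { name: "fly" };

async function resizeInstance(
  adapter: ProviderAdapter,
  current: Tier,
  target: Tier,
): Promise<Tier> {
  if (adapter.resizeRuntime) {
    await adapter.resizeRuntime(current, target);
  } else {
    steps.push(`${adapter.name}:existing-resize-path`);
  }
  // Desired tier state is persisted only after the provider calls resolve.
  steps.push("persist-desired-tier");
  return target;
}
```

Keeping the hook optional means callers do not need provider-specific branches: an adapter that does not define `resizeRuntime` simply falls through to whatever resize path it already had.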

Verification

Deployed a regular instance, then upgraded it to 4x8.

Visual Changes

Before: Admin resize dialog used Fly-style stop/start copy for every provider. Screenshot pending.
After: Admin resize dialog shows Northflank rollout-specific copy and storage growth warning for Northflank instances. Screenshot pending.

Reviewer Notes

  • Main risk area is Northflank API semantics: volume update returns an empty success response with no separate completion status, so the Worker treats HTTP 200 as accepted and persists desired tier state after Northflank also accepts the compute-plan patch.
  • Northflank resize intentionally does not require the instance to be stopped; Fly and docker-local resize behavior is unchanged.
  • Billing remains tier-unaware. This only updates runtime tier state and the denormalized instance_type read cache.
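To make the acceptance rule in the first note concrete, here is a small sketch of a pure decision helper. The `ProviderResponse` and `ResizeDecision` types and both function names are hypothetical; the only behavior taken from the notes is that HTTP 200 counts as accepted even when the body is empty, and that tier state is persisted only after both calls are accepted.

```typescript
// Hypothetical response shape; a real client would wrap its HTTP layer.
interface ProviderResponse {
  status: number;
  body: string; // may legitimately be empty on success
}

type ResizeDecision =
  | { persist: true }
  | { persist: false; reason: string };

// Per the reviewer note: HTTP 200 means "accepted", even with an empty body.
function isAccepted(res: ProviderResponse): boolean {
  return res.status === 200;
}

// volumeUpdate is null when no volume growth was needed for the target tier.
function decideResize(
  volumeUpdate: ProviderResponse | null,
  planPatch: ProviderResponse,
): ResizeDecision {
  if (volumeUpdate !== null && !isAccepted(volumeUpdate)) {
    return {
      persist: false,
      reason: `volume update rejected with status ${volumeUpdate.status}`,
    };
  }
  if (!isAccepted(planPatch)) {
    return {
      persist: false,
      reason: `compute-plan patch rejected with status ${planPatch.status}`,
    };
  }
  // Both calls were accepted, so the desired tier state can be persisted.
  return { persist: true };
}
```

Note that "accepted" here says nothing about whether the rollout eventually succeeds; it only reflects the provider's immediate response.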

kilo-code-bot Bot commented May 7, 2026

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details

WARNING

File: services/kiloclaw/src/providers/northflank/index.ts
Line: 875
Issue: Resize is persisted before the Northflank rollout is confirmed
Files Reviewed (13 files)
  • .specs/kiloclaw-controller.md - 0 issues
  • apps/web/src/app/admin/components/KiloclawInstances/KiloclawInstanceDetail.tsx - 0 issues
  • services/kiloclaw/src/durable-objects/kiloclaw-instance.test.ts - 0 issues
  • services/kiloclaw/src/durable-objects/kiloclaw-instance/index.ts - 0 issues
  • services/kiloclaw/src/northflank/client.test.ts - 0 issues
  • services/kiloclaw/src/northflank/client.ts - 0 issues
  • services/kiloclaw/src/northflank/config.test.ts - 0 issues
  • services/kiloclaw/src/northflank/config.ts - 0 issues
  • services/kiloclaw/src/providers/northflank/index.test.ts - 0 issues
  • services/kiloclaw/src/providers/northflank/index.ts - 1 issue
  • services/kiloclaw/src/providers/types.ts - 0 issues
  • services/kiloclaw/wrangler.jsonc - 0 issues


Reviewed by gpt-5.5-20260423 · 1,220,687 tokens

targetTier,
deploymentPlan,
});
await patchDeploymentService(config, providerState.projectId, providerState.serviceId, {

WARNING: Resize is persisted before the Northflank rollout is confirmed

resizeMachine() persists the new tier immediately after this patch resolves, but patchDeploymentService only confirms the update was accepted. If the deployment later fails, the DO still records the larger instanceType, machineSize, and volumeSizeGb, and the admin UI reports completion even though runtime hardware may not have changed. This path should wait for the deployment to complete (as startRuntime/restartRuntime do) before returning success or persisting tier state.
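For illustration, the wait-then-persist alternative suggested in this comment could look something like the sketch below. Everything here is hypothetical: the status values are not taken from any provider's API, `getStatus`, `sleep`, and `persist` are injected stand-ins, and the attempt count and delay are arbitrary.

```typescript
// Hypothetical status values, not taken from any provider's API.
type RolloutStatus = "in-progress" | "completed" | "failed";
type RolloutOutcome = "completed" | "failed" | "timed-out";

// Polls until the rollout leaves "in-progress" or the attempt budget runs out.
async function waitForRollout(
  getStatus: () => Promise<RolloutStatus>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts: number,
  delayMs: number,
): Promise<RolloutOutcome> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (status !== "in-progress") return status;
    // Do not sleep after the final attempt.
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  return "timed-out";
}

// Persists tier state only when the rollout is confirmed as completed.
async function resizeAndConfirm(
  getStatus: () => Promise<RolloutStatus>,
  sleep: (ms: number) => Promise<void>,
  persist: () => void,
): Promise<RolloutOutcome> {
  const outcome = await waitForRollout(getStatus, sleep, 5, 1000);
  if (outcome === "completed") persist();
  return outcome;
}
```

The trade-off is that the resize request stays open for the duration of the rollout, and a timeout leaves the caller unsure whether the provider will still finish the change.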

Reply from the PR author:

I don’t think we should wait here. The persisted tier is the desired state after provider acceptance. It’s also consistent with the Fly resize path, which persists instanceType/machineSize before the restarted Fly machine proves it is running on the new CPU/RAM.

If Northflank accepts the patch and later fails rollout, that is runtime/provider health drift for inspect/reconcile to surface, not a reason to keep the desired tier old.
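The desired-state model in this reply implies that a later inspect or reconcile pass compares what was persisted against what the provider reports. A minimal sketch of such a comparison follows; the `TierState` and `DriftEntry` shapes and the `detectTierDrift` name are hypothetical and not taken from the codebase.

```typescript
// Hypothetical shape for the tier fields being compared.
interface TierState {
  instanceType: string;
  volumeSizeGb: number;
}

interface DriftEntry {
  field: keyof TierState;
  desired: string | number;
  observed: string | number;
}

// Returns one entry per field where the provider's observed state
// differs from the persisted desired state; an empty array means no drift.
function detectTierDrift(desired: TierState, observed: TierState): DriftEntry[] {
  const fields: (keyof TierState)[] = ["instanceType", "volumeSizeGb"];
  return fields
    .filter((field) => desired[field] !== observed[field])
    .map((field) => ({
      field,
      desired: desired[field],
      observed: observed[field],
    }));
}
```

Under this model a failed rollout shows up as a non-empty drift list rather than as a rejected resize, which matches the argument that the desired tier should not be rolled back just because the runtime has not caught up.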

@jeanduplessis left a comment

Looks good overall. I agree with the existing Northflank resize thread as a non-blocking improvement suggestion: waiting for rollout completion before marking the resize done would reduce provider/state drift when Northflank accepts a patch but the rollout later fails. I did not add a duplicate line comment because that thread already covers the point.

@pandemicsyn merged commit a7eb855 into main on May 7, 2026
13 checks passed
@pandemicsyn deleted the florian/feat/northflank-resize branch on May 7, 2026 18:53