Skip to content

[codex] Prepare Osprey sandbox deployment#1

Merged
josephgoksu merged 6 commits into
mainfrom
codex/prepare-osprey-sandbox
May 25, 2026
Merged

[codex] Prepare Osprey sandbox deployment#1
josephgoksu merged 6 commits into
mainfrom
codex/prepare-osprey-sandbox

Conversation

@josephgoksu
Copy link
Copy Markdown
Member

@josephgoksu josephgoksu commented May 25, 2026

Summary

Prepare Osprey for a Coolify-hosted sandbox deployment and customer handoff.

This PR hardens the sandbox API path, Docker runtime, customer documentation, and assurance workflow so Osprey can be deployed as a Dockerfile app and later verified at sandbox.osprey.opensource.finance after DNS/Coolify are live.

What Changed

  • Hardened /evaluate, persistence, tenant isolation, duplicate transaction handling, JSON validation, rule/typology mutation, and admin-token protection.
  • Added sandbox customer docs, rule/typology authoring docs, Coolify checklist, OpenAPI contract, example payloads, and customer sandbox flow image.
  • Added Docker/Coolify readiness: non-root runtime, /app/data volume, healthcheck, reproducible build args, and env templates.
  • Added sandbox assurance tooling: assure-sandbox.sh, verify-sandbox.sh, build-arg helper, and a GitHub Actions gate.
  • Added contract drift checks for OpenAPI refs/examples, external request examples, server routes, admin security metadata, docs links, Coolify templates, Docker image metadata, and runtime API behavior.

Validation

Latest full local gate passed:

VERSION=docker-metadata-pr-ready ./scripts/assure-sandbox.sh

That covered JSON examples, OpenAPI contract checks, docs links, Coolify templates, shell syntax, go test ./..., go vet ./..., go test -race ./..., HTTP integration tests, Docker build, Docker image metadata, and Docker-backed sandbox verification.

Deployment Note

Public verification at https://sandbox.osprey.opensource.finance is intentionally not part of this PR evidence yet because the domain has not been deployed/configured in Coolify. After deployment, run:

OSPREY_URL=https://sandbox.osprey.opensource.finance \
OSPREY_ADMIN_TOKEN=<admin-token> \
EXPECTED_STATUS=healthy \
EXPECTED_MODE=detection \
EXPECTED_VERSION=<deployed-version> \
TENANT_ID=demo-client \
./scripts/verify-sandbox.sh

Greptile Summary

This PR prepares Osprey for a customer-facing Coolify sandbox deployment by hardening the API, adding Docker runtime hardening, tenant/duplicate-transaction enforcement, admin-token protection, and an extensive assurance gate. The scope is large (42 files, ~4500 net additions) spanning runtime Go changes, a SQLite schema migration, shell scripts, and new documentation.

  • API hardening: Moves TransactionRequest/EvaluationResponse types into handler.go, adds body-size limiting, DisallowUnknownFields, RFC3339 timestamp parsing, duplicate-transaction detection against the new composite PK, and full CRUD for rules/typologies with live engine reload behind AdminMiddleware.
  • Persistence: Upgrades the transactions table to a composite PRIMARY KEY (id, tenant_id) via an in-place SQLite migration and fixes all previously-silenced json.Unmarshal errors throughout the repository layer.
  • DevOps: Dockerfile switches to a non-root runtime user, adds /app/data volume, reproducible build-arg injection, and a working healthcheck; a new GitHub Actions workflow drives the full assurance suite on every PR.

Confidence Score: 3/5

Not safe to merge to a customer-facing sandbox in its current state — the admin mutation endpoints are unprotected when the token env var is absent, and the duplicate-transaction check can return 500 under concurrent load where it should return 409.

The Evaluate handler performs a non-atomic GetTransaction → SaveTransaction check; under concurrent duplicate submissions the DB constraint fires instead, and the handler surfaces a 500 rather than the intended 409. DeleteTypology maps every repository error (including real DB failures) to 404, hiding operational problems from callers. AdminMiddleware has an explicit no-op path when the token is unconfigured, leaving all rule and typology mutation endpoints wide open on any deployment that omits the env var — a realistic risk for a sandbox handed to a customer without a hardening checklist being followed.

internal/api/handler.go (TOCTOU race and DeleteTypology error mapping), internal/api/middleware.go (AdminMiddleware bypass and misleading 401 message), internal/repository/repository.go (migration silent-skip edge case)

Security Review

  • Unauthenticated admin surface when token not configured: AdminMiddleware is a transparent pass-through when OSPREY_ADMIN_TOKEN is empty. Any caller can inject or overwrite CEL rule expressions into the live evaluation engine via POST /rules, PUT /typologies/{id}, etc. This is insecure by default for a customer-facing deployment.
  • CORS wildcard exposes admin token header: Access-Control-Allow-Origin: * combined with X-Osprey-Admin-Token in Access-Control-Allow-Headers means any browser origin can preflight and issue admin requests if the token is known or absent.
  • No rate limiting on admin or evaluate endpoints: Brute-force against the admin token or high-volume transaction flooding has no server-side throttle.

Important Files Changed

Filename Overview
internal/api/handler.go Major rewrite adding duplicate-tx detection, JSON body size limits, inline request/response types, rule/typology CRUD with engine reload. TOCTOU race on duplicate-check and DeleteTypology error-handling bug need fixing.
internal/api/middleware.go Adds AdminMiddleware with constant-time token comparison, trims tenant header, fixes CORS headers. Admin endpoints are fully unprotected when token env var is absent; misleading 401 error message on invalid token.
internal/repository/repository.go Adds SQLite schema migration to composite PK, fixes previously-ignored json.Unmarshal errors. Migration skip logic has an edge case that can silently leave the table without the composite key on unexpected schemas.
internal/api/server.go Route groups AdminMiddleware correctly on mutation endpoints; read endpoints remain public as expected.
Dockerfile Adds non-root user, /app/data volume, healthcheck improvement, and reproducible build-arg injection. Removes git dependency from builder stage. Clean and correct.
cmd/osprey/main.go Adds validated OSPREY_MODE override, proper comma-separated OSPREY_TENANTS parsing, and error-return pattern for env overrides. GlobalTenantID constant is duplicated here and in handler.go.
internal/repository/schema.go Schema updated to composite PRIMARY KEY (id, tenant_id) for proper tenant isolation on the transactions table.
.github/workflows/sandbox-assurance.yml New CI gate running assure-sandbox.sh on PRs and main. Ruby requirement is unconventional for a Go project and may slow CI cold starts.
scripts/assure-sandbox.sh Comprehensive gate covering JSON validation, OpenAPI contract, Go tests, race detector, integration tests, Docker build, and sandbox verification. Ruby dependency for OpenAPI YAML parsing is non-standard.

Sequence Diagram

sequenceDiagram
    participant Client
    participant CORS as CORSMiddleware
    participant Tenant as TenantMiddleware
    participant Admin as AdminMiddleware
    participant Handler
    participant Repo as SQLRepository
    participant Engine as RulesEngine

    Client->>CORS: POST /evaluate (X-Tenant-ID)
    CORS->>Tenant: pass
    Tenant->>Handler: ctx + tenantID

    Handler->>Repo: GetTransaction(tenantID, txID)
    Note over Handler,Repo: TOCTOU: not atomic with SaveTransaction
    Repo-->>Handler: ErrNotFound
    Handler->>Repo: SaveTransaction(tenantID, tx)
    Repo-->>Handler: ok

    Handler->>Engine: EvaluateAll(input)
    Engine-->>Handler: ruleResults
    Handler->>Repo: SaveEvaluation(tenantID, eval)
    Repo-->>Handler: ok
    Handler-->>Client: 200 EvaluateResponse

    Client->>CORS: POST /rules (X-Osprey-Admin-Token)
    CORS->>Tenant: pass
    Tenant->>Admin: AdminMiddleware
    Note over Admin: No-op if token empty (insecure by default)
    Admin->>Handler: CreateRule
    Handler->>Engine: ValidateRule + ReloadRules
    Handler-->>Client: 201 Created
Loading

Fix All in Claude Code

Prompt To Fix All With AI
Fix the following 6 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 6
internal/api/handler.go:174-193
**TOCTOU Race: Concurrent Duplicate Submissions Return 500 Instead of 409**

The `GetTransaction` + `SaveTransaction` check is not atomic. Two concurrent requests carrying the same `txID` can both clear the existence check before either insert executes; the second insert then hits the composite-PK unique constraint and `SaveTransaction` returns a generic DB error, causing the handler to respond with `500 "failed to persist transaction"` rather than `409 "transaction id already exists"`.

The safest fix is to remove the application-level pre-check entirely and let `SaveTransaction` detect the constraint violation, converting it to `repository.ErrDuplicateTransaction` (or similar sentinel) that the handler maps to 409.

### Issue 2 of 6
internal/api/handler.go:932-938
**DeleteTypology Masks All DB Errors as 404**

Any error returned from `repo.DeleteTypology` — including connection failures, timeout errors, or query panics — is currently surfaced to the caller as HTTP 404 `"typology not found"`. A real "not found" path in the repository should return `repository.ErrNotFound`; all other errors should map to 500 so clients and operators get an accurate signal about what went wrong.

### Issue 3 of 6
internal/api/middleware.go:63-81
**Admin Endpoints Fully Unprotected When Token Is Not Configured**

When `OSPREY_ADMIN_TOKEN` is empty (e.g., a misconfigured or rushed sandbox deployment), `AdminMiddleware` silently passes every request through on line 66-68, leaving `POST /rules`, `POST /rules/reload`, `POST /typologies`, `PUT /typologies/{id}`, `DELETE /typologies/{id}`, and `POST /typologies/reload` completely open to any caller. For a customer-facing sandbox this is insecure by default — an unauthenticated actor can inject arbitrary CEL rules into the live evaluation engine. Consider at minimum logging a loud startup warning when the token is empty, or refusing to start in a mode where the admin token is unset.

### Issue 4 of 6
internal/api/middleware.go:72-73
**Misleading Error Message on Invalid Token**

The 401 response body says `"admin token is required"` even when a token was supplied but is wrong. Clients that are providing the header but have a rotated or mistyped token will be confused by this message. Change the error body to `"admin token is invalid or missing"` to accurately describe both scenarios without leaking which condition triggered it.

### Issue 5 of 6
cmd/osprey/main.go:248-249
**`GlobalTenantID` Defined in Two Packages**

`const GlobalTenantID = "*"` appears in both `cmd/osprey/main.go:249` and `internal/api/handler.go:557`. These are independent string constants in separate packages; if one is ever changed (e.g., to an empty string, a UUID, or a different sentinel), rule lookups and data access from the two packages will silently diverge. The constant belongs in `internal/domain` and should be imported by both consumers.

### Issue 6 of 6
internal/repository/repository.go:101-120
**Migration Skips Silently on Unexpected Schema States**

The second early-exit condition, `primaryKey["id"] == 0 || primaryKey["tenant_id"] > 0`, returns `nil` (i.e., skips the migration) when `id` is not a primary-key column in the existing table. This could silently leave the table without the required composite key if the DB was provisioned with a non-standard schema. Additionally, if `PRAGMA table_info(transactions)` returns an empty result set (e.g., schema not yet created due to an edge in migration ordering), the map will contain zero values and the second condition will also trigger a silent skip. A defensive check for an empty `primaryKey` map — failing loudly rather than skipping — would make misconfigured deployments visible at startup rather than at query time.

Reviews (1): Last reviewed commit: "Prepare Osprey sandbox deployment" | Re-trigger Greptile

Greptile also left 6 inline comments on this PR.

@josephgoksu josephgoksu marked this pull request as ready for review May 25, 2026 12:16
Comment thread internal/api/handler.go Outdated
Comment thread internal/api/handler.go Outdated
Comment thread internal/api/middleware.go
Comment thread internal/api/middleware.go Outdated
Comment thread cmd/osprey/main.go Outdated
Comment thread internal/repository/repository.go
@josephgoksu josephgoksu merged commit b4db8ad into main May 25, 2026
1 check passed
@josephgoksu josephgoksu deleted the codex/prepare-osprey-sandbox branch May 25, 2026 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant