arkorlab · soleil-colza · Apr 28, 2026 · Apr 27, 2026
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,90 @@
+# Contributing to Arkor
+
+Thanks for your interest! Arkor is in **alpha**: we're moving fast, breaking things on purpose, and the core idea (TypeScript-native fine-tuning for product engineers) is something we want to design *with* the people who'd use it. Issues, discussion, and PRs are all welcome.
+
+## Ways to help
+
+| Effort                  | What's most useful                                                                  |
+| ----------------------- | ----------------------------------------------------------------------------------- |
+| **5 min**               | Try the [Quickstart](README.md#quickstart) and [open an issue](https://github.com/arkorlab/arkor/issues/new) about anything that confused you, broke, or felt un-TypeScript. |
+| **An afternoon**        | Pick up a [`good first issue`](https://github.com/arkorlab/arkor/labels/good%20first%20issue) or send a small PR (doc fixes, template tweaks, error-message polish). |
+| **Ongoing**             | Hop into [Discord](https://discord.gg/YujCZYGrEZ) and tell us what model + dataset + workflow you wish worked. We use this to prioritize. |
+
+If you have an idea for a non-trivial change (new SDK factory, CLI command, Studio view), please open an issue first so we can align on the API shape before you write code.
+
+## Repo layout
+
+```
+arkor/
+├── packages/
+│   ├── arkor/              # SDK + CLI + bundled local Studio (published to npm)
+│   ├── create-arkor/       # `pnpm create arkor` scaffolder (published to npm)
+│   ├── cli-internal/       # private helpers shared by arkor + create-arkor
+│   └── studio-app/         # Vite + React SPA bundled into `arkor`
+├── e2e/cli/                # vitest-driven E2E suite for the scaffolder & build
+├── assets/                 # README / OG images
+└── turbo.json              # build / test orchestration
+```
+
+`cli-internal`, `studio-app`, and `e2e/cli` are private and never published.
+
+## Development setup
+
+Please use **Node.js 24** (preferably the latest release) and **pnpm 10.21+**.
+
+```bash
+git clone https://github.com/arkorlab/arkor.git
+cd arkor
+pnpm install
+pnpm build         # turbo run build (covers all packages)
+pnpm test          # unit tests across the monorepo
+pnpm typecheck     # tsc across the monorepo
+```
+
+To work on a specific package:
+
+```bash
+pnpm --filter arkor dev          # tsdown --watch on the SDK/CLI
+pnpm --filter @arkor/studio-app dev   # vite dev server for the Studio SPA
+pnpm --filter create-arkor dev   # tsdown --watch on the scaffolder
+```
+
+To run the E2E scaffolder/build suite (slow; spawns real CLIs in temp dirs):
+
+```bash
+pnpm --filter @arkor/e2e-cli test
+# Skip the `<pm> install` step inside fixtures:
+SKIP_E2E_INSTALL=1 pnpm --filter @arkor/e2e-cli test
+```
+
+## Trying your local build
+
+The fastest loop is to scaffold a fresh project pointing at the workspace build:
+
+```bash
+pnpm build
+cd /tmp && node /path/to/arkor/packages/create-arkor/dist/bin.mjs my-arkor-app
+cd my-arkor-app && pnpm dev
+```
+
+Studio runs at `http://127.0.0.1:4000` with a CSRF token injected per launch.
+
+## Pull request guidelines
+
+- **One concern per PR.** Smaller diffs land faster.
+- **Tests where the surface is testable.** SDK / CLI / scaffolder logic should have a vitest case. Studio UI changes can be PR'd with a screenshot or short clip.
+- **Breaking changes are fine** during alpha. We don't ship compatibility shims between `0.0.x` versions; just note them in the PR description so the changelog stays honest.
+- **Don't reintroduce removed verbs.** `arkor train`, `arkor deploy`, `arkor jobs`, and `arkor logs` were removed deliberately. Training and deploying are TS configs that run when the entrypoint executes, not CLI verbs. The CLI surface is `dev` / `build` / `start` plus auth.
+
+## Reporting bugs and security issues
+
+- **Bugs**: [GitHub Issues](https://github.com/arkorlab/arkor/issues/new) with steps to reproduce, expected vs actual, and your Node + pnpm versions.
+- **Security**: please email security@arkor.ai instead of filing a public issue. We'll acknowledge within 48 hours.
+
+## Code of conduct
+
+Be kind, assume good faith, and keep technical disagreement technical. Anything else (harassment, personal attacks, exclusionary behavior) is grounds for being asked to leave. The maintainers' call is final.
+
+## License
+
+By contributing, you agree your contributions are licensed under the [MIT license](LICENSE.md).
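
The "Trying your local build" section above notes that Studio runs at `http://127.0.0.1:4000` with a CSRF token injected per launch, and the README in this same change describes a `Host` header guard against DNS rebinding. As a rough illustration of what such a check can look like, here is a minimal sketch that accepts a request only when its `Host` header names a loopback address on the expected port and its token matches the launch token. The function names, the `ApiRequest` shape, and the list of accepted hostnames are assumptions made for this example; they are not taken from Arkor's source.

```typescript
// Hypothetical helpers for illustration only; not Arkor's actual implementation.

/** Hostnames a loopback-only dev server might accept (an assumption). */
const LOOPBACK_HOSTS = ["127.0.0.1", "localhost", "[::1]"];

/** True when the Host header names a loopback host on the expected port. */
function isLoopbackHost(hostHeader: string | undefined, port: number): boolean {
  if (!hostHeader) return false;
  const host = hostHeader.trim().toLowerCase();
  return LOOPBACK_HOSTS.some((name) => host === `${name}:${port}`);
}

/** Compares two strings without returning early on the first mismatch. */
function tokensMatch(received: string, expected: string): boolean {
  if (received.length !== expected.length) return false;
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= received.charCodeAt(i) ^ expected.charCodeAt(i);
  }
  return diff === 0;
}

/** Minimal view of an incoming request (hypothetical shape). */
interface ApiRequest {
  host?: string;
  csrfToken?: string;
}

/** Returns 200 when the request may proceed, 403 otherwise. */
function authorizeApiRequest(
  req: ApiRequest,
  port: number,
  launchToken: string,
): 200 | 403 {
  // A rebinding attack reaches the server under an attacker-controlled hostname,
  // so rejecting non-loopback Host values blocks it before the token is checked.
  if (!isLoopbackHost(req.host, port)) return 403;
  if (!req.csrfToken || !tokensMatch(req.csrfToken, launchToken)) return 403;
  return 200;
}
```

Checking the `Host` header first means a page served from another origin cannot reach the API through a hostname that merely resolves to `127.0.0.1`; the token check then covers requests that do arrive on the right host but were not issued by the Studio page for this launch.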
diff --git a/README.md b/README.md
@@ -1,78 +1,194 @@
-# Arkor
-
-> Fine-tune and deploy open-weight models with TypeScript.
-
-Arkor is a TypeScript framework for improving and shipping custom open-weight
-models. The audience is product engineers who already build with TypeScript /
-Next.js and want custom model behaviour without standing up an ML
-infrastructure team. Arkor handles GPUs, fine-tuning, and serving underneath
-so the user's job stays "write some TypeScript".
-
-> Status: alpha (`0.0.1-alpha.0`). Public APIs may change without notice.
+<p align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="assets/logo-dark.svg">
+    <img src="assets/logo.svg" alt="Arkor" width="96">
+  </picture>
+</p>
+
+<h1 align="center">Arkor</h1>
+
+<h3 align="center">The TypeScript framework for fine-tuning open-weight LLMs</h3>
+
+<p align="center">
+  Ship custom open-weight models the same way you ship your TypeScript app.
+  Type-safe configs, hot reload, a local Studio (web UI) to start and watch runs, and managed GPUs.
+</p>
+
+<p align="center">
+  <a href="https://www.npmjs.com/package/arkor"><img src="https://img.shields.io/npm/v/arkor?label=arkor&color=000" alt="npm"></a>
+  <a href="LICENSE.md"><img src="https://img.shields.io/badge/license-MIT-000" alt="MIT"></a>
+  <img src="https://img.shields.io/badge/node-%E2%89%A522.6-000" alt="node ≥22.6">
+  <img src="https://img.shields.io/badge/status-alpha-orange" alt="alpha">
+  <a href="https://discord.gg/YujCZYGrEZ"><img src="https://img.shields.io/badge/discord-join-5865F2" alt="Discord"></a>
+</p>
+
+<p align="center">
+  <a href="https://arkor.ai/docs"><strong>Docs</strong></a> &nbsp;·&nbsp;
+  <a href="#quickstart"><strong>Quickstart</strong></a> &nbsp;·&nbsp;
+  <a href="#why-arkor"><strong>Why Arkor</strong></a>
+</p>
+
+> [!WARNING]
+> Arkor is **alpha** (`0.0.1-alpha.0`). APIs change without notice. We're shipping in public, and feedback shapes what lands next.
+
+<!--
+  Demo media goes here once recorded:
+    - assets/demo-cli.gif       Terminalizer: pnpm create arkor → pnpm dev
+    - assets/demo-studio.gif    Screen recording: Run Training → loss curve → Playground
+-->
 
 ## Quickstart
 
 ```bash
-pnpm create arkor my-app
-cd my-app
-pnpm install
-pnpm arkor login       # Auth0 PKCE flow; --anonymous also works
-pnpm arkor dev         # opens the local Studio GUI on http://127.0.0.1:4000
+pnpm create arkor my-arkor-app
+cd my-arkor-app
+pnpm dev
 ```
 
-`arkor dev` is the primary surface — it starts a local Studio with hot
-reload over your TypeScript and a GUI for running training, inspecting jobs,
-and trying out checkpoints in a Playground.
+That's the whole setup. 
+**No signup required:** `arkor dev` opens **Studio**, a local web UI at `http://127.0.0.1:4000`, and silently bootstraps an anonymous workspace so you can fire off a real training run right away. 
 
-CLI-only flow (no GUI):
+Run `arkor login` later if you want to claim your work under an account.
 
-```bash
-pnpm arkor build       # bundles src/arkor/ into .arkor/build/index.mjs
-pnpm arkor start       # runs the build artifact on the cloud
+### Pick a template
+
+The scaffolder asks which template you want. 
+All three start from the same small open-weight base (`unsloth/gemma-4-E4B-it`) so the first run finishes quickly.
+
+| Template  | What it shows                                                       | Dataset                            |
+| --------- | ------------------------------------------------------------------- | ---------------------------------- |
+| `minimal` | The smallest working `createTrainer({ ... })` call.                 | `yahma/alpaca-cleaned` (500 rows)  |
+| `alpaca`  | Instruction-tuning with mid-training `infer()` on every checkpoint. | `yahma/alpaca-cleaned` (1000 rows) |
+| `chatml`  | Multi-turn chat fine-tuning over a real chat dataset.               | `stingning/ultrachat` (500 rows)   |
+
+Skip the prompt with `pnpm create arkor my-arkor-app --template alpaca`.
+
+## Why Arkor
+
+Custom open-weight models are a real option today thanks to years of work in the Python ML ecosystem by the people and companies who built it.
+Arkor stands on that foundation.
+
+What we wanted, and didn't find, was a path that fits how TypeScript and Node developers already work: a workflow where fine-tuning, evaluation, and serving live in the same codebase as the product, with the same editor, types, and review flow. 
+
+Type-safe configs instead of separate config files. Hot reload over your training code. A local Studio for the dev loop.
+
+The phrase we keep coming back to: **ship the model the same way you ship the product.** If that sounds right, you're the audience.
+
+## What works today
+
+- ✅ **Fine-tune an open-weight LLM from one file.** `createTrainer({ model, dataset, lora, ... })` runs LoRA training on the base model you point it at.
+- ✅ **Pull data from Hugging Face, or bring your own URL.** The `dataset` field accepts any Hugging Face dataset name (with optional `split`) or a blob URL to a JSONL file.
+- ✅ **React to training in code, not in a dashboard.** Lifecycle callbacks (`onStarted`, `onLog`, `onCheckpoint`, `onCompleted`, `onFailed`) fire as the run streams from the cloud, fully typed.
+- ✅ **Sanity-check the model before the run finishes.** Inside `onCheckpoint`, call `infer({ messages })` against the model as it's being trained.
+- ✅ **Watch the run in a local Studio.** `arkor dev` opens a UI with a jobs list, live loss chart, log tail, and a Playground for chatting with your fine-tuned models.
+- ✅ **Try it without an account.** Anonymous workspace by default; run `arkor login` (Auth0 PKCE) to claim your work later.
+
+## What's coming next
+
+- ⏳ **Deploy a fine-tuned model as an inference endpoint** with `createDeploy(...)`.
+- ⏳ **Run evaluations on every checkpoint** with `createEval(...)`.
+- ⏳ **Bring your own datasets and base models.** CSV / JSONL uploads and custom Hugging Face base models.
+- ⏳ **Team and multi-org workspaces.**
+- ⏳ **Self-host the training backend.** Today we host it.
+
+## A taste of the API
+
+```ts
+// src/arkor/trainer.ts
+import { createTrainer } from "arkor";
+
+export const trainer = createTrainer({
+  name: "support-bot-v1",
+  model: "unsloth/gemma-4-E4B-it",
+  dataset: { type: "huggingface", name: "yahma/alpaca-cleaned", split: "train[:1000]" },
+  lora: { r: 16, alpha: 16 },
+  maxSteps: 100,
+  callbacks: {
+    onLog: ({ step, loss }) => console.log(`step=${step} loss=${loss}`),
+    onCheckpoint: async ({ step, infer }) => {
+      const res = await infer({ messages: [{ role: "user", content: "Hello!" }] });
+      console.log(`ckpt @ ${step}:`, await res.text());
+    },
+  },
+});
 ```
 
+```ts
+// src/arkor/index.ts  ← discovered by `arkor dev` / `arkor build`
+import { createArkor } from "arkor";
+import { trainer } from "./trainer";
+
+export const arkor = createArkor({ trainer });
+```
+
+`src/arkor/index.ts` is the file the CLI and Studio look for. 
+Your `trainer` lives in a sibling file and is registered through `createArkor`. `deploy` and `eval` will work the same way. 
+
+To add a new one, drop a file and register it; no scaffolder rerun needed.
+
+<!--
+  Studio screenshots go here once captured:
+    - assets/studio-jobs.png        Jobs list
+    - assets/studio-chart.png       Live loss + log tail
+    - assets/studio-playground.png  Playground chat
+-->
+
 ## What's in a project
 
 ```
-my-app/
+my-arkor-app/
 ├── src/arkor/
-│   ├── index.ts        # umbrella — `createArkor({ trainer })`
-│   └── trainer.ts      # `createTrainer({ name, model, dataset, ... })`
-├── arkor.config.ts     # training defaults
-├── .arkor/             # state + build artifact (gitignored)
-└── package.json
+│   ├── index.ts        # createArkor({ trainer })  ← discovered by the CLI / Studio
+│   └── trainer.ts      # createTrainer({ ... })
+├── arkor.config.ts
+├── .arkor/             # state + build artifacts (gitignored)
+└── package.json        # dev / build / start
 ```
 
-The umbrella is what the CLI and Studio discover. Per-role primitives —
-`trainer` today, `deploy` and `eval` later — live in sibling files and get
-gathered on `createArkor`. Adding a new primitive is "drop a file, register
-it on the umbrella": no scaffold change required.
-
 ## CLI
 
-| Command | Purpose |
-|---|---|
-| `arkor init` | Scaffold a new project in the current directory |
-| `arkor login` / `logout` / `whoami` | Auth0 PKCE / anonymous tokens |
-| `arkor dev` | Launch the local Studio (hot reload + GUI) |
-| `arkor build` | Bundle `src/arkor/index.ts` to `.arkor/build/index.mjs` |
-| `arkor start` | Run the build artifact (auto-builds when missing) |
+| Command                              | Purpose                                                                |
+| ------------------------------------ | ---------------------------------------------------------------------- |
+| `arkor init`                         | Scaffold a new project in the current directory                        |
+| `arkor login` / `logout` / `whoami`  | Auth0 PKCE / anonymous tokens                                          |
+| `arkor dev`                          | Launch the local Studio web UI (with hot reload)                       |
+| `arkor build`                        | Bundle `src/arkor/index.ts` to `.arkor/build/index.mjs`                |
+| `arkor start`                        | Run the build artifact (auto-builds when missing)                      |
+
+`pnpm dev` resolves to `arkor dev` in scaffolded projects, so most workflows live behind that one command.
+
+## Architecture
+
+`arkor dev` boots a [Hono](https://hono.dev) server on `127.0.0.1:4000` that hot-reloads your code and serves a Vite + React SPA from the same origin. 
+
+The SPA talks to your code via per-launch CSRF-token-gated `/api/*` routes (loopback-only, with a `Host` header guard against DNS rebinding); your code talks to the Arkor training backend over authenticated HTTPS. 
+
+Training runs on managed GPUs; checkpoints stream back as server-sent events (SSE) that fire your `callbacks.*` in-process.
+
+## Repository
+
+| Package                                        | What it is                                  |
+| ---------------------------------------------- | ------------------------------------------- |
+| [`arkor`](packages/arkor)                      | SDK + CLI + bundled local Studio            |
+| [`create-arkor`](packages/create-arkor)        | `pnpm create arkor` scaffolder              |
+
+Requires Node.js 22.6+. 
+(Please use Node.js 24, preferably the latest version, for contributing to this repository.)
+
+Works with pnpm / npm / yarn / bun.
 
-`pnpm dev` resolves to `arkor dev` in scaffolded projects, so most workflows
-live behind that one command.
+## We're shipping in public
 
-## Packages
+Arkor is alpha, and the core idea (TypeScript-native fine-tuning for product engineers) is something we want to design *with* the people who'd use it. If that's you:
 
-| Package | What it is |
-|---|---|
-| [`arkor`](packages/arkor) | The SDK + CLI + bundled local Studio |
-| [`create-arkor`](packages/create-arkor) | `pnpm create arkor` scaffolder |
+- **[File an issue](https://github.com/arkorlab/arkor/issues/new)** with the model + dataset + workflow you wish worked. We read everything.
+- **Star the repo** if you want updates as we move toward `0.1`.
+- **[Join Discord](https://discord.gg/YujCZYGrEZ)** for live discussion and early-access pings.
 
-## Requirements
+We're especially curious about: which open-weight base models you'd reach for first, what you'd want from `createDeploy` / `createEval`, and what breaks when you try the alpha.
 
-- Node.js 22.6+ (the SDK relies on stable APIs from that line)
-- pnpm / npm / yarn / bun all work for installs
+See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup.
 
 ## License
 
-MIT — see [LICENSE.md](LICENSE.md).
+[MIT](LICENSE.md). 
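
The Architecture section above says checkpoints stream back as SSE events that fire `callbacks.*` in process. The sketch below shows one way such a stream could be parsed and routed to handlers. The event names (`log`, `checkpoint`), the JSON payloads, and the `parseSse` / `dispatch` helpers are hypothetical; Arkor's actual wire format is not shown in this README.

```typescript
// Hypothetical sketch: event names and payload shapes are assumptions.

type Handler = (payload: unknown) => void;

interface SseEvent {
  event: string;
  data: string;
}

/** Parses a complete SSE text stream into events (blank line = event boundary). */
function parseSse(stream: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of stream.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event type when no `event:` field is sent
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // skip comments / keep-alives
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Blocks without any data line are not dispatched.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

/** Routes each parsed event to the handler registered under its name. */
function dispatch(stream: string, handlers: Record<string, Handler>): number {
  let fired = 0;
  for (const { event, data } of parseSse(stream)) {
    const handler = handlers[event];
    if (handler) {
      handler(JSON.parse(data));
      fired++;
    }
  }
  return fired;
}
```

A real client would read the response body incrementally and buffer partial blocks across chunks; this version takes the whole stream as one string to keep the parsing rules easy to follow.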
diff --git a/assets/logo-dark.svg b/assets/logo-dark.svg
diff --git a/assets/logo.svg b/assets/logo.svg