Turn web pages, chats, code reviews, and HTML artifacts into low-token workbooks that agents can read by range and humans can archive forever.
把网页、聊天记录、代码上下文和 HTML Artifact 变成低 Token、可切片、可归档的 AI 上下文工作簿。
Interactive Demo · Launch Kit · 中文使用手册 · GitHub interactivity note
Star this project if you care about agent context engineering: not bigger prompts, but better context containers.
如果你关心 AI Agent 的上下文工程,这个项目值得收藏:重点不是写更长的提示词,而是把信息做成可索引、可切片、可迁移的上下文容器。
- Lower token cost: deterministic fixtures show
988avg tokens for CWB small-read vs1779for full Markdown and2323for raw HTML source. - Less information loss: required facts are preserved through manifest + CSV range reads.
- Human-friendly archive: every bundle keeps
workbook.xlsx,artifact.html, notes, CSV sheets, and assets. - Agent-friendly protocol: skills tell Claude Code, Cursor, Codex, and other agents to read the manifest first, then only relevant ranges.
- Open benchmark: every chart and report in this repo can be regenerated locally.
Pomelo Context Workbook(cwb)是一个面向 AI Agent 和人的 低 Token 上下文工作簿协议。它把网页、聊天记录、Markdown 文档、表格、图片、代码上下文和可交互 HTML artifact 转成一个 .cwb bundle,让 AI 先读索引,再按 sheet/range 精准读取信息;人类则可以直接打开 Excel、HTML 或 Markdown 版本长期归档。
它的目标不是替代 HTML 或 Markdown,而是补上二者在 Agent 场景里的缺口:省 token、少丢信息、可索引、可切片、可迁移、可视化。第一轮确定性实验显示,在 4 个场景、每个场景 5 个必要事实的样例中,cwb-small 以 988 平均 token 达到 100% 事实覆盖准确率,相比完整 Markdown 少 44.5% token,相比原始 HTML source 少 57.5% token。
Pomelo Context Workbook (cwb) is a low-token context workbook protocol for AI agents and humans. It turns long web pages, chat logs, Markdown notes, CSV/XLSX-like tables, images, code folders, and interactive HTML artifacts into a .cwb bundle:
manifest.jsonfor agent-first indexing and token budgeting.workbook.xlsxfor human reading and durable sharing.sheets/*.csvfor cheap range reads.notes/*.mdfor summaries and prompt recipes.assets/for optional HTML snapshots and source artifacts.
The point is not to replace HTML or Markdown. HTML is great for rich reading and interaction; Markdown is excellent for linear writing. Pomelo adds indexing, slicing, migration, and agent-friendly selective reads so a model can continue from a large artifact without reading the whole artifact.
English: AI context engineering, token optimization, agent memory, context workbook, HTML artifacts, Markdown alternative, XLSX archive, Claude Code skill, Cursor rules, web archive, chat migration, code context management.
中文: AI 上下文工程、Token 节省、Agent 记忆、上下文工作簿、HTML Artifact、Markdown 替代方案、Excel 归档、Claude Code Skill、Cursor 规则、网页归档、聊天记录迁移、代码上下文管理。
The idea is inspired by Thariq Shihipar's HTML artifact examples. See ACKNOWLEDGEMENTS.md for the full source attribution and references.
For a longer bilingual introduction, see docs/introduction.zh-en.md.
git clone https://github.com/guanxiaol/pomelo-context.git
cd pomelo-context
npm test
npm run cwb -- pack examples/web-archive.md --recipe web-or-chat-archive --out tmp/demo.cwb
npm run cwb -- inspect tmp/demo.cwb --index --budget small
npm run cwb -- read tmp/demo.cwb --sheet Sections --range A1:E8What you get:
- A
.cwbbundle withmanifest.json,workbook.xlsx, CSV sheets, notes, and assets. - A low-token index that tells agents what to read next.
- A human-readable workbook for long-term sharing and review.
npm run cwb -- pack examples/web-archive.md --recipe web-or-chat-archive --out tmp/demo.cwb
npm run cwb -- inspect tmp/demo.cwb --index --budget small
npm run cwb -- read tmp/demo.cwb --sheet Sections --range A1:E8
npm run cwb -- validate tmp/demo.cwb
npm run cwb -- benchmark tmp/demo.cwb
npm run cwb -- convert tmp/demo.cwb --to html --out tmp/demo.html
npm run experimentThis repository intentionally has no runtime dependencies. Node 22.18+ can run the TypeScript entrypoint directly.
Tiny inputs may benchmark worse than raw text because the bundle has a fixed index cost. The protocol is designed for long pages, chats, PRs, and code contexts:
npm run cwb -- benchmark . --recipe module-mapThat command estimates reading this repository directly versus reading a workbook index and selected ranges.
It uses prompts, but it is not only prompt engineering.
Pomelo is a file protocol + CLI + skill workflow:
| Layer | Role |
|---|---|
.cwb protocol |
Stores context as manifest, workbook, CSV ranges, notes, and assets. |
| CLI | Packs, inspects, reads, validates, converts, and benchmarks bundles. |
| Skills / rules | Teach agents to read selectively instead of loading everything. |
| Prompt templates | Help agents choose the right sheet/range for a task. |
中文理解:提示词只是调用方式之一,真正核心是 .cwb 文件协议和“先索引、后按需读取”的 Skill 工作流。
| Use case | What Pomelo preserves | Why it helps agents |
|---|---|---|
| Web / X archive | text, images, tables, code, links, source URL | Avoids rereading noisy web pages. |
| AI chat migration | decisions, tasks, summaries, original snippets | Lets a new agent resume a long thread. |
| Code context | modules, key files, risks, call paths | Keeps coding context cheap and navigable. |
| PR review | severity, annotated diff, affected files, test gaps | Turns review evidence into reusable context. |
| Design systems | tokens, components, variants, states | Makes visual/design context queryable. |
| Research reports | claims, sources, metrics, open questions | Separates durable facts from prose. |
Want to introduce Pomelo to others? Use the bilingual launch copy in docs/launch-kit.zh-en.md.
README files are best for static assets: SVG, PNG, badges, tables, Mermaid, and collapsible details. Real JavaScript interaction belongs on GitHub Pages.
Pomelo includes a GitHub Pages demo in site/index.html:
https://guanxiaol.github.io/pomelo-context/
See docs/github-interactivity.zh.md for the README vs Pages split.
Pomelo needs real-world fixtures more than abstract ideas. The most valuable contributions are long webpages, chat exports, PR reviews, code contexts, and design/research artifacts that can become reproducible benchmarks.
- Contribution guide:
CONTRIBUTING.md - Use-case issues:
.github/ISSUE_TEMPLATE/use_case.yml - Benchmark fixtures:
.github/ISSUE_TEMPLATE/benchmark_fixture.yml - Maintainer growth notes:
docs/growth-playbook.zh.md
compare-optionsimplementation-planannotated-pr-reviewmodule-mapdesign-system-referenceprototype-snapshotresearch-explainerstatus-reportincident-reportcustom-editor-exportweb-or-chat-archive
The .xlsx file is not magic compression. The savings come from the protocol:
- Agents read
manifest.jsonfirst. - They inspect sheet names, row counts, summaries, and suggested ranges.
- They pull only the relevant CSV ranges.
- They use
workbook.xlsxorartifact.htmlfor human review, not as the default model input.
See skills/context-workbook/SKILL.md for a cross-agent reading workflow. The skill tells agents to inspect the manifest first, then read only sheet ranges that match the task.
Context Workbook extracts webpage text, photos/images, tables, code blocks, and links into separate sheets. See docs/compatibility.md.
The file examples/html-effectiveness-catalog.json maps the 20 HTML Effectiveness demos to Context Workbook recipes and expected sheets. See docs/html-effectiveness-mapping.md.
Run:
npm run experimentCurrent deterministic fixture results:
| Method | Avg Tokens | Fact Accuracy | Structure Score | Accuracy per 1K Tokens |
|---|---|---|---|---|
| Markdown full read | 1779 | 100.0% | 50.0 | 0.562 |
| HTML source read | 2323 | 100.0% | 90.0 | 0.430 |
| CWB small read | 988 | 100.0% | 100.0 | 1.013 |
Outputs are written to experiments/results/, including experiment-report.html, experiment-report.md, experiment-summary.csv, and experiment-results.json.
See docs/experiment-methodology.zh.md for the metric definitions and limitations.
See docs/context-workbook.zh.md for a Chinese article draft introducing the idea.
See docs/usage-manual.zh.md for Claude Code and Cursor deployment instructions.
See docs/roadmap.md.
This project is inspired by Thariq Shihipar's Using Claude Code: The Unreasonable Effectiveness of HTML, its companion examples page, and the html-effectiveness repository. Context Workbook is independent and not affiliated with those projects or organizations.

