Personal Context Layer (PCL)

Clean context in. Better answers out.

An open framework for giving any AI durable, structured, honest context about you — the layer for what your AI knows, not how it talks. Part of the PersonaSync family.

The problem this exists to solve

Every AI you use is a stranger with amnesia. You re-explain your whole life story every session just to get a useful answer, and none of it carries over to the next one. An AI is only as good as what you feed it.

There's a second, quieter problem: the AI has nothing real to measure your progress against, so when you ask how you're doing, it flatters you instead. "Great work!" is not a baseline, a target, or a delta. Honest feedback needs something to compare against.

PCL separates the two jobs that solve this:

Context — a structured, portable record of what you're working toward, that travels with you so the conversation just starts.
Honest measurement — scoring that compares where you are against the baselines and goals you set, and reports how sure it is instead of faking certainty.

This repo ships the first job (extraction into structured context) as a working tool, and specifies the second (honest scoring) as the next build.

What this is

PCL is two things in one repository:

A framework — a typed schema for the things you're working toward (called axes), an LLM extraction prompt that turns plain-language goals into that schema, an evaluation of how well it does so, a working MCP read tool that serves that context to any AI, and a design spec for the scoring layer.
A web tool — a no-account interface where you paste your goals the way you'd actually say them, and Claude returns them as structured, confidence-aware axes you confirm before anything is saved.

The framework is the moat; the web tool is one surface of it — the extraction layer, the first slice that went live. What's agentic about PCL, and what's still in progress, are covered in the sections that follow.

Built for AI 502 (Generative AI) at Grand Valley State University, summer 2026. The architecture and evaluation methodology are designed to extend beyond the course timeline.

Agentic direction

PCL is built to be the grounding layer for an agentic system. The target is an MCP server that exposes a person's declared context as tools any AI agent can call — so the agent pulls what it needs when a task needs it, and acts on real stated goals instead of guessing from behavior. Pull, not push: nothing auto-acts or nags; the agent fetches context on request.

What's live in this repo is the full grounding path — extraction turns free text into typed, confirmed context, and an MCP server now exposes that context to outside agents through a single read tool, describe_axes. An outside client (Claude Desktop) calls it and pulls the stored axes on demand, no pasting. The server serves declared context only: the scoring engine is still in progress, so an agent can read your context but not yet a score of it. That measurement layer is the next build.

How it works

The diagram shows what's live (extraction, and the MCP read surface that serves the stored context) alongside the scoring layer still in progress.

┌──────────────────────┐     ┌────────────────────────────┐
│  Free text           │  →  │  Extraction layer  (LIVE)  │
│  your goals/habits,  │     │  Claude +                  │
│  in your own words   │     │  extraction_prompt.txt     │
└──────────────────────┘     └────────────────────────────┘
                                           │
                                           ▼
                           ┌────────────────────────────────┐
                           │  Structured axes               │
                           │  typed · baseline · target ·   │
                           │  cadence · confidence flag     │
                           │  tagged as a draft you confirm │
                           └────────────────────────────────┘
                                           │
                      ┌────────────────────┴────────────────────┐
                      ▼                                         ▼
        ┌────────────────────────────┐            ┌────────────────────────────┐
        │  Valuation engine          │            │  MCP context layer         │
        │  honest scoring against    │            │  any AI pulls your context │
        │  your own baselines        │            │  via describe_axes         │
        │  (IN PROGRESS)             │            │  (LIVE: read tool)         │
        └────────────────────────────┘            └────────────────────────────┘

Single source of truth. You state your goals once, in plain language. The axis schema is the contract every later layer reads from — the scoring engine scores axes, the MCP layer serves axes — so the thing you type is the thing that travels.

No invented precision. The extraction never makes up detail it wasn't given. Thin data and vague goals come back flagged, not faked, and every axis is a draft you confirm before it counts.
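
As a rough illustration of the two principles above, here is a minimal sketch of what a single axis could look like as a typed record. The class name, field names, and example type labels are hypothetical, chosen to mirror the diagram (type, baseline, target, cadence, confidence flag, draft status); the repository's real contract lives in schema.sql and domain-primer.md.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Axis:
    """One thing the user is working toward. All names here are hypothetical."""
    name: str
    axis_type: str                    # illustrative label, e.g. "count" or "amount"
    baseline: Optional[float] = None  # None means "not stated", never a guess
    target: Optional[float] = None
    cadence: Optional[str] = None     # e.g. "weekly"
    confirmed: bool = False           # every axis starts life as a draft

    @property
    def flagged(self) -> bool:
        # Flag the axis whenever a detail the user did not state is missing.
        return self.baseline is None or self.target is None or self.cadence is None


clear = Axis("running", "count", baseline=1, target=3, cadence="weekly")
vague = Axis("save more money", "amount")

print(clear.flagged)  # False
print(vague.flagged)  # True
print(vague.target)   # None
```

The point of the shape is that a missing value stays missing: the vague goal comes back flagged with a null target rather than a plausible-looking number, and nothing counts until `confirmed` is set by the user.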

Quick start

Open the live tool: the Hugging Face Space
Paste your goals and habits into the box, however you'd actually say them (or click one of the examples)
Click Structure my context
Read the structured axes that come back — typed, with baselines/targets/cadences where you stated them, and a confidence flag where the model is unsure

No account, no login — the hosted demo stores nothing.

To run it locally:

git clone https://github.com/artiebowman/personasync-pcl.git
cd personasync-pcl
pip install -r requirements.txt
export ANTHROPIC_API_KEY=sk-ant-...   # your Anthropic API key
python app.py

Run locally and the Save button is live (hidden on the hosted demo, which has nowhere to write): confirm the extracted axes and they're written to a local SQLite store at local/pcl.db.

Connect it to Claude Desktop — the full loop

This is the agentic step: an outside AI calling your context. The server needs Python 3.12+ (the mcp SDK itself requires at least 3.10), so it runs in its own environment, separate from the Gradio app:

uv venv .venv-mcp --python 3.12
uv pip install --python .venv-mcp -r requirements-mcp.txt

Register it with Claude Desktop: edit ~/Library/Application Support/Claude/claude_desktop_config.json and add a pcl entry under mcpServers, using absolute paths for .venv-mcp/bin/python, for mcp_server.py, and for the PCL_DB_PATH environment variable (pointing at local/pcl.db). Fully quit Claude Desktop (Cmd+Q) and reopen it, then ask: "Use the pcl tool to describe my axes." Claude calls describe_axes, pulls your saved axes, and answers from them. The server exposes one read tool and declared context only; scoring is still in progress (see the Build log).
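
For reference, the registration step can be sketched as a small script that prints the entry to add. It assumes the usual Claude Desktop layout of command, args, and env under mcpServers; the helper name pcl_server_entry and the path /Users/you/personasync-pcl are placeholders, not values from this repo.

```python
import json
from pathlib import PurePosixPath


def pcl_server_entry(repo_dir: str) -> dict:
    """Build an mcpServers entry for a checkout at repo_dir (hypothetical helper)."""
    repo = PurePosixPath(repo_dir)
    if not repo.is_absolute():
        raise ValueError("repo_dir must be an absolute path")
    return {
        "mcpServers": {
            "pcl": {
                # Interpreter from the dedicated MCP environment
                "command": str(repo / ".venv-mcp" / "bin" / "python"),
                "args": [str(repo / "mcp_server.py")],
                # Absolute path to the local SQLite store
                "env": {"PCL_DB_PATH": str(repo / "local" / "pcl.db")},
            }
        }
    }


# Placeholder path: substitute the absolute path of your own checkout.
config = pcl_server_entry("/Users/you/personasync-pcl")
print(json.dumps(config, indent=2))
```

If the config file already lists other servers, merge the pcl entry into the existing mcpServers object rather than overwriting the file.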

Image: The pcl server registered as a connector in Claude Desktop.

Repository structure

personasync-pcl/
├── app.py                    ← the live extraction demo (Gradio) + local Save
├── extract_core.py           ← shared extraction call (app + eval)
├── save_core.py              ← shared save transforms (CLI + UI)
├── save_axes.py              ← CLI: confirm + save extracted axes locally
├── run_eval.py               ← runs the extraction eval for real
├── mcp_server.py             ← MCP server: the describe_axes read tool (stdio)
├── schema.sql                ← SQLite schema for the local context store
├── db.py                     ← storage layer over the local store
├── extraction_prompt.txt     ← the system prompt that drives extraction
├── sample-extraction.json    ← example extraction output (save-path fixture)
├── requirements.txt          ← pinned dependencies (Gradio 4.44.1 + Anthropic SDK)
├── requirements-mcp.txt      ← MCP server deps (Python 3.12+)
├── test_*.py                 ← storage + MCP stdio round-trip tests
├── prd.md                    ← product requirements + honesty constraints
├── domain-primer.md          ← the axis model and why it's shaped this way
├── architecture.md           ← extraction, scoring, and MCP-layer design
├── evaluation.md             ← eval protocol + extraction test cases
├── extraction-eval.md        ← live extraction-layer eval (qualitative)
├── future-work.md            ← scoped roadmap beyond the course
├── feedback-log.md           ← running log of design decisions + feedback
├── claude.md                 ← project guide for AI collaborators
├── source/                   ← early ideation notes
│   ├── personasync-idea.md
│   └── project-arc-outline.md
├── LICENSE                   ← MIT
└── README.md                 ← this file

Build log

A short narrative of the design decisions behind this version.

The demonstrable slice: the extraction layer

PCL's full arc is extraction → honest scoring → a context layer any AI can query. The scoring engine is the conceptual core, but it's also the largest build, and a half-working scorer demonstrates nothing. The extraction layer, by contrast, is both genuinely useful on its own and the input contract everything downstream depends on — so it was chosen as the slice to ship and evaluate first. Getting the schema and the honesty behavior right here de-risks every later layer.

The agentic layer: a context server any AI can call

The formative draft was one prompt and one response: structured extraction, nothing more. The final submission's job was to make that context callable: a tool an AI actually invokes, not a roadmap promise. So this stage built the layer the whole framework points at.

Three pieces. A local SQLite store holds confirmed axes (the §1.2 schema). A confirm-before-save step — a CLI and a Save button in the local app — turns extraction's draft into stored, user-authored context, filling required blanks and resolving anything left undetermined rather than guessing. And an MCP server exposes that store over stdio through a single read tool, describe_axes, behind an omit gate applied before anything leaves.

The pattern is pull, not push: an outside client (Claude Desktop) calls describe_axes and pulls the declared context on demand — the agent lives in the consumer, not in this app. A cross-client query from Claude Desktop, returning the stored axes with no pasting, is the working demonstration.

Image: Claude Desktop calls describe_axes and answers from the saved axes — no pasting.

Honest scope holds here too: the server exposes one read tool, declared context only. The scoring engine is still in progress, so it pulls your context — it doesn't yet measure it. get_scores and get_profile are deliberately not exposed, since neither can be served honestly without the scorer. Privacy stays a design stance: the store is local and gitignored, the omit gate is structurally present, but user-facing privacy controls aren't fully wired.
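
The read path described in this section can be sketched with the standard library alone. This is an assumption-heavy illustration, not the code in mcp_server.py: the table layout, the omit column, and the open_store helper are hypothetical, and the sketch leaves out the MCP stdio wiring entirely so that only the read and the omit gate are shown.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the local store (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE axes (name TEXT, baseline REAL, target REAL, "
        "cadence TEXT, omit INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def describe_axes(conn: sqlite3.Connection) -> list[dict]:
    """Read-only: return stored axes, filtering omitted rows before anything leaves."""
    rows = conn.execute(
        "SELECT name, baseline, target, cadence FROM axes "
        "WHERE omit = 0 ORDER BY name"
    )
    return [dict(row) for row in rows]


conn = open_store()
conn.executemany(
    "INSERT INTO axes (name, baseline, target, cadence, omit) VALUES (?, ?, ?, ?, ?)",
    [
        ("running", 1, 3, "weekly", 0),
        ("savings", None, None, None, 0),    # thin data stays null, not invented
        ("private goal", 0, 1, "daily", 1),  # marked omit: never served
    ],
)

axes = describe_axes(conn)
print([a["name"] for a in axes])  # ['running', 'savings']
```

Applying the gate inside the query, rather than in the caller, means an omitted axis never reaches the client at all, which is the property the honest-scope paragraph above relies on.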

Prompt engineering: v1 → v2

The extraction prompt was iterated against a small set of hand-written cases covering everyday habits, deliberately thin data, and multi-year plans. The v1 prompt structured clear goals well, but two behaviors needed correcting:

It occasionally manufactured a baseline or target the user never stated, rather than leaving it unknown. v2 made the "flag, don't fake" rule explicit and tied confidence to whether the value was stated, inferred, or missing.
It mislabeled some maintenance goals as improvement goals — e.g. "keep my coffee to 1–2 cups" read as a reduction target rather than a band to hold. This is logged as a known limitation, not yet fully fixed: mode-from-intent on band-style goals is the first item in the next build.

Honest by construction

The hardest constraint in this project was resisting the urge to look more finished than it is. Three rules held throughout:

The confidence shown in the demo is illustrative — for testing and demonstration, not the output of a real scoring engine.
The scoring engine is in progress, not built. The repo says so everywhere, including in the live tool's roadmap.
Privacy is a design stance (local-first, omit-by-design), not yet enforced by code — so it's described as intent, never claimed as a shipped guarantee.

Deployment: three failures worth keeping

Getting the tool live on Hugging Face Spaces surfaced three real, instructive breakages. The first two were fixed by pinning the environment back to what the app was verified against; the third was fixed in how the key is read:

audioop removed in Python 3.13. The Space defaulted to 3.13; Gradio's pydub dependency imports the audioop stdlib module, which 3.13 dropped. Fixed by pinning python_version: "3.11".
Starlette/FastAPI too new for Gradio 4.44.1. Newer Starlette changed the template-response signature, so Gradio passed arguments in the wrong order and every page load threw unhashable type: 'dict'. Fixed by pinning fastapi==0.112.2 and starlette==0.38.6.
An invisible character in the API key. A U+2028 line-separator hitched a ride when the key was pasted into the Space secret, breaking the ASCII-only HTTP header (UnicodeEncodeError). Fixed by stripping the key on read, and by surfacing real error messages in the UI instead of a blank "Error".
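
The third fix, stripping the key on read, might look like the sketch below. The function name clean_api_key is hypothetical, and the explicit non-ASCII check is an assumption added for illustration: the write-up above only says the key is stripped and that real error messages are surfaced.

```python
def clean_api_key(raw: str) -> str:
    """Strip surrounding whitespace and reject keys an ASCII-only header cannot carry."""
    # str.strip() treats U+2028 as whitespace, so a pasted line separator
    # at either end of the key is removed here.
    key = raw.strip()
    if not key:
        raise ValueError("API key is empty after stripping whitespace")
    if not key.isascii():
        bad = sorted({f"U+{ord(ch):04X}" for ch in key if not ch.isascii()})
        raise ValueError("API key contains non-ASCII characters: " + ", ".join(bad))
    return key


# A key pasted with a trailing U+2028 line separator comes back clean.
pasted = "sk-ant-example\u2028"
print(clean_api_key(pasted))  # sk-ant-example
```

In the app the raw value would come from the ANTHROPIC_API_KEY environment variable. Raising a ValueError with the offending code point gives the UI something specific to show instead of a blank error.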

Iterating on the draft's feedback

Review of the draft was direct: deployment and documentation held up, but nothing agentic ran — one prompt and one response, with the MCP server and scoring engine still on paper. The response wasn't a token tool bolted onto the app; it was to build the MCP server itself — the load-bearing piece the rest of the framework, and the product it's headed toward, depend on.

The eval moved the same way — from described to run. run_eval.py now makes a real extraction call per documented case and checks the behaviors structurally. Running it for real earned its keep immediately: the thin-data money goal turned out to be handled two honest ways across runs — left out entirely, or included with a null, flagged target — but never with an invented number. The eval was rewritten to test the behavior the project actually promises (no fabricated figure) rather than one brittle outcome, and extraction-eval.md now owns both variants.
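
A structural check of that kind could be as small as the sketch below. It is not the code in run_eval.py: the function name and the dictionary keys (name, target, flagged) are hypothetical stand-ins for whatever shape the extraction output actually has.

```python
def no_fabricated_target(axes: list[dict], axis_name: str) -> bool:
    """Pass if the named axis is absent, or present with a null, flagged target."""
    matches = [a for a in axes if a.get("name") == axis_name]
    if not matches:
        return True  # honest variant 1: the goal was left out entirely
    # honest variant 2: included, but the target is null and the axis is flagged
    return all(a.get("target") is None and a.get("flagged") is True for a in matches)


omitted = [{"name": "running", "target": 3, "flagged": False}]
nulled = omitted + [{"name": "savings", "target": None, "flagged": True}]
invented = omitted + [{"name": "savings", "target": 5000, "flagged": False}]

print(no_fabricated_target(omitted, "savings"))   # True
print(no_fabricated_target(nulled, "savings"))    # True
print(no_fabricated_target(invented, "savings"))  # False
```

Both honest variants pass and only the invented figure fails, so the check tests the promised behavior without pinning the model to a single output.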

Deferred for time

Documented as future work, not cut from scope:

The valuation engine — the statistically honest scorer the whole framework points at.
The evaluation harness — the planted-signals scorer-validation harness specified in evaluation.md, plus the synthetic-data generator it runs against (specified, not yet built).
All axis types + maintenance mode — including the mode-from-intent fix for band-style goals.
The rest of the context layer — describe_axes ships now (see Build log); further tools, and anything that serves a score, wait on the valuation engine.
Light templates — optional pre-filled starting points to reduce the blank-page cost.

Evaluation methodology

Two layers, each evaluated at the stage it's built.

Live extraction layer — measured. "Good" is defined and tested in extraction-eval.md: a four-criterion qualitative review (correct typing, stated-only values, flag-don't-fake, correct mode) run against the tool's built-in example inputs and hand-scored pass/partial/fail. It records what works and the one known weakness — maintenance-vs-improvement mode on band-style goals.

Scoring engine — designed, not yet run. evaluation.md specifies the planted-signals harness that will validate the in-progress scorer: synthetic data with known properties paired to pre-asserted outputs, across a per-axis-type coverage matrix. The harness and its data generator are build work ahead; evaluation.md is the contract they'll be built against.
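
To make the planted-signals idea concrete, here is a toy sketch of the pattern: generate data with a known property, state the expected output in advance, then check that a scorer recovers it. Everything in it is an assumption for illustration. The least-squares slope is a stand-in, not the valuation engine, and the case list is not the coverage matrix from evaluation.md.

```python
import random


def planted_series(slope: float, n: int, noise: float, seed: int) -> list[float]:
    """Synthetic data with a known (planted) linear trend plus Gaussian noise."""
    rng = random.Random(seed)
    return [slope * t + rng.gauss(0.0, noise) for t in range(n)]


def estimate_slope(ys: list[float]) -> float:
    """Stand-in scorer: ordinary least-squares slope against t = 0..n-1."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


# Each case pairs a planted property with a tolerance asserted in advance.
cases = [
    {"slope": 0.5, "noise": 0.0, "tolerance": 1e-9},
    {"slope": 0.0, "noise": 0.0, "tolerance": 1e-9},
    {"slope": -0.2, "noise": 0.1, "tolerance": 0.05},
]

results = []
for i, case in enumerate(cases):
    ys = planted_series(case["slope"], n=60, noise=case["noise"], seed=i)
    results.append(abs(estimate_slope(ys) - case["slope"]) <= case["tolerance"])

print(results)  # [True, True, True]
```

Because the expected answer is fixed before the scorer runs, a failure points at the scorer rather than at the data, which is what lets the harness act as a contract.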

Sibling project: PersonaSync

PCL has a sibling, PersonaSync (repo), built for the same course.

PersonaSync personalizes how the AI talks to you — voice, style, anti-patterns.
PCL personalizes what the AI knows about you — goals, baselines, progress.

They're designed to compose: PersonaSync's prompt-assembly contract reserves an optional context block, and PCL's MCP layer is what would fill it at runtime. One handles voice, one handles context, under one roof.

Looking ahead — Project 3. PersonaSync (Project 1) and PCL (Project 2) are two halves of one idea. Project 3 ships them as a single product: a web app plus Chrome extension that carries both your voice and your context into whatever AI you're using — no copy-paste, no re-explaining. P1 gave your AI a voice; P2 gives it context; P3 is both, everywhere you work.

License & attribution

MIT License. See LICENSE for full text.

Built by Artie Bowman for AI 502 (Generative AI) at Grand Valley State University, summer 2026. Instructor: Zach DeBruine.

Forks and issues welcome — the framework is built to be extended.

Roadmap

Live now: Free text → structured, confidence-aware context · an MCP read tool (describe_axes) any AI can call, demoed cross-client from Claude Desktop
In progress: The valuation engine — honest scoring against your own baselines
Future state: The rest of the context layer (more tools, score-serving) · light templates · joining PersonaSync's voice layer
Research: testing whether declared, structured goals beat inferred ones as a context signal — explored as a separate track (see future-work.md).
