16 changes: 10 additions & 6 deletions AGENT_GUIDE.md
@@ -149,16 +149,18 @@ Find documents by metadata criteria without a text search query.

| Parameter | Required | Description |
|-----------|----------|-------------|
| `metadata_filter` | Yes | JSON key-value pairs (AND semantics). Example: `{"type": "decision-log"}` |
| `project_name` | No | Restrict to a project. |
| `metadata_filter` | No† | JSON key-value pairs (AND semantics). Example: `{"type": "decision-log"}` |
| `project_name` | No | Restrict to a project. Sufficient on its own to **list that project's documents**. |
| `include_content` | No | Include full text (default false). |
| `limit` | No | Max results (default 10). |
| `updated_since` | No | ISO-8601 timestamp. Only docs updated on/after. |
| `created_since` | No | ISO-8601 timestamp. Only docs created on/after. |
| `updated_since` | No | ISO-8601 timestamp. Only docs updated on/after. |
| `created_since` | No | ISO-8601 timestamp. Only docs created on/after. |
| `max_bytes` | No | Response size budget when include_content is true. |
| `requestor` | No | Your agent name. |

Use for browsing by category, catching up on recent changes (`updated_since`), or finding all documents of a specific type.
† **At least one** of `metadata_filter`, `project_name`, `updated_since`, or `created_since` must be supplied (so this never becomes an unbounded whole-KB dump). An empty `metadata_filter` plus `project_name` lists that project's documents.

Use for browsing by category, catching up on recent changes (`updated_since`), listing all documents in a project (`project_name` alone), or finding all documents of a specific type. Results are ordered newest-updated first.
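
The "at least one criterion" rule in the footnote above can be sketched as a small guard. This is a minimal illustration, not the tool's actual validation code: `MetadataSearchArgs` and `hasSearchCriteria` are hypothetical names, and only the parameter names are taken from the table.

```typescript
// Hypothetical argument shape; field names follow the parameter table above.
interface MetadataSearchArgs {
  metadata_filter?: Record<string, unknown>;
  project_name?: string;
  updated_since?: string;
  created_since?: string;
}

// Returns true when the call is bounded by at least one criterion.
// An empty `{}` filter does not count on its own (assumption drawn from the
// footnote: an empty filter is only valid together with another scope).
function hasSearchCriteria(args: MetadataSearchArgs): boolean {
  const hasFilter =
    args.metadata_filter !== undefined &&
    Object.keys(args.metadata_filter).length > 0;
  return (
    hasFilter ||
    Boolean(args.project_name) ||
    Boolean(args.updated_since) ||
    Boolean(args.created_since)
  );
}

console.log(hasSearchCriteria({ metadata_filter: {} })); // false
console.log(hasSearchCriteria({ metadata_filter: {}, project_name: "Cerefox" })); // true
```

A caller could run a check like this before invoking the tool, so that an unbounded request fails fast on the client side instead of being rejected by the server.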

---

@@ -240,14 +242,15 @@ These two tools have **different contracts**. Picking the wrong one is the most
| Top-N ranked hits are enough to answer | You need a complete, exhaustive set (e.g. an inventory or a catch-up) |

- **`cerefox_search` is relevance-ranked top-N.** It returns the best `match_count` matches (**default 5** — raise it via `match_count`). It is **not** an enumeration tool: if more docs match than `match_count`, the rest sit silently below the cutoff — and the one you most want (e.g. the *newest*) may be exactly the one dropped.
- **`cerefox_metadata_search` is exhaustive enumeration by criteria.** No text query. Filters by `metadata_filter` (plus `project_name`, `updated_since` / `created_since`). It returns **metadata only by default** (`include_content=false`) — ids + titles + tags, which is cheap — so raise `limit` (**default 10**) freely to get the whole set. Discover available keys with `cerefox_list_metadata_keys`.
- **`cerefox_metadata_search` is exhaustive enumeration by criteria.** No text query. Filters by `metadata_filter`, `project_name`, `updated_since` / `created_since` — supply **at least one** (an empty `metadata_filter` plus `project_name` lists that project's documents). It returns **metadata only by default** (`include_content=false`) — ids + titles + tags, which is cheap — so raise `limit` (**default 10**) freely to get the whole set. Discover available keys with `cerefox_list_metadata_keys`.
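
The difference between the two contracts can be seen in a toy in-memory model. This is only a sketch: `Doc`, `topN`, and `enumerateByType` are hypothetical names, the scores are made up, and the real tools rank and filter server-side (and `cerefox_metadata_search` still applies its `limit`).

```typescript
// Toy in-memory model; none of these names are Cerefox APIs.
interface Doc {
  id: string;
  type: string;
  score: number; // made-up relevance score for some query
  updatedAt: string; // ISO-8601 timestamp
}

// Relevance-ranked top-N: sort by score, keep the best `matchCount`.
function topN(docs: Doc[], matchCount: number): Doc[] {
  return [...docs].sort((a, b) => b.score - a.score).slice(0, matchCount);
}

// Enumeration by criteria: keep every match, newest-updated first.
function enumerateByType(docs: Doc[], type: string): Doc[] {
  return docs
    .filter((d) => d.type === type)
    .sort((a, b) =>
      a.updatedAt < b.updatedAt ? 1 : a.updatedAt > b.updatedAt ? -1 : 0,
    );
}

const docs: Doc[] = [
  { id: "log-1", type: "decision-log", score: 0.9, updatedAt: "2026-01-10T00:00:00Z" },
  { id: "log-2", type: "decision-log", score: 0.8, updatedAt: "2026-02-10T00:00:00Z" },
  { id: "log-3", type: "decision-log", score: 0.2, updatedAt: "2026-03-10T00:00:00Z" },
];

const ranked = topN(docs, 2).map((d) => d.id);
const all = enumerateByType(docs, "decision-log").map((d) => d.id);
console.log(ranked); // log-3, the newest doc, falls below the cutoff
console.log(all); // all three ids, newest first
```

In this model the newest document has the lowest relevance score, so the top-2 cut drops it silently, while enumeration returns it first.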

### Examples

- *"Find our OAuth design notes"* (relevance) → `cerefox_search(query="OAuth design", match_count=5)`
- *"List every decision-log doc"* (enumeration) → `cerefox_metadata_search(metadata_filter={"type":"decision-log"}, limit=50, include_content=false)`
- *"What changed since I last looked?"* → `cerefox_metadata_search(metadata_filter={"type":"decision-log"}, updated_since="2026-05-01T00:00:00Z")`
- *"Just the ids of all active research docs"* → `cerefox_metadata_search(metadata_filter={"type":"research","status":"active"}, limit=100)`
- *"List everything in the Cerefox project"* → `cerefox_metadata_search(project_name="Cerefox", limit=100)` (no `metadata_filter` needed)

### Pattern: finding the newest item in a growing series

@@ -420,6 +423,7 @@ The legacy Python `uv run cerefox` is a frozen husk as of v0.9 — only `uv run
| `cerefox_get_document(document_id, version_id, requestor)` | `cerefox document get <document-id> --version-id <vid> --requestor <name>` |
| `cerefox_list_versions(document_id, requestor)` | `cerefox document version list <document-id> --requestor <name>` |
| `cerefox_list_projects(requestor)` | `cerefox project list --requestor <name>` |
| `cerefox_set_document_projects(document_id, project_names, author)` | `cerefox document set-projects <document-id> <name...> --author <a> --author-type user\|agent` (or `--clear` to remove all) |
| `cerefox_list_metadata_keys()` | `cerefox metadata keys` |
| `cerefox_metadata_search(metadata_filter, project_name, updated_since, created_since, limit, include_content, requestor)` | `cerefox metadata search --metadata-filter '<json>' --project-name <n> --updated-since <iso> --created-since <iso> --limit N --include-content --requestor <name>` |
| `cerefox_get_audit_log(document_id, author, operation, since, until, limit, requestor)` | `cerefox audit list --document-id <id> --author <a> --operation <op> --since <iso> --until <iso> --limit N --json --requestor <name>` |
6 changes: 3 additions & 3 deletions AGENT_QUICK_REFERENCE.md
@@ -10,7 +10,7 @@ Cerefox is a persistent, shared knowledge base. You have **10 MCP tools** (9 of
| `cerefox_ingest` | Save or update a document | `title`, `content` (required), `document_id` (update by ID), `update_if_exists`, `project_name` (single, non-destructive add on update), `project_names` (list, destructive replace on update), `metadata`, `author` |
| `cerefox_get_document` | Get full document by ID | `document_id` (required) |
| `cerefox_list_versions` | Version history of a document | `document_id` (required) |
| `cerefox_metadata_search` | Find docs by metadata (no text query) | `metadata_filter` (required), `include_content`, `updated_since` |
| `cerefox_metadata_search` | Find or list docs by metadata, project, or time (no text query) | `metadata_filter`, `project_name` (list a project's docs), `updated_since`, `created_since`, `include_content`. Supply **at least one** of `metadata_filter`, `project_name`, `updated_since`, or `created_since`. |
| `cerefox_list_metadata_keys` | Discover available metadata keys | (none required) |
| `cerefox_list_projects` | List all projects | (none required) |
| `cerefox_set_document_projects` | Set doc's project memberships to exactly the given list (destructive replace; metadata-only, no content change) | `document_id`, `project_names` (required) |
@@ -64,8 +64,8 @@ Same operations, same conventions. Full reference: [`docs/guides/cli.md`](docs/g
| `cerefox_list_versions` | `cerefox document version list <id> --requestor "<your-name>"` |
| `cerefox_list_projects` | `cerefox project list --requestor "<your-name>"` |
| `cerefox_list_metadata_keys` | `cerefox metadata keys` |
| `cerefox_metadata_search` | `cerefox metadata search --metadata-filter '<json>' --requestor "<your-name>"` |
| `cerefox_set_document_projects` | _MCP-only; a CLI command will be added in a future release. Until then, run via MCP if available._ |
| `cerefox_metadata_search` | `cerefox metadata search --metadata-filter '<json>' --requestor "<your-name>"` (list a project: `cerefox document list --project <name>`) |
| `cerefox_set_document_projects` | `cerefox document set-projects <id> <name...> --author "<your-name>" --author-type agent` (or `--clear` to remove all) |
| `cerefox_get_audit_log` | `cerefox audit list --requestor "<your-name>"` (add `--json` for scripted access) |
| `cerefox_get_help` | `cerefox guides show agent-quick-reference` (or `cerefox guides list` for the full bundled-docs index) |

21 changes: 20 additions & 1 deletion CHANGELOG.md
@@ -9,7 +9,26 @@ Versioning: [Semantic Versioning](https://semver.org/spec/v2.0.0.html) — all `

## [Unreleased]

Open roadmap.
### Added

- **`cerefox_metadata_search` can now list a project's documents** — closing a CLI↔MCP
parity gap (the CLI's `cerefox document list --project <name>` had no MCP equivalent).
`metadata_filter` is now **optional**: supply `project_name` (and/or `updated_since` /
`created_since`) alone to list documents by scope, ordered newest-updated first. At
least one of `metadata_filter` / `project_name` / `updated_since` / `created_since` is
still required, so the tool never becomes an unbounded whole-KB dump. Backward
compatible — existing non-empty-filter callers are unaffected. The twin
`cerefox-metadata-search` Edge Function and the GPT Actions OpenAPI block
(`info.version` → 1.9.0) were relaxed in lockstep. A new **CLI ↔ MCP parity matrix** in
[`docs/guides/cli.md`](docs/guides/cli.md) documents the full surface and the remaining
(intentional vs. actionable) gaps.
- **`cerefox document set-projects <document-id> [names…]`** — new CLI command
closing the reverse parity gap (the `cerefox_set_document_projects` MCP tool
had no CLI form). Full-set replace of a document's project memberships
(`--clear` removes all); created-if-missing, case-insensitively de-duplicated,
logged as an `update-metadata` audit entry. Shares the membership-replace core
with the MCP tool (`_shared/mcp-tools/_projects.ts → replaceDocumentProjects`)
so both behave identically.
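
As a worked example of the name cleanup described in the entry above (blank names dropped, case-insensitive de-duplication, input order preserved), here is a minimal standalone sketch. `cleanProjectNames` is a hypothetical helper written for illustration; in the PR the equivalent logic is part of `replaceDocumentProjects`.

```typescript
// Hypothetical standalone helper illustrating the cleanup rule.
function cleanProjectNames(names: string[]): string[] {
  const seenLower = new Set<string>();
  const clean: string[] = [];
  for (const raw of names) {
    const name = raw.trim();
    if (!name) continue; // drop blank entries
    const key = name.toLowerCase();
    if (seenLower.has(key)) continue; // first spelling wins
    seenLower.add(key);
    clean.push(name);
  }
  return clean;
}

console.log(cleanProjectNames([" Cerefox ", "cerefox", "", "Docs"])); // [ 'Cerefox', 'Docs' ]
console.log(cleanProjectNames(["  ", ""])); // []
```

An empty result corresponds to the "clear all memberships" case.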

---

2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -65,7 +65,7 @@ cerefox/
│ └── memory/ # @cerefox/memory npm package — both bins (v0.5+)
│ ├── src/
│ │ ├── bin/cerefox.ts # single bin (v0.5.1+); commander dispatch + error handler
│ │ ├── cli/ # commander program + 28 subcommand files
│ │ ├── cli/ # commander program + 35 subcommand files
│ │ │ ├── commands/ # one file per subcommand (including `mcp` which runs buildServer())
│ │ │ └── util/ # checks, mcp-config-writers, bundled-docs
│ │ ├── server.ts # buildServer() factory (called by the `mcp` subcommand)
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -186,7 +186,7 @@ packages/memory/
bin/cerefox.ts the package's bin — top-level error handler + commander dispatch
cli/
program.ts commander program assembly; one registerXyz() per subcommand
commands/ 28 subcommand files (including `mcp` which runs buildServer())
commands/ 35 subcommand files (including `mcp` which runs buildServer())
util/ checks (doctor/status), mcp-config-writers, bundled-docs, client, embed
test/
stdio-smoke.test.ts spawn `cerefox mcp` and walk an MCP handshake
2 changes: 1 addition & 1 deletion README.md
@@ -67,7 +67,7 @@ See [Connecting AI agents](#connecting-ai-agents) for the how-to per client.
| **Review status** | Schema-level `review_status` on documents (`approved` / `pending_review`). Auto-transitions based on author_type. Filterable on search |
| **Version governance** | Version archival (protect specific versions from cleanup), configurable retention (`CEREFOX_VERSION_CLEANUP_ENABLED`), version diff viewer |
| **Usage tracking** | Opt-in logging of all operations (reads and writes) across all access paths. Tracks operation type, access path (remote-mcp, local-mcp, edge-function, webapp, cli), requestor identity, query text, and result count. Controlled via `cerefox config set usage_tracking_enabled true/false` -- no redeploy needed |
| **Analytics dashboard** | `/app/analytics` -- 7 interactive charts: calls per day, access path breakdown, top documents, top readers, operations donut, reader word cloud, and reader-to-document access pattern visualization (HEB). Date range + project + path filters. CSV export. |
| **Analytics dashboard** | `/app/analytics` -- 8 interactive charts: calls per day, access path breakdown, top documents, top readers, operations donut, requestor word cloud, requestor→document access patterns (HEB), and requestor→operation patterns (HEB). Date range + project + path filters. CSV export. |

---

49 changes: 47 additions & 2 deletions _shared/__tests__/mcp_tools.test.ts
@@ -181,7 +181,7 @@ describe("input validation throws McpInvalidParams", () => {
expect(err).toBeInstanceOf(McpInvalidParams);
});

test("cerefox_metadata_search rejects missing metadata_filter", async () => {
test("cerefox_metadata_search rejects no criteria at all (no filter, no scope)", async () => {
const tool = TOOLS_BY_NAME["cerefox_metadata_search"];
let err: unknown;
try {
@@ -192,7 +192,7 @@ describe("input validation throws McpInvalidParams", () => {
expect(err).toBeInstanceOf(McpInvalidParams);
});

test("cerefox_metadata_search rejects empty metadata_filter", async () => {
test("cerefox_metadata_search rejects empty metadata_filter with no other scope", async () => {
const tool = TOOLS_BY_NAME["cerefox_metadata_search"];
let err: unknown;
try {
@@ -203,6 +203,16 @@ describe("input validation throws McpInvalidParams", () => {
expect(err).toBeInstanceOf(McpInvalidParams);
});

  test("cerefox_metadata_search rejects a non-object metadata_filter", async () => {
    const tool = TOOLS_BY_NAME["cerefox_metadata_search"];
    let err: unknown;
    try {
      await tool.handler(noopClient(), { metadata_filter: "nope" }, FAKE_CTX);
    } catch (e) {
      err = e;
    }
    expect(err).toBeInstanceOf(McpInvalidParams);
  });

test("cerefox_set_document_projects rejects missing document_id", async () => {
const tool = TOOLS_BY_NAME["cerefox_set_document_projects"];
let err: unknown;
@@ -226,6 +236,41 @@ describe("input validation throws McpInvalidParams", () => {
});
});

describe("cerefox_metadata_search listing (empty filter + scope)", () => {
  // A mock client that resolves any project name → "proj-1" and records the
  // params passed to the cerefox_metadata_search RPC.
  function listingClient(captured: { params?: Record<string, unknown> }): SupabaseClient {
    const projectChain = {
      select: () => projectChain,
      ilike: () => projectChain,
      limit: () => ({ data: [{ id: "proj-1" }], error: null }),
    };
    return {
      rpc: (name: string, params: Record<string, unknown>) => {
        if (name === "cerefox_metadata_search") captured.params = params;
        return { data: [], error: null };
      },
      from: () => projectChain,
    } as unknown as SupabaseClient;
  }

  test("project_name alone lists docs (empty filter → RPC gets {} + resolved project_id)", async () => {
    const tool = TOOLS_BY_NAME["cerefox_metadata_search"];
    const captured: { params?: Record<string, unknown> } = {};
    await tool.handler(listingClient(captured), { project_name: "Cerefox" }, FAKE_CTX);
    expect(captured.params?.p_metadata_filter).toEqual({});
    expect(captured.params?.p_project_id).toBe("proj-1");
  });

  test("updated_since alone is a sufficient scope (no throw)", async () => {
    const tool = TOOLS_BY_NAME["cerefox_metadata_search"];
    const captured: { params?: Record<string, unknown> } = {};
    await tool.handler(listingClient(captured), { updated_since: "2026-01-01" }, FAKE_CTX);
    expect(captured.params?.p_metadata_filter).toEqual({});
    expect(captured.params?.p_updated_since).toBe("2026-01-01");
  });
});

describe("chunker (used by ingest)", () => {
test("short content → 1 chunk", async () => {
const { chunkMarkdown } = await import("../mcp-tools/_chunker.ts");
118 changes: 117 additions & 1 deletion _shared/mcp-tools/_projects.ts
@@ -21,7 +21,9 @@
* prevents drift.
*/

import type { MCPSupabaseClient } from "./types.ts";
import type { AccessPath, MCPSupabaseClient } from "./types.ts";

import { logUsage } from "./_utils.ts";

/** Ensure `(documentId, project)` exists. Resolves project by name
* (case-insensitive); creates the project if missing. Idempotent.
@@ -104,6 +106,120 @@ export async function setDocumentProjectsByName(
return projectIds;
}

export interface ReplaceDocumentProjectsResult {
  documentTitle: string;
  /** Names after stripping blanks + case-insensitive dedup, in input order. */
  cleanNames: string[];
  projectIds: string[];
}

/**
 * Full-set replace of a document's project memberships, with audit + usage
 * logging. The shared core behind both the `cerefox_set_document_projects`
 * MCP tool and the `cerefox document set-projects` CLI command, so the two
 * behave identically.
 *
 * Cleans the incoming names (strip blanks, preserve order, case-insensitive
 * dedup), verifies the document exists and isn't soft-deleted, resolves each
 * name → project_id (creating the project if absent), then DELETE-then-INSERT
 * replaces the membership set. An empty (or all-blank) list clears all
 * memberships. Writes an `update-metadata` audit entry (content is untouched)
 * and a usage-log entry.
 *
 * Throws if the document is missing or soft-deleted. Callers validate
 * argument *shape* (e.g. that `projectNames` is an array of strings).
 */
export async function replaceDocumentProjects(
  supabase: MCPSupabaseClient,
  opts: {
    documentId: string;
    projectNames: string[];
    author: string;
    authorType: string;
    accessPath: AccessPath;
  },
): Promise<ReplaceDocumentProjectsResult> {
  const { documentId, projectNames, author, authorType, accessPath } = opts;

  // Strip empties; preserve order; dedup case-insensitively.
  const seenLower = new Set<string>();
  const cleanNames: string[] = [];
  for (const n of projectNames) {
    const stripped = (n ?? "").trim();
    if (!stripped) continue;
    const key = stripped.toLowerCase();
    if (seenLower.has(key)) continue;
    seenLower.add(key);
    cleanNames.push(stripped);
  }

  // Verify the document exists and isn't soft-deleted.
  const { data: doc } = await supabase
    .from("cerefox_documents")
    .select("id, title")
    .eq("id", documentId)
    .is("deleted_at", null)
    .limit(1);
  if (!doc?.length) {
    throw new Error(`Document not found (or soft-deleted): ${documentId}`);
  }

  // Resolve each name → project_id (create if absent). Preserve order.
  const projectIds: string[] = [];
  for (const name of cleanNames) {
    const { data: proj } = await supabase
      .from("cerefox_projects")
      .select("id")
      .ilike("name", name)
      .limit(1);
    if (proj?.length) {
      projectIds.push(proj[0].id);
    } else {
      const { data: newProj } = await supabase
        .from("cerefox_projects")
        .insert({ name })
        .select("id");
      if (newProj?.[0]?.id) projectIds.push(newProj[0].id);
    }
  }

  // DELETE-then-INSERT replace (matches Python assign_document_projects).
  await supabase.from("cerefox_document_projects").delete().eq("document_id", documentId);
  if (projectIds.length > 0) {
    const rows = projectIds.map((pid) => ({ document_id: documentId, project_id: pid }));
    await supabase.from("cerefox_document_projects").insert(rows);
  }

  // Audit entry — project membership is metadata, not content.
  try {
    await supabase.rpc("cerefox_create_audit_entry", {
      p_document_id: documentId,
      p_version_id: null,
      p_operation: "update-metadata",
      p_author: author,
      p_author_type: authorType,
      p_size_before: null,
      p_size_after: null,
      p_description:
        cleanNames.length > 0
          ? `Set document projects to [${cleanNames.join(", ")}]`
          : "Cleared all project memberships",
    });
  } catch (err) {
    console.warn("replaceDocumentProjects: audit entry failed", err);
  }

  logUsage(supabase, {
    operation: "set-document-projects",
    accessPath,
    requestor: author,
    document_id: documentId,
    result_count: projectIds.length,
  });

  return { documentTitle: doc[0].title as string, cleanNames, projectIds };
}

/** Resolve a project name → project_id (case-insensitive), or `null` if
* not found. Does NOT create. Used by search / metadata-search to translate
* `project_name` parameters to UUIDs. */