Browser-native • Zero-server • LLM-agnostic memory management for web AI
Quickstart • Why • API • Quality • Changelog • Config • Airgap • Architecture • Export/Import
LokulMem is a local-first memory layer you can drop into any browser AI:
- it learns durable facts from conversation turns,
- stores them in the browser (IndexedDB),
- retrieves the right memories on each prompt,
- and injects them into context within a token budget.
No backend. No vendor lock-in. Works with any LLM that accepts a messages[] array.
It's "RAG-like recall" + "memory lifecycle" (decay, pinning, contradiction history) + transparent debugging.
- Runs entirely in the browser: IndexedDB + Worker(s).
- LLM-agnostic: OpenAI / Anthropic / local WebLLM / anything with chat messages.
- Memory lifecycle: extract → store → decay/reinforce → retrieve → inspect/edit.
- Contradiction resolution: temporal updates vs conflicts, preserving lineage.
- Token-aware injection: uses your tokenizer (or a sensible fallback).
- Inspectable by design: optional debug output explains why each memory was used.
- DX-first defaults: "fetch once, cache forever" model loading.
- Airgap-ready: strict local model loading via `localModelBaseUrl`.
- OSS quality gates: CI enforces lint, typecheck, tests, memory evals, and package integrity.
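Since the injection budget is token-based, you can supply your own counter. The sketch below uses a rough characters-per-token heuristic as a stand-in; the `roughTokenCount` helper and its 4-chars-per-token ratio are illustrative, not part of the package:

```typescript
// A crude length-based counter as a stand-in; swap in your model's
// real tokenizer for accurate budgets (~4 characters per token).
const roughTokenCount = (text: string): number =>
  Math.max(1, Math.ceil(text.length / 4));

// Passed to createLokulMem via the tokenCounter option:
const lokulConfig = {
  dbName: 'my-chat-app',
  tokenCounter: roughTokenCount,
};
```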
```bash
npm i @lokul/lokulmem
# or
pnpm add @lokul/lokulmem
```

```ts
import { createLokulMem } from '@lokul/lokulmem';

// 1) Init once
const lokul = await createLokulMem({
  dbName: 'my-chat-app',
  extractionThreshold: 0.45,
});

// 2) Before calling your LLM
const { messages, debug } = await lokul.augment(
  "Hey, I'm Alex. I prefer dark mode.",
  history, // your existing ChatMessage[]
  {
    contextWindowTokens: 8192,
    reservedForResponseTokens: 1024,
    debug: true,
  },
);

// 3) Call any model/provider
const assistantText = await myLLM(messages);

// 4) After the response: learn from the turn
await lokul.learn(
  { role: 'user', content: "Hey, I'm Alex. I prefer dark mode." },
  { role: 'assistant', content: assistantText },
);

// Optional: inspect why memories were injected
console.log(debug);
```

Most "memory layers" are server-first, framework-bound, or opaque.
LokulMem is for you if you want:
- Privacy by architecture (data stays on-device)
- No backend to deploy or secure
- A clean, library-shaped API that works with any model
- A memory system that users can inspect, correct, pin, export, and delete
LokulMem has three core surfaces:
Returns a new messages[] array plus optional debug metadata.
```ts
const { messages, debug } = await lokul.augment(userMessage, history, {
  contextWindowTokens: 8192,
  reservedForResponseTokens: 1024,
  debug: true,
});
```

Extracts candidate memories from the last turn and writes them to IndexedDB.
```ts
const result = await lokul.learn(
  { role: 'user', content: userMessage },
  { role: 'assistant', content: assistantMessage },
);
console.log(result.extracted);
console.log(result.contradictions);
```

For UI panels and power users.
```ts
const m = lokul.manage();
const items = await m.list({ status: "active" }); // returns MemoryDTO (no embeddings)
await m.pin(items[0].id);
const exported = await m.export('json');
await m.clear();
await m.import(exported, 'merge');
```

LokulMem uses a strict data boundary:

- `MemoryRecord` (internal) includes `embedding: Float32Array`.
- `MemoryDTO` (public API) omits embeddings entirely.

Why? Typed arrays are expensive to structured-clone across IPC.

If you explicitly need embeddings (advanced), call APIs with `includeEmbedding: true` where supported.
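As an illustration of that boundary, the split might look like this. This is a hypothetical sketch: field names beyond `id` and `embedding` are assumptions, not the package's actual types.

```typescript
// Hypothetical sketch of the internal/public split; field names
// beyond `id` and `embedding` are assumptions.
interface MemoryDTO {
  id: string;
  content: string;
  pinned: boolean;
}

interface MemoryRecord extends MemoryDTO {
  embedding: Float32Array; // internal only; never crosses the public API
}

// Strip the embedding before a record leaves the worker:
function toDTO(rec: MemoryRecord): MemoryDTO {
  const { embedding, ...dto } = rec;
  return dto;
}
```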
```ts
const lokul = await createLokulMem({
  dbName: 'my-chat-app',
  workerUrl: undefined,
  onnxPaths: '/lokulmem/onnx/',
  localModelBaseUrl: undefined,
  extractionThreshold: 0.45,
  contextWindowTokens: 8192,
  reservedForResponseTokens: 1024,
  onProgress: (stage, progress) => console.log(stage, progress),
});
```

| Option | Type | Default | Notes |
|---|---|---|---|
| `dbName` | `string` | `lokulmem-default` | IndexedDB namespace |
| `workerType` | `'auto' \| 'shared' \| 'dedicated'` | — | — |
| `workerUrl` | `string` | auto | Override worker script URL |
| `onnxPaths` | `string \| Record<string,string>` | — | Custom ONNX WASM asset paths |
| `localModelBaseUrl` | `string` | — | Airgap/local model base path |
| `extractionThreshold` | `number` | `0.45` | Global extraction floor |
| `contextWindowTokens` | `number` | — | LLM context size for augment budget |
| `reservedForResponseTokens` | `number` | `1024` | Response token reserve |
| `tokenCounter` | `(text) => number` | heuristic | Custom token counting |
| `onProgress` | `(stage, progress) => void` | — | Init progress callback |
Default mode is DX-first: download the embedding model once and cache it.
To run in strict airgapped mode:
- Host the model assets locally (mirroring the expected model layout)
- Point LokulMem at your local base URL
```ts
const lokul = await createLokulMem({
  localModelBaseUrl: "/models/",
});
```

If assets are missing, LokulMem should fail loudly with an actionable error.
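You can also preflight the mirrored assets yourself before initializing. The helper below is a hypothetical sketch: both `assertModelHosted` and the probed `config.json` file name are assumptions about your mirrored layout, not part of the package.

```typescript
// Hypothetical preflight for airgapped deployments: verify the locally
// hosted model assets respond before calling createLokulMem.
async function assertModelHosted(baseUrl: string): Promise<void> {
  // HEAD-probe one known asset; the file name is an assumption
  // about your mirrored model layout.
  const res = await fetch(baseUrl + 'config.json', { method: 'HEAD' });
  if (!res.ok) {
    throw new Error(
      `Model assets not reachable under ${baseUrl} (HTTP ${res.status})`,
    );
  }
}
```

Call it with your base path, e.g. `await assertModelHosted('/models/')`, and surface the error in your app's UI instead of waiting for init to fail.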
```mermaid
sequenceDiagram
  participant App as App (Main Thread)
  participant LM as LokulMem API
  participant W as Worker
  participant DB as IndexedDB (Dexie)
  App->>LM: augment(userMessage, history)
  LM->>W: RPC.retrieve(query)
  W->>DB: read active memories
  W-->>LM: formatted memory block + debug
  LM-->>App: new messages[]
  App->>LM: learn(userMessage, assistantMessage)
  LM->>W: RPC.extractAndStore(turn)
  W->>DB: write memories / update lineage
  W-->>LM: LearnResult
  LM-->>App: result
```
- `memories`: durable facts (with embeddings)
- `episodes`: optional conversation segments
- `clusters`: k-means centroids (v0.1)
- `edges`: optional relationship links
When `debug: true`, `augment()` returns:
- timings (embedding, retrieval, formatting)
- candidate list with score breakdown
- excluded reasons (low score, token budget, status)
- final injected memories with human-readable reasons
This is intentionally built so you can ship a Memory Inspector UI.
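A Memory Inspector panel could consume that payload along these lines. The `AugmentDebug` shape below is inferred from the fields listed above and is an assumption; the package's actual property names may differ.

```typescript
// Hypothetical debug shape inferred from the documented fields.
interface AugmentDebug {
  injected: { id: string; score: number; reason: string }[];
  excluded: { id: string; reason: string }[];
}

// Flatten the debug payload into lines an inspector panel can render.
function renderInspectorRows(debug: AugmentDebug): string[] {
  return [
    ...debug.injected.map((m) => `USED ${m.id} (${m.score.toFixed(2)}): ${m.reason}`),
    ...debug.excluded.map((m) => `SKIPPED ${m.id}: ${m.reason}`),
  ];
}
```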
- JSON export includes embeddings as Base64 so they survive `JSON.stringify`.
- Export metadata includes `version`, `schemaVersion`, `modelName`, `embeddingDims`.
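The Base64 round-trip for an embedding can be done along these lines. This is a sketch of the technique, not the package's internal encoder:

```typescript
// Encode a Float32Array's raw bytes as Base64 so it survives
// JSON.stringify, and restore it on import.
function embeddingToBase64(v: Float32Array): string {
  const bytes = new Uint8Array(v.buffer, v.byteOffset, v.byteLength);
  let bin = '';
  for (const b of bytes) bin += String.fromCharCode(b);
  return btoa(bin);
}

function base64ToEmbedding(s: string): Float32Array {
  const bin = atob(s);
  const bytes = new Uint8Array(bin.length);
  for (let i = 0; i < bin.length; i++) bytes[i] = bin.charCodeAt(i);
  return new Float32Array(bytes.buffer);
}
```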
```ts
const json = await lokul.manage().export('json');
await lokul.manage().clear();
await lokul.manage().import(json, 'merge');
```

```bash
pnpm install
pnpm build
pnpm test
```

LokulMem ships with regression guards designed for open-source maintenance.
```bash
npm run ci
npm run eval:memory:C
npm run verify:package
```

Run deterministic memory-quality evals locally:
```bash
npm run eval:memory:A
npm run eval:memory:B
npm run eval:memory:C
```

Gate policy:

- A: bring-up metrics; hard-fail only on assistant contamination
- B: regression budget checks against `tests/evals/memory/baseline.json`
- C: absolute quality thresholds + regression checks
Current threshold targets:
- meaningful recall >= 0.90
- noise false positive <= 0.05
- assistant contamination == 0
- supersession correctness >= 0.90
- canonicalization accuracy >= 0.95
- entity-link accuracy >= 0.85
- temporal state accuracy >= 0.90
- policy decision accuracy >= 0.88
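Gate C's absolute checks amount to simple threshold comparisons like the sketch below; this is illustrative only (the `GATE_C` constant and `passesGateC` helper are assumptions, and the real harness lives under `tests/evals/memory`):

```typescript
// Illustrative gate check mirroring a few of the documented targets.
const GATE_C = {
  meaningfulRecall: { min: 0.9 },
  noiseFalsePositive: { max: 0.05 },
  assistantContamination: { max: 0 },
  supersessionCorrectness: { min: 0.9 },
};

function passesGateC(metrics: Record<string, number>): boolean {
  return Object.entries(GATE_C).every(([name, t]) => {
    const v = metrics[name];
    if (v === undefined) return false; // missing metric is a hard fail
    if ('min' in t && v < t.min) return false;
    if ('max' in t && v > t.max) return false;
    return true;
  });
}
```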
- Core checks on Node 20 and 22: lint, typecheck, unit tests, build
- Memory quality gates: A, B, C (`tests/evals/memory`)
- Package integrity: dist export verification + `npm pack --dry-run`
- Security: CodeQL + dependency review + Dependabot updates
- Nightly browser smoke (non-blocking): Playwright integration sanity check
```bash
cd examples/react-app
pnpm install
pnpm dev
```

**Worker fails to load**

- Set `workerUrl` explicitly.
- Confirm the published package includes the worker chunk.
**ONNX WASM 404 / CSP blocked**

- Set `onnxPaths` to a valid local or hosted ORT asset path.
- Confirm `ort-wasm*.wasm` and `ort-wasm*.mjs` are being served.
**Airgap mode can't find model**

- Confirm `localModelBaseUrl` is reachable.
- Confirm model files exist under that path.
Contributions are welcome! Please read CONTRIBUTING.md for guidelines, development setup, and release checks.
MIT. See LICENSE.
