30 changes: 29 additions & 1 deletion docs/developer/cli_reference.md
@@ -328,15 +328,29 @@ neotoma session --servers
- `--force`: Overwrite existing configuration.
- `--skip-db`: Skip database initialization.
- `--skip-env`: Skip interactive `.env` creation and variable prompts (e.g. for CI or non-interactive use).
- `--project-local`: Store the Neotoma config in `.neotoma/config.json` in the current directory (project-scoped) instead of the user-level `~/.config/neotoma/config.json`. The project-local config takes precedence over the user-level config when `readEffectiveConfig` is used. Use this when you want per-project Neotoma configuration that is independent of the user-level setup.
- `--safe`: Dry-run mode. Reports what `init` would do (create directories, write config, run migrations) without writing any files or making any changes. Output lists each planned action with a check mark. Exit code is 0 if everything would succeed. Combine with `--json` to get machine-readable output.

**Examples:**

```bash
# Basic initialization
neotoma init

# Initialize with custom data directory
neotoma init --data-dir /path/to/data

# Store config in current project directory instead of user home
neotoma init --project-local

# Preview what init would do without making any changes
neotoma init --safe

# Dry-run with machine-readable output
neotoma init --safe --json

# Combine: dry-run scoped to current project
neotoma init --safe --project-local
```

**What it creates:**
@@ -345,6 +359,20 @@ neotoma init --data-dir /path/to/data
- SQLite database: `<data-dir>/neotoma.db` (with WAL mode enabled)
- Encryption key (if the user chooses key-derived auth when prompted): `~/.config/neotoma/keys/neotoma.key` (mode 0600)
- Environment file target: project `<checkout>/.env` when a checkout is detected, otherwise `~/.config/neotoma/.env`
- Config file: `~/.config/neotoma/config.json` (default) or `.neotoma/config.json` in the current directory when `--project-local` is given.

**Runtime overrides** for `neotoma init`:

Data directory:

| Precedence | Source | Description |
|------------|--------|-------------|
| 1 (highest) | `--data-dir` flag | Explicit data directory path |
| 2 | `NEOTOMA_DATA_DIR` env var | Environment variable override |
| 3 (default) | Auto-detected or `~/neotoma/data` | Resolved at startup |

Config file location:

| Precedence | Source | Description |
|------------|--------|-------------|
| 1 (highest) | `--project-local` flag | Write to `.neotoma/config.json` in cwd |
| 2 (default) | (no flag) | Write to `~/.config/neotoma/config.json` |
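
The two precedence tables above can be read as a pair of small resolution functions. The following is a minimal Python sketch of that reading, not Neotoma's actual implementation: the function names `resolve_data_dir` and `resolve_config_path` are hypothetical, and the "auto-detected" branch is not modeled because the source does not describe how detection works.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(flag_value: Optional[str], env: Mapping[str, str]) -> Path:
    """Pick the data directory (hypothetical helper, mirrors the first table)."""
    # 1 (highest): an explicit --data-dir flag wins.
    if flag_value:
        return Path(flag_value).expanduser()
    # 2: the NEOTOMA_DATA_DIR environment variable.
    env_value = env.get("NEOTOMA_DATA_DIR")
    if env_value:
        return Path(env_value).expanduser()
    # 3 (default): auto-detection is not modeled here; use the documented fallback.
    return Path.home() / "neotoma" / "data"


def resolve_config_path(project_local: bool, cwd: Path) -> Path:
    """Pick where init writes config.json (hypothetical helper, mirrors the second table)."""
    if project_local:
        # --project-local: config lives in the current directory.
        return cwd / ".neotoma" / "config.json"
    # No flag: user-level config.
    return Path.home() / ".config" / "neotoma" / "config.json"


# Example: the flag beats the environment variable.
print(resolve_data_dir("/path/to/data", {"NEOTOMA_DATA_DIR": "/from/env"}))
print(resolve_config_path(project_local=True, cwd=Path("/work/my-project")))
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the sketch easy to test; a real CLI would likely read the process environment directly.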

### Harness setup

45 changes: 39 additions & 6 deletions docs/developer/mcp/instructions.md

Large diffs are not rendered by default.

16 changes: 16 additions & 0 deletions docs/foundation/what_to_store.md
@@ -93,6 +93,22 @@ These before/after examples show what storage looks like in practice. "Before" i
- Before: You have an ongoing billing dispute with a vendor. Details are scattered across emails, chat messages, and phone call notes. Reconstructing the timeline requires manual archaeology.
- After: Agent stores each interaction as an observation on the dispute entity. `{ entity_type: "dispute", vendor: "Acme Billing", status: "open", amount_disputed: 250.00 }` with observations for each touchpoint. The full timeline is queryable: "What did we know about this dispute on March 15?"

## Store-first protocol for external tool actions

When an agent is about to execute a write action in an external tool — sending an email, creating a GitHub issue, posting a Slack message, booking a calendar event, triggering a webhook, or any equivalent operation that mutates state outside Neotoma — it MUST store the intent as a Neotoma entity BEFORE executing the action.

The store step is mandatory, regardless of how simple or low-risk the external action appears.

The three-step sequence is:

1. **Store intent.** Store an entity capturing what will be done, to whom, and with what content. Use the most specific `entity_type` (e.g. `email_draft`, `issue`, `event`, `message`). Set `status: "pending"`.
2. **Execute the external action.** Call the external tool only after the store returns successfully.
3. **Update entity status.** After the external action completes or fails, update the stored entity with the outcome (`status: "sent"`, `status: "created"`, `status: "failed"`, plus any external IDs returned).

If the external action fails, the stored entity remains as durable evidence of the intent. If the action succeeds, the entity records what was done, when, and to whom — traceable and auditable.

This protocol extends the general store-first rule (which applies to reading external data) to cover writes. The agent instructions in `docs/developer/mcp/instructions.md` define the full binding rule and entity-type mapping under `[STORE-FIRST PROTOCOL]`.
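
The three-step sequence above can be sketched as a small wrapper around any external write. This is an illustrative Python sketch under stated assumptions, not the binding rule from the agent instructions: `STORE`, `store_entity`, `update_entity`, and `run_store_first` are hypothetical stand-ins for the real Neotoma store and update calls, and the `external_id` and `error` field names are assumptions.

```python
from typing import Callable, Dict

# Hypothetical in-memory stand-in for Neotoma storage.
STORE: Dict[str, dict] = {}


def store_entity(entity_id: str, entity: dict) -> None:
    """Persist a new entity (stand-in for the real store call)."""
    STORE[entity_id] = dict(entity)


def update_entity(entity_id: str, **fields) -> None:
    """Record new field values on an existing entity."""
    STORE[entity_id].update(fields)


def run_store_first(
    entity_id: str,
    intent: dict,
    action: Callable[[dict], str],
    success_status: str,
) -> dict:
    """Store intent, execute the external action, then record the outcome."""
    # Step 1: store the intent with status "pending" before touching the external tool.
    store_entity(entity_id, {**intent, "status": "pending"})
    try:
        # Step 2: execute the external action only after the store has returned.
        external_id = action(intent)
    except Exception as exc:
        # Step 3 (failure): the entity stays as durable evidence of the intent.
        update_entity(entity_id, status="failed", error=str(exc))
        return STORE[entity_id]
    # Step 3 (success): record the outcome and any external ID returned.
    update_entity(entity_id, status=success_status, external_id=external_id)
    return STORE[entity_id]


def send_email(intent: dict) -> str:
    """Hypothetical external tool; returns a made-up message ID."""
    return "msg-123"


result = run_store_first(
    "draft-1",
    {"entity_type": "email_draft", "to": "vendor@example.com", "subject": "Invoice"},
    send_email,
    success_status="sent",
)
print(result["status"])
```

The point of the ordering is that a crash between steps 1 and 2 still leaves a `pending` entity behind, so the intent is never lost even if the outcome is.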

## What NOT to store

| Condition | Reason |
146 changes: 146 additions & 0 deletions docs/subsystems/exercise_tracking.md
@@ -0,0 +1,146 @@
# Exercise Tracking

Neotoma uses a per-set entity model for exercise data: each exercise set is stored as a separate `exercise_set` entity, and the workout session is stored as an `exercise_log` entity. Sets are linked to their parent session via a `PART_OF` relationship.

## Entity types

### `exercise_log` — workout session container

Represents one workout session (e.g. "Upper Body — 2026-05-16"). Fields:

| Field | Type | Required | Description |
|---|---|---|---|
| `date` | date | yes | Date of the workout |
| `workout_type` | string | no | Session label (e.g. "Upper Body", "Cardio", "Legs") |
| `duration_minutes` | number | no | Total session duration |
| `notes` | string | no | Free-form notes about the session |
| `location` | string | no | Where the workout took place |

Identity: `date + workout_type` (falls back to `date` alone for untyped sessions).

### `exercise_set` — one set within a session

Represents one atomic set performed during a session (e.g. "Bench Press, Set 1, 135 lbs × 8 reps, 2026-05-16"). Fields:

| Field | Type | Required | Description |
|---|---|---|---|
| `exercise_name` | string | yes | Name of the exercise (e.g. "Bench Press") |
| `date` | date | yes | Date the set was performed |
| `set_number` | number | no | Set index within the exercise (1, 2, 3, …) |
| `set_type` | string | no | Set type (e.g. "working", "warmup", "dropset") |
| `reps` | number | no | Repetitions completed |
| `weight_lbs` | number | no | Load in pounds |
| `weight_kg` | number | no | Load in kilograms |
| `duration_seconds` | number | no | Duration in seconds (timed sets) |
| `distance_meters` | number | no | Distance in meters (cardio sets) |
| `notes` | string | no | Free-form notes |

Identity: `exercise_name + set_number + date` (falls back to `exercise_name + date` when `set_number` is absent).
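
The identity rules for both entity types, including their fallbacks, can be illustrated with a short Python sketch. The function names are hypothetical and the tuple representation is only an assumption made for illustration; how Neotoma actually derives or hashes identity is not described here.

```python
from typing import Optional, Tuple


def exercise_log_identity(date: str, workout_type: Optional[str] = None) -> Tuple[object, ...]:
    """Identity for a session: date + workout_type, or date alone when untyped."""
    if workout_type:
        return (date, workout_type)
    return (date,)


def exercise_set_identity(
    exercise_name: str, date: str, set_number: Optional[int] = None
) -> Tuple[object, ...]:
    """Identity for a set: exercise_name + set_number + date, or name + date without a number."""
    if set_number is not None:
        return (exercise_name, set_number, date)
    return (exercise_name, date)


# Two numbered sets of the same exercise on the same day stay distinct.
first = exercise_set_identity("Bench Press", "2026-05-16", 1)
second = exercise_set_identity("Bench Press", "2026-05-16", 2)
print(first != second)
```

Under this reading, two sets of the same exercise on the same date that both omit `set_number` would share one identity, so supplying `set_number` is advisable whenever more than one set is recorded.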

## Relationship model

```
exercise_set --PART_OF--> exercise_log
```

Each `exercise_set` entity carries a `PART_OF` relationship to its parent `exercise_log`. This follows the standard Neotoma one-to-many child model.

## Storing a workout

1. Store the `exercise_log` session entity first.
2. Store each `exercise_set` entity with a `PART_OF` relationship pointing to the log's `entity_id`.

### Example: one `store` call with inline relationships

```json
{
  "entities": [
    {
      "entity_type": "exercise_log",
      "date": "2026-05-16",
      "workout_type": "Upper Body",
      "duration_minutes": 60
    },
    {
      "entity_type": "exercise_set",
      "exercise_name": "Bench Press",
      "date": "2026-05-16",
      "set_number": 1,
      "reps": 8,
      "weight_lbs": 135
    },
    {
      "entity_type": "exercise_set",
      "exercise_name": "Bench Press",
      "date": "2026-05-16",
      "set_number": 2,
      "reps": 8,
      "weight_lbs": 135
    }
  ],
  "relationships": [
    { "relationship_type": "PART_OF", "source_index": 1, "target_index": 0 },
    { "relationship_type": "PART_OF", "source_index": 2, "target_index": 0 }
  ],
  "idempotency_key": "workout-2026-05-16-upper-body"
}
```
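
A payload of this shape can be generated rather than written by hand. The sketch below is one hedged way to do it in Python; `build_workout_payload` is a hypothetical helper, and it assumes the log is always placed at index 0 so that every set's `PART_OF` relationship targets that index.

```python
import json
from typing import Dict, List


def build_workout_payload(log: Dict, sets: List[Dict], idempotency_key: str) -> Dict:
    """Build a single store payload: the log at index 0, followed by its sets."""
    entities = [{"entity_type": "exercise_log", **log}]
    relationships = []
    for index, set_fields in enumerate(sets, start=1):
        # Each set inherits the log's date unless it supplies its own.
        entities.append({"entity_type": "exercise_set", "date": log["date"], **set_fields})
        # source_index is the set's position in `entities`; target_index 0 is the log.
        relationships.append(
            {"relationship_type": "PART_OF", "source_index": index, "target_index": 0}
        )
    return {
        "entities": entities,
        "relationships": relationships,
        "idempotency_key": idempotency_key,
    }


payload = build_workout_payload(
    {"date": "2026-05-16", "workout_type": "Upper Body", "duration_minutes": 60},
    [
        {"exercise_name": "Bench Press", "set_number": 1, "reps": 8, "weight_lbs": 135},
        {"exercise_name": "Bench Press", "set_number": 2, "reps": 8, "weight_lbs": 135},
    ],
    idempotency_key="workout-2026-05-16-upper-body",
)
print(json.dumps(payload, indent=2))
```

Deriving the indices from list position avoids the easy mistake of a relationship pointing at the wrong entity when sets are added or reordered.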

### Example: two separate `store` calls

When the session is stored in its own call (or already exists), link each set to it by the log's `entity_id`.

Call 1 stores the session:

```json
{
  "entities": [
    {
      "entity_type": "exercise_log",
      "date": "2026-05-16",
      "workout_type": "Upper Body"
    }
  ],
  "idempotency_key": "workout-2026-05-16-upper-body"
}
```

Call 2 stores a set and links it to the session:

```json
{
  "entities": [
    {
      "entity_type": "exercise_set",
      "exercise_name": "Bench Press",
      "date": "2026-05-16",
      "set_number": 1,
      "reps": 8,
      "weight_lbs": 135
    }
  ],
  "relationships": [
    {
      "relationship_type": "PART_OF",
      "source_index": 0,
      "target_entity_id": "<exercise_log entity_id from call 1>"
    }
  ],
  "idempotency_key": "exercise-set-bench-press-1-2026-05-16"
}
```

## Querying

Retrieve all sets for a session:

```
retrieve_related_entities(entity_id=<exercise_log entity_id>, relationship_type="PART_OF", direction="inbound")
```

Retrieve all sessions:

```
retrieve_entities(entity_type="exercise_log")
```
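
To make the `direction="inbound"` argument concrete, the Python sketch below models relationships as plain tuples and filters for those whose target is the session. It is only a model of the query's semantics under the assumption that "inbound" means "edges pointing at the given entity"; `related_inbound` and the entity IDs are hypothetical and do not call Neotoma.

```python
from typing import List, Tuple

# (source_entity_id, relationship_type, target_entity_id)
Edge = Tuple[str, str, str]


def related_inbound(edges: List[Edge], entity_id: str, relationship_type: str) -> List[str]:
    """Return the sources of all matching relationships that point at entity_id."""
    return [
        source
        for source, rel_type, target in edges
        if target == entity_id and rel_type == relationship_type
    ]


# Hypothetical IDs: two sets belong to log-1, one set belongs to log-2.
edges: List[Edge] = [
    ("set-1", "PART_OF", "log-1"),
    ("set-2", "PART_OF", "log-1"),
    ("set-3", "PART_OF", "log-2"),
]

print(related_inbound(edges, "log-1", "PART_OF"))
```

Because each set is the source of its `PART_OF` relationship and the log is the target, the sets of a session are found by looking at relationships arriving at the log, which is why the query uses the inbound direction.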

## Related entity types

- `workout_session` — an older session container that accumulates exercises as an `exercises` array field via `merge_array` reducer strategy. Use `exercise_log` + `exercise_set` for new data; `workout_session` remains supported for backward compatibility.
- `exercise` — a generic exercise activity entity without per-set breakdown. Use `exercise_set` when per-set granularity is needed.