Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
53 changes: 51 additions & 2 deletions docs/testing-strategy.md
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,56 @@ context accumulation without external model variance. The browser E2E suite
uses the same principle at the HTTP and DOM layer: real runtime, fake upstream
provider.

### 2.5 Manual and ignored tests
### 2.5 Routine and heartbeat E2E test helpers

The `tests/support/routines` module (compiled only when the `libsql` feature is
enabled) provides reusable helpers for routine- and heartbeat-related E2E tests.
These helpers reduce boilerplate in individual test files and keep the setup
consistent across the five focused submodules under `tests/e2e_traces/`.

#### Available helpers

<!-- markdownlint-disable MD013 -->
| Helper | Kind | Description |
| ------ | ---- | ----------- |
| `create_test_db()` | `async fn` | Creates a temporary libSQL database in a `TempDir`, applies all migrations, and returns an `Arc<dyn Database>` plus the `TempDir` guard. Returns a `Result` so callers propagate errors with `?` or `.expect()` in tests. |
| `create_workspace(db)` | `fn` | Wraps a database reference in an `Arc<Workspace>` named `"default"`. |
| `make_routine(name, trigger, prompt)` | `fn` | Builds a `Routine` value with sensible test defaults (zero cooldown, enabled, empty state). The trigger and prompt are supplied by the caller. |
| `make_test_incoming_message(content)` | `fn` | Builds an `IncomingMessage` on channel `"test"` with `user_id = "default"` and the supplied content string. |
| `make_minimal_engine(trace, db, ws)` | `fn` | Constructs a `RoutineEngine` backed by a `TraceLlm` replaying the given `LlmTrace`, a default `ToolRegistry`, and a `SafetyLayer` with injection checking enabled. Returns both the engine and the mpsc notification receiver so tests can observe or drain the channel. |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (review_instructions): The acronym “mpsc” is introduced without being defined, which violates the requirement to define uncommon acronyms on first use.

Please expand “mpsc” (for example, “multi-producer, single-consumer (mpsc) notification receiver”) the first time it appears in the document.

Review instructions:

Path patterns: **/*.md

Instructions:
Define uncommon acronyms on first use.

| `register_github_issue_routine(db, engine)` | `async fn` | Inserts a `github`/`issue.opened` routine into the database and refreshes the engine's event cache. Used by system-event trigger tests. |
| `assert_system_event_count(engine, spec, expected, msg)` | `async fn` | Emits a system event described by `SystemEventSpec` and asserts the number of fired routines equals `expected`. |
<!-- markdownlint-enable MD013 -->

#### `SystemEventSpec`

`SystemEventSpec<'a>` is a plain data struct that groups the three fields
needed to describe a system event in tests:

- `source` — the event source (e.g., `"github"`).
- `event_type` — the event type string (e.g., `"issue.opened"`).
- `payload` — a `serde_json::Value` carrying event-specific metadata.

Construct it with `SystemEventSpec::new(source, event_type, payload)`.

#### Writing a new routine E2E test

A typical routine E2E test follows this pattern:

1. Call `create_test_db().await.expect(...)` to provision a database.
2. Call `create_workspace(&db)` to create the workspace.
3. Build a `LlmTrace` with the expected LLM responses for the test scenario.
4. Call `make_minimal_engine(trace, db.clone(), ws)` to get an engine and
notification receiver.
5. Call `make_routine(...)` to build the routine value, customise any fields
(e.g., `next_fire_at`, `guardrails.cooldown`), then persist it with
`db.create_routine(&routine).await.expect(...)`.
6. For event-triggered tests, call `engine.refresh_event_cache().await` after
inserting routines.
7. Poll `db.list_routine_runs(routine.id, 10)` in a bounded loop (rather than
sleeping for a fixed duration) to wait for the spawned task to complete.

### 2.6 Manual and ignored tests

Some tests are intentionally excluded from the default path because they need
manual setup or environmental control.
Expand Down Expand Up @@ -308,4 +357,4 @@ Table 3. Suggested validation depth by change type.
The important pattern is not "run everything, every time". The strategy is to
keep the default path fast enough for normal work, keep higher-confidence paths
available for risky changes, and preserve deterministic suites, so failures are
actionable rather than noisy.
actionable rather than noisy.
14 changes: 11 additions & 3 deletions tests/support/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -360,7 +360,11 @@ const _: fn() = test_rig_symbol_refs;
const _: fn() = routines_symbol_refs;

#[cfg(feature = "libsql")]
#[allow(dead_code)]
#[expect(
dead_code,
reason = "compile-time signature assertion function; invoked only via a \
const reference to force type-checking at build time"
)]
fn routines_symbol_refs() {
// Compile-time type assertions for routines module helpers.
// These ensure the public API signatures remain stable.
Expand All @@ -373,7 +377,11 @@ fn routines_symbol_refs() {
&str,
) -> ironclaw::agent::routine::Routine = routines::make_routine;
const _: fn(&str) -> ironclaw::channels::IncomingMessage = routines::make_test_incoming_message;
#[allow(clippy::type_complexity)]
#[expect(
clippy::type_complexity,
reason = "the tuple return type is intentional: this const asserts the full \
make_minimal_engine signature including the mpsc receiver"
)]
const _: fn(
trace_llm::LlmTrace,
std::sync::Arc<dyn ironclaw::db::Database>,
Expand All @@ -382,4 +390,4 @@ fn routines_symbol_refs() {
std::sync::Arc<ironclaw::agent::routine_engine::RoutineEngine>,
tokio::sync::mpsc::Receiver<ironclaw::channels::OutgoingResponse>,
) = routines::make_minimal_engine;
}
}
156 changes: 147 additions & 9 deletions tests/support/routines.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,14 +24,22 @@ use ironclaw::workspace::Workspace;
use crate::support::trace_llm::{LlmTrace, TraceLlm};

/// Describes a system event to be emitted in tests.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support struct used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub struct SystemEventSpec<'a> {
pub source: &'a str,
pub event_type: &'a str,
pub payload: serde_json::Value,
}

#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support constructor used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
impl<'a> SystemEventSpec<'a> {
pub fn new(source: &'a str, event_type: &'a str, payload: serde_json::Value) -> Self {
Self {
Expand All @@ -43,7 +51,11 @@ impl<'a> SystemEventSpec<'a> {
}

/// Create a temp libSQL database with migrations applied.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support helper used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub async fn create_test_db() -> Result<(Arc<dyn Database>, TempDir), Box<dyn std::error::Error>> {
use ironclaw::db::libsql::LibSqlBackend;

Expand All @@ -56,13 +68,21 @@ pub async fn create_test_db() -> Result<(Arc<dyn Database>, TempDir), Box<dyn st
}

/// Create a workspace backed by the test database.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support helper used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub fn create_workspace(db: &Arc<dyn Database>) -> Arc<Workspace> {
Arc::new(Workspace::new_with_db("default", db.clone()))
}

/// Helper to insert a routine directly into the database.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support helper used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub fn make_routine(name: &str, trigger: Trigger, prompt: &str) -> Routine {
Routine {
id: Uuid::new_v4(),
Expand Down Expand Up @@ -93,7 +113,11 @@ pub fn make_routine(name: &str, trigger: Trigger, prompt: &str) -> Routine {
}

/// Build a minimal IncomingMessage for event-trigger tests.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support helper used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub fn make_test_incoming_message(content: &str) -> IncomingMessage {
IncomingMessage {
id: Uuid::new_v4(),
Expand All @@ -110,7 +134,11 @@ pub fn make_test_incoming_message(content: &str) -> IncomingMessage {
}

/// Build a minimal RoutineEngine from a TraceLlm, returning both the engine and the notify receiver.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support helper used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub fn make_minimal_engine(
trace: LlmTrace,
db: Arc<dyn Database>,
Expand Down Expand Up @@ -140,7 +168,11 @@ pub fn make_minimal_engine(
}

/// Register a GitHub issue routine for system event tests.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support helper used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub async fn register_github_issue_routine(
db: &Arc<dyn Database>,
engine: &RoutineEngine,
Expand All @@ -162,7 +194,11 @@ pub async fn register_github_issue_routine(
}

/// Assert that a system event fires the expected number of routines.
#[allow(dead_code)]
#[expect(
dead_code,
reason = "public test-support helper used across feature-gated E2E test modules; \
the compiler cannot see cross-module test usage"
)]
pub async fn assert_system_event_count(
engine: &RoutineEngine,
spec: SystemEventSpec<'_>,
Expand All @@ -174,3 +210,105 @@ pub async fn assert_system_event_count(
.await;
assert_eq!(fired, expected, "{msg}");
}

#[cfg(test)]
mod tests {
use ironclaw::agent::routine::Trigger;

use super::{SystemEventSpec, create_test_db, create_workspace, make_routine,
make_test_incoming_message};

#[test]
fn make_routine_sets_fields() {
let trigger = Trigger::Cron {
schedule: "* * * * *".to_string(),
timezone: None,
};
let routine = make_routine("my-routine", trigger, "Do the thing.");

assert_eq!(routine.name, "my-routine", "name should match argument");
assert!(
routine.description.contains("my-routine"),
"description should reference the routine name"
);
assert_eq!(
routine.user_id, "default",
"user_id should be 'default' for test routines"
);
assert!(routine.enabled, "test routines should be enabled");
assert_eq!(
routine.run_count, 0,
"run_count should start at zero"
);
assert!(routine.last_run_at.is_none(), "last_run_at should be None");
assert!(routine.next_fire_at.is_none(), "next_fire_at should be None");

match &routine.action {
ironclaw::agent::routine::RoutineAction::Lightweight { prompt, .. } => {
assert_eq!(prompt, "Do the thing.", "action prompt should match argument");
}
}
}

#[test]
fn make_test_incoming_message_sets_content() {
let msg = make_test_incoming_message("hello world");

assert_eq!(
msg.content, "hello world",
"content should match argument"
);
assert_eq!(
msg.channel, "test",
"channel should be 'test' for test messages"
);
assert_eq!(
msg.user_id, "default",
"user_id should be 'default' for test messages"
);
assert!(msg.user_name.is_none(), "user_name should be None");
assert!(msg.thread_id.is_none(), "thread_id should be None");
assert!(msg.attachments.is_empty(), "attachments should be empty");
}

#[test]
fn system_event_spec_new_stores_fields() {
let payload = serde_json::json!({"key": "value"});
let spec = SystemEventSpec::new("github", "issue.opened", payload.clone());

assert_eq!(spec.source, "github", "source should match argument");
assert_eq!(
spec.event_type, "issue.opened",
"event_type should match argument"
);
assert_eq!(spec.payload, payload, "payload should match argument");
}

#[tokio::test]
async fn create_test_db_returns_usable_database() {
let (db, _tmp) = create_test_db()
.await
.expect("create_test_db should succeed");

// Verify the database is functional by listing routines (empty result is fine).
let routines = db.list_routines("default").await;
assert!(
routines.is_ok(),
"database should be usable after creation: {routines:?}"
);
}

#[tokio::test]
async fn create_workspace_returns_workspace_with_correct_name() {
let (db, _tmp) = create_test_db()
.await
.expect("create_test_db should succeed");
let ws = create_workspace(&db);

assert_eq!(
ws.name(),
"default",
"workspace name should be 'default'"
);
}
}
Loading