-
-
-# Werkr Server/Agent features:
-
-## A Workflow-Centric Design:
-The Werkr project primarily operates on a workflow, or directed acyclic graph (DAG), model.
-The workflow model and DAG visualizations allow you to easily create and manage series of interconnected tasks.
-
-
-
-## Schedulable Tasks:
-Tasks are the fundamental building blocks of your automation workflows.
-Tasks can be scheduled to run inside or outside of a workflow.
- * Tasks outside a workflow can be scheduled to run at specific times, on pre-defined and cyclical schedules.
- * Tasks ran inside a workflow have additional trigger mechanisms[*](#flexible-task-triggers).
-
-
-
-## Versatile Task Types:
-Choose from five primary task types to build your workflow(s):
-
-
-### System-Defined Tasks:
-Perform common operations like file and directory manipulation with ease, thanks to Werkr's prebuilt system tasks.
-Enjoy consistent output parameters and error handling for the following operations:
- * File and directory creation.
- * Moving and copying files and directories.
- * Deleting files and directories.
- * Determine whether files or directories exist.
- * Write pre-defined and dynamic content to a file.
-
-
-### PowerShell Script Execution:
-Run PowerShell scripts effortlessly and receive standard PowerShell outputs.
-
-
-### PowerShell Command Execution:
-Execute PowerShell commands and access standard PowerShell outputs.
-
-
-### System Shell Command Execution:
-Run commands in your operating system's native shell and get the exit code from the command execution.
-
-
-### User-Defined Tasks:
-Customize your workflows by creating your own tasks. Combine system-defined tasks, PowerShell scripts or commands,
-and native command executions into your own free-form repeatable task.
-Branch, iterate, and handle exceptions with ease!
-
-
-
-## Flexible Task Triggers:
-Start your tasks using various triggers, or combinations of triggers, including:
- * FileWatch
- * Monitor file system events in real-time or by polling a path periodically.
- * DateTime
- * Set a specific time to run your tasks.
- * On an Interval/Cyclically
- * Run tasks periodically.
- * Task Completion States
- * Trigger tasks based on the completion state of other tasks within the same workflow.
- * Workflow Completion State
- * Trigger tasks based on the operating state of an outside workflow.
-
-
-
-
-
-# Example Use Cases:
-* (Placeholder)
-
-
-
-
-
-# Security:
-The Werkr project has a wide variety of very powerful tools. So, security is taken quite seriously and there are some
-mandatory steps that must be taken to set up the server and agent initially.
-
-* TLS certificates are mandatory for the scheduler and agent.
-* The server and agent undergo an API key registration process before tasks can be run on the system.
- * The agent generates an API key on first startup (and upon request thereafter). The generated API key must be
- registered with a server within 12 hours of its generation.
- * The server and agent then perform a mutual registration process using the API key where they record the opposing
- parties' certificate information (ex HSTS information?).
-
-## Additional Security Considerations:
-* Access Control
- * The scheduler has built-in user roles that make it easy to restrict access to key and sensitive parts of the system.
-* Allowed Hosts
- * Both the scheduler and agent can restrict access via an allowed hosts list.
-* Native 2fa support (TOTP) is built in to the scheduler.
-
-
-
-
-
-# Licensing and Support:
-The Werkr Task Automation Project is offered free of charge, without any warranties, under an
-[MIT license](https://docs.werkr.app/LICENSE.html)!
-Unfortunately, it does not come with any form of guaranteed or implied support.
-Best effort support and triage will be offered on a volunteer basis via a
-[GitHub issue](https://werkr.App/issues/new/choose) process.
-
-
-
-
-
-# Quick Start Guide:
-* (Placeholder)
-* Example 1: ...
-* Example 2: ...
-
-
-
-
-
-# Contributing:
-The Werkr Task Automation Project is in its early stages and we're excited that you're interested in contributing!
-We believe that open collaboration is key to the project's success and growth.
-We welcome contributions from developers, users, and anyone interested in task automation and workflow orchestration.
-
-All official project collaboration will occur via
-[GitHub issues](https://werkr.App/issues/new/choose) or [discussions](https://werkr.App/discussions).
-
-The project has been split into multiple different repositories to keep thing more specific and focused,
-so when looking for code please be aware of the following repositories.
-* [Werkr.App](https://werkr.App)
- * The primary documentation repository. Also hosts github pages.
-* [Werkr.Server](https://server.werkr.app)
- * The scheduler and primary UI interface for the project.
-* [Werkr.Agent](https://agent.werkr.app)
- * The agent software that performs the requested tasks.
-* [Werkr.Common](https://common.werkr.app)
- * A shared library used by both the Werkr Server and Agent.
-* [Werkr.Common.Configuration](https://commonconfiguration.werkr.app)
- * A shared configuration library used by both the Werkr Server and Agent. This is also used by the windows installer.
-* [Werkr.Installers](https://installers.werkr.app)
- * A shared [Wix](https://wixtoolset.org/) CustomAction library used by both the Werkr Server and Agent.
- This library is used in the Msi install process.
-
-## Feedback, Suggestions, and Feature Requests:
-Do you have an idea for a new feature or enhancement? We'd love to hear it!
-As the project is still in its early stages, your feedback and suggestions are invaluable.
-We encourage you to share your thoughts on features, improvements, and potential use cases.
-You can submit your ideas by creating a
-[new feature request](https://werkr.App/issues/new?template=feature_request.yaml).
-Be sure to provide a clear description of your proposal and its potential benefits.
-
-## Documentation Improvements:
-If you have suggestions for additional documentation, or corrections for existing documentation, then please submit a
-[documentation improvement request](https://werkr.App/issues/new?template=improve_documentation.yaml).
-
-## Bug Reports:
-Please report any bugs, performance issues, or security vulnerabilities you encounter while using the Werkr Task
-Automation project by opening a
-[new bug report](https://werkr.App/issues/new?&template=bug_report.yaml).
-Be sure to include as much information as possible, such as steps to reproduce the issue, any error messages,
-your system's configuration, and any additional context you think we should be aware of.
-
-## Code Contributions:
-If you'd like to contribute code directly to the project, please fork the repository, create a new branch, and submit
-a pull request with your changes. We encourage you to follow our existing coding style and conventions.
-Make sure to include a detailed description of your changes in the pull request.
-
-Additionally you will need to agree to the
-[Contribution License Agreement](https://werkr.App/issues/new?template=cla_agreement.yml)
-before your PR will be merged.
-
-We appreciate all contributions, big or small, and look forward to building a vibrant and collaborative community
-around the Werkr Task Automation Project. Thank you for your support!
+# Werkr — Open Source Task Automation & Workflow Orchestration
+
+
+
+Werkr is a task automation and workflow orchestration platform built on .NET 10. You can schedule individual tasks, chain them together into directed acyclic graph (DAG) workflows, and let Werkr handle the execution across your infrastructure.
+
+The project has three core components — a **Server** (Blazor UI + Identity), an **API** (application data and gRPC services), and an **Agent** (task execution worker). Server-to-API and user-facing connections use HTTPS; API-to-Agent communication uses encrypted gRPC with AES-256-GCM envelope encryption.
+
+Currently supported on **Windows 10+** and **Linux** (x64 and arm64). macOS support is planned.
+
+
+
+# Task Management
+
+You can predefine tasks to run on a schedule, create ad-hoc tasks to run immediately, set start and end times, or combine tasks into workflow DAGs for more complex automation. Workflows support dependency-based execution, branching logic, and condition evaluation.
+
+Visit [docs.werkr.app](https://docs.werkr.app) to explore the full documentation.
+
+
+
+# Downloads
+
+- [Werkr Releases](https://github.com/DarkgreyDevelopment/Werkr.App/releases/latest)
+
+Both Server and Agent are offered as MSI installers (Windows) and portable editions. Functionally, the portable and installed editions are identical once running.
+
+For Windows, download the latest MSI installer for your CPU architecture (most likely x64).
+
+
+
+# Features
+
+## Workflow-Centric Design
+
+Werkr operates primarily on a workflow (DAG) model. You create tasks, link them together as workflow steps with dependency declarations, and Werkr handles topological ordering and execution. The `ConditionEvaluator` supports branching logic within workflows based on step outcomes.
+
+See `src/Werkr.Core/Workflows/` for the workflow engine implementation.
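
The topological-ordering idea can be sketched with Kahn's algorithm. This is an illustrative Python sketch, not the actual `Werkr.Core` implementation (which is .NET); the step names are hypothetical:

```python
from collections import deque

def topological_order(steps, depends_on):
    """Return an execution order for workflow steps, or raise if the graph has a cycle.

    steps      -- iterable of step names
    depends_on -- dict mapping each step to the set of steps it waits for
    """
    indegree = {s: len(depends_on.get(s, set())) for s in steps}
    dependents = {s: set() for s in steps}
    for step, deps in depends_on.items():
        for d in deps:
            dependents[d].add(step)

    # Steps with no unmet dependencies are ready to run.
    ready = deque(s for s, n in indegree.items() if n == 0)
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in dependents[step]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(indegree):
        raise ValueError("workflow graph contains a cycle (not a DAG)")
    return order

steps = ["fetch", "transform", "load", "notify"]
deps = {"transform": {"fetch"}, "load": {"transform"}, "notify": {"load"}}
print(topological_order(steps, deps))  # ['fetch', 'transform', 'load', 'notify']
```

A cycle in the dependency declarations is rejected up front, which is what makes the workflow a valid DAG before any step executes.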
+
+
+
+## Schedulable Tasks
+
+Tasks are the building blocks of your automation. They can run standalone on a schedule or as steps within a workflow.
+
+- **Standalone tasks** can be triggered on DateTime schedules or at recurring intervals (daily, weekly, monthly).
+- **Workflow tasks** are additionally triggered by dependency completion within the DAG, using configurable `DependencyMode` settings.
+- **Holiday Calendar** support lets you skip or shift scheduled occurrences on configured holidays, with audit logging for suppressed runs.
+
+See `src/Werkr.Core/Scheduling/` for schedule calculation and holiday date handling.
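
The skip-or-shift holiday behavior can be illustrated with a small Python sketch (hypothetical function names; the real scheduler additionally handles working-day patterns and audit logging for suppressed runs):

```python
from datetime import date, timedelta

def apply_holiday_policy(occurrence, holidays, policy):
    """Resolve a scheduled occurrence against a holiday calendar.

    policy -- "skip"  : suppress the run entirely (returns None)
              "shift" : move the run to the next non-holiday day
    """
    if occurrence not in holidays:
        return occurrence
    if policy == "skip":
        return None  # a real scheduler would audit-log the suppression here
    shifted = occurrence
    while shifted in holidays:
        shifted += timedelta(days=1)
    return shifted

holidays = {date(2025, 12, 25), date(2025, 12, 26)}
print(apply_holiday_policy(date(2025, 12, 25), holidays, "shift"))  # 2025-12-27
print(apply_holiday_policy(date(2025, 12, 25), holidays, "skip"))   # None
```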
+
+
+
+## Task Types
+
+Werkr supports five task types (defined in the `TaskActionType` enum):
+
+### Action
+Built-in handlers for common operations, with no scripting required. The current set of 26 action handlers covers:
+
+- **File operations**: copy, move, rename, create, delete, read, write, clear, find and replace, test existence, get info
+- **Directory operations**: create, list
+- **Process control**: start, stop
+- **Network and integration**: HTTP request, test connectivity, send email, send webhook, file download, file upload
+- **Archive operations**: compress, extract
+- **Other**: JSON manipulation, delay timer, file event watching
+
+Each action has consistent parameter handling and error reporting.
+
+See `src/Werkr.Agent/Operators/Actions/` for the full set of action handlers.
+
+### PowerShell Script
+Run PowerShell scripts with an embedded PowerShell 7+ host. You get standard PowerShell output streams (output, error, debug, verbose, warning), exit codes, and exception information.
+
+### PowerShell Command
+Execute individual PowerShell commands with the same output handling as script execution.
+
+### Shell Command
+Run commands in your operating system's native shell (cmd on Windows, bash/sh on Linux) and receive the process exit code.
+
+### Shell Script
+Execute shell scripts with the same native shell and exit code handling as shell commands.
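
The shell-command/exit-code contract can be shown with a minimal sketch. This Python stand-in is illustrative only; the Agent's real executor adds timeouts, output capture, and logging:

```python
import subprocess
import sys

def run_shell(command):
    """Run a command in the OS-native shell and return its exit code,
    roughly how a shell-command task reports success or failure.
    """
    # cmd on Windows, sh on POSIX systems (the Agent picks the native shell).
    shell = ["cmd", "/c"] if sys.platform == "win32" else ["sh", "-c"]
    result = subprocess.run(shell + [command], capture_output=True, text=True)
    return result.returncode

print(run_shell("exit 3"))  # 3
print(run_shell("exit 0"))  # 0
```

A non-zero exit code is what marks the job as failed, which in turn feeds the workflow's dependency and condition evaluation.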
+
+For complex multi-step automation, combine tasks into a **Workflow** (DAG) with dependency-based execution, branching, and condition evaluation.
+
+
+
+## Flexible Triggers
+
+- **DateTime** — Run tasks at a specific date and time.
+- **Interval/Cyclical** — Run tasks periodically (daily, weekly, monthly recurrence with repeat intervals).
+- **Task Completion** — Within a workflow, trigger steps based on the completion state of their dependencies (via `ConditionEvaluator` and `DependencyMode`).
+- **Holiday Calendar** — Automatically skip or shift occurrences on configured holidays.
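
The interval/cyclical trigger above boils down to "next occurrence after now" arithmetic, sketched here in Python (illustrative; Werkr's recurrence model also covers weekly/monthly patterns and holiday calendars):

```python
from datetime import datetime, timedelta

def next_occurrence(start, repeat_every, now):
    """Next run time for a cyclical schedule that fires every `repeat_every`,
    anchored at `start`.
    """
    if now <= start:
        return start
    # Count how many whole intervals have elapsed, then step one past `now`.
    elapsed = now - start
    intervals = elapsed // repeat_every + 1
    return start + intervals * repeat_every

start = datetime(2025, 1, 1, 9, 0)
print(next_occurrence(start, timedelta(days=1), datetime(2025, 1, 3, 10, 30)))
# 2025-01-04 09:00:00
```

Anchoring to `start` rather than to "last run plus interval" keeps occurrences from drifting when a run is delayed or skipped.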
+
+
+
+# 1.0 Roadmap
+
+The [Design Specification](docs/1.0-Target-Featureset.md) defines every capability required for the 1.0 release. Key features beyond what is currently implemented:
+
+- **Composite nodes** — ForEach, While, Do, and Switch nodes for iteration, looping, and conditional branching within workflows.
+- **Task & workflow versioning** — Immutable versions on every save, snapshot binding between workflow steps and task versions, and on-demand version diffs.
+- **Additional trigger types** — Cron expressions, persistent file monitoring, authenticated API triggers, workflow-completion triggers, and manual triggers from a unified trigger registry.
+- **Expanded action handlers** — OS service management (Windows Services, systemd, launchd).
+- **Workflow variables & expressions** — Typed variable system with step output capture, namespaced scoping, collection types, and a condition expression language for branching and loop constructs.
+- **Manual approval gates** — Pause workflow execution at designated steps until a human approves continuation.
+- **JSON import/export** — Portable, schema-versioned workflow definitions for backup, migration, and version control.
+- **Error handling & retry** — Configurable per-step strategies (fail workflow, skip, continue, run error handler, remediate before retry) with fixed, linear, or exponential backoff.
+- **Sensitive data redaction** — Regex-based automatic masking of passwords, tokens, and secrets in execution logs.
+- **Centralized configuration & credential management** — Database-backed settings with hot reload, encrypted credential storage with injection into task execution contexts.
+- **Notifications** — Email, webhook, and in-app notification channels with configurable subscriptions and templates.
+- **Enhanced security** — WebAuthn passkeys, database encryption at rest, scoped API keys with rate limiting, outbound request allowlisting, and Content Security Policy headers.
+- **Versioned REST API** — OpenAPI-documented endpoints with pagination, filtering, and CORS policy.
+- **Real-time UI** — SignalR-powered live updates for workflow run monitoring and log streaming.
+- **Re-execution & replay** — Resume from a failed step (preserving completed outputs) or replay an entire workflow from the beginning.
+
+See the full [Design Specification](docs/1.0-Target-Featureset.md) for complete details on every 1.0 capability.
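
The fixed/linear/exponential backoff strategies named in the roadmap can be expressed compactly. This is a sketch under assumed defaults (`base`, `cap` values are hypothetical), not the shipped implementation:

```python
def backoff_delay(strategy, attempt, base=5.0, cap=300.0):
    """Delay in seconds before retry number `attempt` (1-based).

    fixed       : base, base, base, ...
    linear      : base, 2*base, 3*base, ...
    exponential : base, 2*base, 4*base, ... capped at `cap`
    """
    if strategy == "fixed":
        delay = base
    elif strategy == "linear":
        delay = base * attempt
    elif strategy == "exponential":
        delay = base * (2 ** (attempt - 1))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return min(delay, cap)

print([backoff_delay("exponential", n) for n in (1, 2, 3, 4)])  # [5.0, 10.0, 20.0, 40.0]
```

The cap matters for exponential backoff: without it, a step retried many times would wait effectively forever instead of settling at a steady retry cadence.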
+
+
+
+# Security
+
+Security is a core design concern — there are mandatory steps for initial setup, and multiple layers protect the system at runtime.
+
+- **TLS certificates** are mandatory for all Server, API, and Agent connections.
+- **Agent registration** uses an admin-bundle model: an administrator creates a registration bundle on the Server containing the Server's RSA public key, transfers it to the Agent out-of-band, and the Agent completes registration via an encrypted gRPC handshake using RSA+AES hybrid encryption. This establishes a shared AES-256 symmetric key for all subsequent communication.
+- **Encrypted gRPC** — After registration, every gRPC payload is wrapped in an `EncryptedEnvelope` (AES-256-GCM). Key rotation is supported via the `RotateSharedKey` RPC.
+- **RBAC** — The Server has built-in permission-based role authorization to control access to features and data.
+- **TOTP 2FA** — Native two-factor authentication is built into the Server.
+- **Path allowlisting** — Agents validate file paths against a configurable allowlist before execution.
+- **Platform-native secret storage** — Secrets are stored using OS-native mechanisms (DPAPI on Windows, Keychain on macOS, file-based on Linux).
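
The path-allowlisting idea is simple but easy to get wrong; the key is to resolve the candidate path *before* comparing it against the allowed roots, so `..` traversal cannot escape. An illustrative Python sketch (not the Agent's actual validation code):

```python
from pathlib import Path

def is_path_allowed(candidate, allowlist):
    """True if `candidate` resolves to a location at or under one of the
    allowed roots. Resolving first defeats `..` traversal tricks.
    """
    resolved = Path(candidate).resolve()
    for root in allowlist:
        root = Path(root).resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False

allow = ["/var/werkr/jobs"]
print(is_path_allowed("/var/werkr/jobs/output.log", allow))        # True
print(is_path_allowed("/var/werkr/jobs/../../etc/passwd", allow))  # False
```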
+
+The 1.0 release adds WebAuthn passkey authentication, database encryption at rest, scoped API keys, centralized credential management, outbound request controls, and Content Security Policy headers. See the [Design Specification](docs/1.0-Target-Featureset.md) §9 for the full security model.
+
+See [Architecture.md](docs/Architecture.md) for the current security model breakdown.
+
+
+
+# Licensing and Support
+
+The Werkr project is offered free of charge, without any warranties, under an [MIT license](https://docs.werkr.app/LICENSE.html).
+
+Best effort support and triage is provided on a volunteer basis via [GitHub issues](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new/choose).
+
+
+
+# Quick Start Guide
+
+For developer setup (building from source, running locally with Aspire, running tests), see [Development.md](docs/Development.md).
+
+For end-user installation, see the [Windows Server Install](docs/articles/HowTo/WindowsServerInstall.md) and [Windows Agent Install](docs/articles/HowTo/WindowsAgentInstall.md) guides.
+
+
+
+# Contributing
+
+The Werkr project is in its early stages and we're excited that you're interested in contributing! We welcome contributions from developers, users, and anyone interested in task automation and workflow orchestration.
+
+All official project collaboration happens via [GitHub issues](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new/choose) or [discussions](https://github.com/DarkgreyDevelopment/Werkr.App/discussions).
+
+## Project Structure
+
+Werkr is a monorepo with all components under `src/`:
+
+| Project | Purpose |
+|---------|---------|
+| `Werkr.Server` | Blazor Server UI, ASP.NET Identity, SignalR, user authentication |
+| `Werkr.Api` | Application API, gRPC service host, schedule/task/workflow management |
+| `Werkr.Agent` | Task execution engine, embedded PowerShell host, built-in actions |
+| `Werkr.Core` | Shared business logic — scheduling, workflows, registration, cryptography |
+| `Werkr.Common` | Shared models, protobuf definitions, auth policies |
+| `Werkr.Common.Configuration` | Strongly-typed configuration classes |
+| `Werkr.Data` | EF Core database contexts and entities (PostgreSQL + SQLite) |
+| `Werkr.Data.Identity` | ASP.NET Identity database contexts and roles |
+| `Werkr.AppHost` | .NET Aspire orchestrator for local development |
+| `Werkr.ServiceDefaults` | Aspire service defaults (OpenTelemetry, health checks) |
+| `Installer/Msi/` | WiX MSI installer projects and custom actions |
+| `Test/Werkr.Tests` | Integration tests (Testcontainers + WebApplicationFactory) |
+| `Test/Werkr.Tests.Data` | Data layer unit tests |
+| `Test/Werkr.Tests.Server` | Server integration tests |
+| `Test/Werkr.Tests.Agent` | Agent end-to-end tests |
+
+See [Architecture.md](docs/Architecture.md) for the full architectural overview and [Development.md](docs/Development.md) for build/test/contribution instructions.
+
+## Feedback, Suggestions, and Feature Requests
+
+We'd love to hear your ideas! Submit a [feature request](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new?template=feature_request.yaml) with a clear description of your proposal and its potential benefits.
+
+## Documentation Improvements
+
+Have suggestions or corrections for the documentation? Submit a [documentation improvement request](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new?template=improve_documentation.yaml).
+
+## Bug Reports
+
+Please report any bugs, performance issues, or security vulnerabilities by opening a [bug report](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new?template=bug_report.yaml). Include steps to reproduce the issue, error messages, your system configuration, and any additional context.
+
+## Code Contributions
+
+Fork the repository, create a new branch from `develop`, and submit a pull request with your changes. Please follow the coding conventions described in [Development.md](docs/Development.md) and include a detailed description in the pull request.
+
+You will need to agree to the [Contribution License Agreement](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new?template=cla_agreement.yml) before your PR is merged.
+
+We appreciate all contributions and look forward to building a collaborative community around Werkr. Thank you for your support!
diff --git a/Werkr.slnx b/Werkr.slnx
new file mode 100644
index 0000000..d8ea0ee
--- /dev/null
+++ b/Werkr.slnx
@@ -0,0 +1,65 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/docker-compose.yml b/docker-compose.yml
new file mode 100644
index 0000000..24df598
--- /dev/null
+++ b/docker-compose.yml
@@ -0,0 +1,171 @@
+# ---------------------------------------------------------------------------
+# Werkr — Docker Compose (Server + API + Agent + PostgreSQL)
+#
+# Usage:
+# pwsh scripts/docker-build.ps1 # generates certs (first run) + builds images
+# docker compose up -d # start all services
+# docker compose down -v # stop & remove volumes
+#
+# For .deb-based builds (production):
+# pwsh scripts/docker-build.ps1 -Deb
+# docker compose up -d
+#
+# To override build mode via env:
+# BUILD_MODE=deb docker compose build
+#
+# The Server UI is available at https://localhost:5050
+# Default admin credentials are seeded on first start.
+# API: https://localhost:5001 Agent: https://localhost:5100
+#
+# TLS certificates are generated by docker-build.ps1 into certs/.
+# Control plane cert (Server + API): certs/werkr-server.pfx
+# Agent cert: certs/werkr-agent.pfx
+# CA cert (for verification): certs/werkr-ca.pem
+# ---------------------------------------------------------------------------
+
+services:
+ # ---------- PostgreSQL ----------
+ postgres:
+ image: postgres:17-alpine
+ restart: unless-stopped
+ environment:
+ POSTGRES_USER: werkr
+ POSTGRES_PASSWORD: werkr_dev_password
+ POSTGRES_DB: werkrdb
+ ports:
+ - "5432:5432"
+ volumes:
+ - pgdata:/var/lib/postgresql/data
+ healthcheck:
+ test: ["CMD-SHELL", "pg_isready -U werkr -d werkrdb"]
+ interval: 5s
+ timeout: 5s
+ retries: 10
+
+ # ---------- Werkr API ----------
+ werkr-api:
+ image: ${DOCKER_REGISTRY:-ghcr.io/werkr}/werkr-api:${DOCKER_TAG:-latest}
+ platform: linux/amd64
+ build:
+ context: .
+ dockerfile: src/Werkr.Api/Dockerfile
+ args:
+ BUILD_MODE: ${BUILD_MODE:-source}
+ restart: unless-stopped
+ depends_on:
+ postgres:
+ condition: service_healthy
+ environment:
+ ASPNETCORE_ENVIRONMENT: Development
+ DOTNET_ENVIRONMENT: Development
+ ASPNETCORE_URLS: "https://+:8443"
+ ASPNETCORE_Kestrel__Certificates__Default__Path: /app/certs/werkr-server.pfx
+ ASPNETCORE_Kestrel__Certificates__Default__Password: werkr-dev
+ SSL_CERT_FILE: /app/certs/werkr-ca.pem
+ ConnectionStrings__werkrdb: "Host=postgres;Port=5432;Database=werkrdb;Username=werkr;Password=werkr_dev_password"
+ Werkr__ServerUrl: "https://werkr-api:8443"
+ Jwt__SigningKey: "werkr-dev-signing-key-do-not-use-in-production-min32chars!"
+ Jwt__Issuer: "werkr-api"
+ Jwt__Audience: "werkr"
+ WERKR_CONFIG_PATH: /app/config
+ volumes:
+ - api-config:/app/config
+ - ./certs/werkr-server.pfx:/app/certs/werkr-server.pfx:ro
+ - ./certs/werkr-ca.pem:/app/certs/werkr-ca.pem:ro
+ ports:
+ - "5001:8443"
+ healthcheck:
+ test: ["CMD", "curl", "--cacert", "/app/certs/werkr-ca.pem", "-f", "https://localhost:8443/health"]
+ interval: 10s
+ timeout: 10s
+ retries: 10
+ start_period: 45s
+
+ # ---------- Werkr Agent ----------
+ werkr-agent:
+ image: ${DOCKER_REGISTRY:-ghcr.io/werkr}/werkr-agent:${DOCKER_TAG:-latest}
+ platform: linux/amd64
+ build:
+ context: .
+ dockerfile: src/Werkr.Agent/Dockerfile
+ args:
+ BUILD_MODE: ${BUILD_MODE:-source}
+ restart: unless-stopped
+ depends_on:
+ werkr-api:
+ condition: service_healthy
+ environment:
+ ASPNETCORE_ENVIRONMENT: Development
+ DOTNET_ENVIRONMENT: Development
+ ASPNETCORE_URLS: "https://+:8443"
+ ASPNETCORE_Kestrel__Certificates__Default__Path: /app/certs/werkr-agent.pfx
+ ASPNETCORE_Kestrel__Certificates__Default__Password: werkr-dev
+ SSL_CERT_FILE: /app/certs/werkr-ca.pem
+ Agent__Name: "Docker Agent"
+ Agent__EnablePowerShell: "true"
+ Agent__EnableSystemShell: "true"
+ Werkr__AgentUrl: "https://werkr-agent:8443"
+ WERKR_CONFIG_PATH: /app/config
+ WERKR_DATA_DIR: /var/lib/werkr
+ JobOutput__OutputDirectory: /var/lib/werkr/job-output
+ volumes:
+ - agent-data:/var/lib/werkr
+ - agent-config:/app/config
+ - ./certs/werkr-agent.pfx:/app/certs/werkr-agent.pfx:ro
+ - ./certs/werkr-ca.pem:/app/certs/werkr-ca.pem:ro
+ ports:
+ - "5100:8443"
+ healthcheck:
+ test: ["CMD", "curl", "--cacert", "/app/certs/werkr-ca.pem", "-f", "https://localhost:8443/health"]
+ interval: 10s
+ timeout: 10s
+ retries: 10
+ start_period: 45s
+
+ # ---------- Werkr Server (Blazor UI) ----------
+ werkr-server:
+ image: ${DOCKER_REGISTRY:-ghcr.io/werkr}/werkr-server:${DOCKER_TAG:-latest}
+ platform: linux/amd64
+ build:
+ context: .
+ dockerfile: src/Werkr.Server/Dockerfile
+ args:
+ BUILD_MODE: ${BUILD_MODE:-source}
+ restart: unless-stopped
+ depends_on:
+ werkr-api:
+ condition: service_healthy
+ werkr-agent:
+ condition: service_healthy
+ environment:
+ ASPNETCORE_ENVIRONMENT: Development
+ DOTNET_ENVIRONMENT: Development
+ ASPNETCORE_URLS: "https://+:8443"
+ ASPNETCORE_Kestrel__Certificates__Default__Path: /app/certs/werkr-server.pfx
+ ASPNETCORE_Kestrel__Certificates__Default__Password: werkr-dev
+ SSL_CERT_FILE: /app/certs/werkr-ca.pem
+ ConnectionStrings__werkrdb: "Host=postgres;Port=5432;Database=werkrdb;Username=werkr;Password=werkr_dev_password"
+ services__api__https__0: "https://werkr-api:8443"
+ services__agent__https__0: "https://werkr-agent:8443"
+ WERKR_CONFIG_PATH: /app/config
+ volumes:
+ - server-config:/app/config
+ - server-keys:/app/keys
+ - ./certs/werkr-server.pfx:/app/certs/werkr-server.pfx:ro
+ - ./certs/werkr-ca.pem:/app/certs/werkr-ca.pem:ro
+ ports:
+ - "5050:8443"
+ healthcheck:
+ test: ["CMD", "curl", "--cacert", "/app/certs/werkr-ca.pem", "-f", "https://localhost:8443/health"]
+ interval: 10s
+ timeout: 10s
+ retries: 10
+ start_period: 45s
+
+volumes:
+ pgdata:
+ agent-data:
+ agent-config:
+ api-config:
+ server-config:
+ server-keys:
diff --git a/docs/1.0-Target-Featureset.md b/docs/1.0-Target-Featureset.md
new file mode 100644
index 0000000..1fe84eb
--- /dev/null
+++ b/docs/1.0-Target-Featureset.md
@@ -0,0 +1,1445 @@
+# Design Specification: Werkr 1.0 Target Featureset
+
+## 1. Vision & Audience
+
+**Document Purpose** — This document is the definitive 1.0 featureset declaration for the Werkr platform. It describes the complete set of capabilities that must be implemented before the platform is designated as version 1.0. It is not a roadmap, architecture specification, or implementation guide — it is a customer-facing statement of what the 1.0 release will deliver. Additional features may be shipped in pre-release versions or added beyond this set, but the 1.0 milestone is not reached until every definition in this document is met. The document uses descriptive prose rather than RFC-style normative language (MUST/SHOULD/MAY) to maintain accessibility for both technical and non-technical readers.
+
+Werkr is a **self-hosted workflow orchestration platform** for automating operational tasks across Windows, Linux, and macOS. It targets two audiences:
+
+- **DevOps / Platform engineers** — code-first workflow definition via JSON, API-driven automation, full environment replication across instances, powerful debugging and re-execution tooling, CI/CD integration.
+- **IT operations / Business users** — intuitive visual workflow builder, real-time run monitoring, role-based access, manual approval gates, and a UI that makes building, debugging, and managing workflows immediate and approachable.
+
+### Core Tenets
+
+- **Dual-mode interaction** — visual drag-and-drop editor and JSON import/export. Both audiences work in one platform without compromise.
+- **Enterprise security without enterprise cost** — encrypted gRPC, granular RBAC, TOTP 2FA, WebAuthn passkeys, database encryption at rest, centralized credential management. Open-source and MIT-licensed.
+- **Native distributed agent model** — agents execute tasks on remote hosts with end-to-end encrypted communication. Agents receive all notifications and state changes via gRPC push. No application-level polling for state changes, no third-party message brokers. Agent heartbeats use a lightweight periodic check-in; infrastructure health endpoints (`/health` [Server/Api], `/alive` [Agent]) are available for external monitoring but are not part of the application state synchronization model.
+- **Self-hosted, zero vendor lock-in** — runs on customer infrastructure with PostgreSQL or SQLite. No cloud dependency, no mandatory telemetry.
+- **Event-driven automation** — schedules, cron expressions, file monitors, API triggers, and workflow-completion triggers from a unified trigger registry.
+- **Extensible architecture** — modular agent design, hierarchical permission model, independent trigger registry, and versioned APIs ensure the platform grows without breaking existing deployments.
+
+### Architectural Design Principles
+
+The 1.0 architecture ensures that all public contracts (APIs, schemas, protos, permission model, agent binary) support additive evolution. New capabilities — new tables, new endpoints, new gRPC services, new UI pages — can be introduced in future releases with limited risk of semver-breaking changes.
+
+Specific extensibility foundations:
+- Versioned REST API with additive evolution rules.
+- Additive schema evolution for database and JSON export formats.
+- Extensible permission model with hierarchical naming conventions.
+- Modular agent architecture with capability registration.
+- Schema-versioned JSON export with typed, extensible sections.
+
+The 1.0 deployment model is single-tenant — one deployment serves one organization.
+
+Server-side extensibility is achieved through composable registries — trigger types, permissions, notification channels, retention rules, and API endpoint domains are independently registered at application startup.
+
+**System Architecture** — Werkr uses a three-tier architecture: **Werkr.Server** (web UI, identity provider), **Werkr.Api** (REST API, gRPC host, application logic), and **Werkr.Agent** (remote execution, schedule evaluation). Server → API via REST; API → Agents via encrypted gRPC. No direct Server-Agent communication.
+
+---
+
+## 2. Glossary
+
+| Term | Definition |
+|------|------------|
+| **Task** | A reusable unit of work definition — a named configuration that specifies what to execute (action handler, script, or command), its parameters, and its runtime constraints. Tasks exist independently of workflows. |
+| **Step** | A node in a workflow DAG that references a task. A step binds a task to a specific position in the workflow graph, adding dependency declarations, variable bindings, error handling strategy, targeting configuration, and optional approval gate configuration. |
+| **Workflow** | A directed acyclic graph (DAG) of steps with dependency edges defining execution order. A workflow is a versioned, reusable automation definition. |
+| **Run** | A single execution instance of a workflow at a specific version. A run tracks the state and outputs of every step from start to terminal status. Also referred to as a workflow run. |
+| **Job** | A single execution of a task on an agent. Each step in a run produces one job (or multiple jobs if retried). |
+| **Trigger** | A configured event source that initiates a workflow run — schedules, file events, API calls, manual invocation, or workflow completion. |
+| **Schedule** | A time-based trigger configuration — cron expressions, intervals, or fixed date/time values with optional holiday calendar references. |
+| **Agent** | A remote execution host running the Werkr.Agent process. Agents execute jobs, evaluate schedule triggers, and communicate with the API via encrypted gRPC. |
+| **Module** | A self-contained functional package for the agent that registers its own gRPC services, background tasks, configuration handlers, and database tables. Modules are classified as **built-in** (always active) or **extension** (optional). |
+| **DAG** | Directed Acyclic Graph — the execution graph structure of a workflow. Nodes are steps; edges are dependency relationships. |
+| **Action Handler** | A built-in, code-free automation primitive (e.g., Copy File, HTTP Request, Send Email) that a task can invoke without requiring a script. |
+| **Composite Node** | A DAG node that encapsulates a nested child workflow (e.g., While, Do, ForEach loops, Switch conditional branching). The outer DAG sees one node; the inner child workflow executes within the composite node's scope. Composite nodes are visually rendered as single expandable nodes in the DAG editor with navigation between inner and outer DAG views. |
+| **Calendar** | A named configuration that defines working days and holidays for schedule suppression and business-day calculations (see §4 Schedule Configuration). |
+| **Holiday Rule** | A fixed date or recurring date pattern within a calendar that defines non-working days (see §4 Schedule Configuration). |
+| **Business Day** | Any day that matches a calendar's working-day pattern and is not a holiday (see §4 Schedule Configuration). |
+| **Capacity Unit** | One actively executing workflow task on an agent, used as the unit of measure for agent concurrency limits. Background operations do not consume capacity units (see §10 Resource Management). |
+| **Trigger Context** | Event-source data injected as workflow input variables when a trigger fires (see §4 Trigger Context). |
+| **Approval Gate** | A step configuration that pauses workflow execution and requires explicit human approval before the step proceeds (see §5 Manual Approval Gates). |
+| **Variable** | A named value accessible within a workflow run. Variables are scoped to namespaces (e.g., step, workflow, trigger, system) and support string, number, boolean, null, and collection types (see §5 Workflow Variables). |
+| **Error Handler** | A designated task that executes when its owning step fails, providing remediation logic before the step is marked as recovered or the workflow fails (see §3 Error Handler Steps). |
+| **Dependency Mode** | A configuration on a step that determines which upstream step outcomes allow the step to proceed (see §5 Dependency Modes). |
+| **Retention Policy** | A configurable time-based rule that governs automatic deletion of historical data per entity type (see §12 Data Management & Retention). |
+| **Notification Channel** | A configured delivery mechanism (Email, Webhook, In-App) through which the platform sends event notifications (see §8 Notifications). |
+| **Expression** | A typed condition statement composed of literals, variable references, comparisons, and logical operators, used in branching and loop constructs (see §5 Expression Language). |
+| **Correlation ID** | A user-defined identifier attached to a workflow run for cross-system traceability (see §5 Correlation IDs). |
+| **Server** | The Werkr.Server component — a Blazor Server web application that provides the UI and identity provider (see §10 Three-Tier Topology). |
+| **API** | The Werkr.Api component — the REST API and gRPC host that manages application logic, workflow orchestration, and agent communication (see §10 Three-Tier Topology). |
+| **Execution** | The act of running a task on an agent. A single step execution encompasses the full lifecycle from agent dispatch through terminal state, including any error handler invocations and retry attempts. |
+| **Dispatch** | The act of assigning a queued step to a specific agent for execution. Dispatch occurs when an agent with matching capabilities and available capacity is identified. |
+| **Re-Execution** | Resuming or replaying a previously completed or failed workflow run. Includes both "retry from failed step" (preserving completed outputs) and "replay" (full re-run from the beginning). See §5 Re-Execution and Replay Mode. |
+
+---
+
+## 3. Task Engine
+
+The task engine defines, stores, validates, and executes individual units of work on agents.
+
+### Task Management
+
+- **Task CRUD** — create, edit, delete, and clone tasks with configurable task-level validity windows (start date, end date) that define when the task definition is active, and maximum run durations. Validity windows are evaluated at step dispatch time. A step referencing a task outside its validity window fails with a validation error.
+- **Five task types** — Action (built-in handlers, no scripting required), PowerShell Script, PowerShell Command, Shell Script, Shell Command. *Script* task types (PowerShell Script, Shell Script) reference an executable file on disk. *Command* task types (PowerShell Command, Shell Command) are file-less, typically single-line inline executions.
+- **Task validation** — malformed task definitions are rejected with specific validation errors at save time.
+- **Task output handling** — exit codes, output previews, and full log retrieval for every execution.
+- **Maximum run duration enforcement** — tasks exceeding their configured time limit (default: 1 hour, configurable per task) are terminated with an appropriate status and audit log entry.
+
+### Task Versioning
+
+- Immutable task versions created on each task save.
+- Version history browsable in the UI.
+- Steps in a workflow version reference a specific task version (snapshot binding). Editing a task creates a new version; existing workflow versions continue to reference their originally bound task version.
+- Task version diffs are computed on-demand for comparison.
+
+### Built-in Action Handlers
+
+The following built-in action handlers enable common automation without writing scripts:
+
+| Category | Actions |
+|----------|---------|
+| **File operations** | Copy, Move, Rename, Create, Delete, Write Content, Clear Content, Test Exists, Get Info, Read Content, Find and Replace |
+| **Directory operations** | Create Directory, List Directory |
+| **Process management** | Start Process, Stop Process |
+| **Network & integration** | Test Connectivity, HTTP Request, Send Email, Send Webhook, File Download, File Upload |
+| **Archive** | Archive (compress), Extract (decompress) |
+| **Data** | JSON Manipulation |
+| **Control flow** | Delay |
+| **File monitoring** | Wait for File Event **(step-level, blocking)** — watches a directory within a step's execution and completes when a matching file event occurs or the wait times out. Watched paths are validated against the agent's path allowlist. This action handler is distinct from the File Monitor trigger type (§4), which is a persistent trigger that initiates new runs. |
+| **OS service management** | Start Service, Stop Service, Restart Service, Query Service Status (Windows Services, Linux systemd, macOS launchd) |
+
+### Composite Node Types
+
+Four composite node types provide iteration, looping, and conditional control flow. Unlike action handlers, composite nodes encapsulate a nested child workflow rather than performing a single operation. Each composite node's body is implemented as a **separate child workflow** that is only visible in the UI in the context of its parent workflow:
+
+| Type | Behavior |
+|------|----------|
+| **ForEach** | Iterates over a collection variable, executing the body child workflow once per element. Supports sequential (default) and parallel execution modes, configurable per node. |
+| **While** | Evaluates a condition expression before each iteration; continues while the condition is true. |
+| **Do** | Evaluates a condition expression after each iteration; always executes at least once. |
+| **Switch** | Evaluates an expression against an ordered list of case conditions; routes execution to exactly one matching case branch. Each case contains its own child workflow. See §5 Switch Composite Node. |
+
+Composite node execution semantics are defined in §5 Composite Node Execution Model.
+
+### Enhanced HTTP Request Action
+
+The HTTP Request action supports:
+
+- Configurable HTTP methods (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS).
+- Custom headers.
+- Authentication methods: Basic, Bearer token, API key.
+- Request body configuration.
+- Response body capture to workflow variables.
+- Status code routing — configurable status code ranges map to step outcomes (success, failure, or specific error categories). Responses outside configured success ranges fail the step, subject to the step's error handling strategy.
+- **Retry-After handling** — `Retry-After` headers in target responses are respected by default. A configurable maximum wait per retry controls how long the platform honors a retry-after delay. This per-retry maximum is also bounded by the task's maximum run duration. A configurable maximum retry count limits the total number of retry attempts for the HTTP request.
+- HTTP-level retry (Retry-After handling and maximum retry count) operates within a single step execution and is independent of step-level retry policies, which govern re-execution of the entire step.
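A minimal sketch of how the bounded `Retry-After` wait could be computed (the function and parameter names are illustrative, not the platform's API): the honored delay is the smaller of the server's `Retry-After` value and the per-retry maximum, and the retry is abandoned when even that capped wait would exceed the task's remaining run budget.

```python
def bounded_retry_delay(retry_after_seconds, per_retry_max, remaining_run_budget):
    """Clamp a server-supplied Retry-After delay (all values in seconds).

    Illustrative sketch: the honored wait is capped by the per-retry
    maximum; if the capped wait would still exceed the time remaining in
    the task's maximum run duration, no retry is attempted (None).
    """
    delay = min(retry_after_seconds, per_retry_max)
    if delay > remaining_run_budget:
        return None  # waiting would exceed the task's max run duration
    return delay
```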
+
+### OS-Specific Service Management
+
+- **Windows Service management** — start, stop, restart, and query status of Windows Services.
+- **Linux systemd unit management** — start, stop, restart, and query status of systemd units.
+- **macOS launchd unit management** — start, stop, restart, and query status of launchd daemons and agents.
+
+### PowerShell Runtime
+
+Embedded PowerShell host with full output stream capture:
+
+- Standard output (stdout) and standard error (stderr).
+- Verbose, Warning, Debug, and Information streams.
+- Script-level parameter passing via task configuration.
+- Exit code capture and evaluation.
+- Cross-platform PowerShell Core support.
+
+### Shell Execution
+
+- Native OS shell invocation with a configurable shell per agent. Defaults: cmd.exe on Windows, /bin/sh on Linux & macOS.
+- Exit code capture.
+- Environment variable injection.
+- Working directory configuration. Working directories are validated against the agent's path allowlist.
+- **Variable escaping** — workflow variables are escaped or encoded appropriately for the receiving execution context before interpolation. For shell commands, variables are escaped according to the target shell's quoting rules. For action handler parameters that accept user-defined strings, values are encoded appropriately for the target context (e.g., file paths, HTTP headers). Where the execution model supports it (PowerShell parameters, process arguments), variables are passed as discrete arguments rather than interpolated into command strings.
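For the POSIX-shell case, the quoting rule can be sketched with Python's standard `shlex.quote`; `interpolate_posix` is a hypothetical helper, and cmd.exe and PowerShell each require their own quoting rules as noted above.

```python
import shlex

def interpolate_posix(command_template, variables):
    """Substitute workflow variables into a POSIX shell command template,
    quoting each value so it cannot inject additional shell syntax.

    Illustrative sketch for the /bin/sh case only; where the execution
    model allows, values should be passed as discrete arguments instead.
    """
    quoted = {name: shlex.quote(str(value)) for name, value in variables.items()}
    return command_template.format(**quoted)
```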
+
+### Sensitive Data Redaction
+
+- Configurable regex patterns for automatic masking of sensitive data (passwords, tokens, connection strings, API keys) in execution log output.
+- Redaction applies to stored output, output previews, and real-time log streaming.
+- Default redaction patterns ship with the platform. Administrators can add custom patterns.
+- Custom regex patterns are validated at save time; patterns that fail compilation, or whose compilation exceeds the complexity threshold (a cap on pattern compilation time, default: 1 second), are rejected.
+- Redacted values are replaced with a consistent marker (e.g., `[REDACTED]`).
+- **Redaction order** — variable-level redaction flags (see §5 Workflow Variables) are applied first. Regex-based patterns are applied afterward to catch any remaining sensitive values not covered by explicit flags.
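The two-phase redaction order can be sketched as follows. The marker matches the documented `[REDACTED]` convention; the sample lookbehind pattern is illustrative, not one of the shipped defaults.

```python
import re

REDACTION_MARKER = "[REDACTED]"

def redact(text, flagged_values, patterns):
    """Mask sensitive data in log output in the documented order:
    explicitly flagged variable values first, regex patterns afterward."""
    for value in flagged_values:   # variable-level redaction flags first
        text = text.replace(value, REDACTION_MARKER)
    for pattern in patterns:       # regex-based patterns catch the rest
        text = re.sub(pattern, REDACTION_MARKER, text)
    return text
```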
+
+### Step-Level Error Handling
+
+Each workflow step supports a configurable error handling strategy that determines what happens after a failure. Retry is a separate, always-available configuration on any step. For all strategies except "Remediate Before Retry," the strategy determines the outcome after retries are exhausted (or immediately if no retry policy is configured). The "Remediate Before Retry" strategy executes the error handler before retry attempts begin.
+
+| Strategy | Behavior |
+|----------|----------|
+| **Fail Workflow** | Step failure fails the entire workflow (default). |
+| **Skip** | Mark step as skipped; continue to the next step. |
+| **Continue** | Mark step as failed; continue workflow execution to non-dependent downstream steps. |
+| **Run Error Handler** | Exhaust retry attempts (if configured), then execute a designated error handler step. If the handler succeeds, the step is marked as recovered. If the handler fails, the workflow fails. |
+| **Remediate Before Retry** | Execute a designated error handler step immediately on failure, before any retry attempts. If the handler succeeds and no retry policy is configured, the step is marked as recovered. If retries are configured, the handler runs before each retry attempt. See "Remediate Before Retry Behavior" below. |
+
+**Retry and Error Handler Interaction**
+
+Retry policies and error handling strategies interact according to the following rules:
+
+| Retries Exhausted? | Strategy | Error Handler Result | Outcome |
+|---|---|---|---|
+| Yes (or no retry configured) | Fail Workflow | N/A | Workflow fails |
+| Yes (or no retry configured) | Skip | N/A | Step skipped, workflow continues |
+| Yes (or no retry configured) | Continue | N/A | Step marked failed, continue to non-dependent steps |
+| Yes (or no retry configured) | Run Error Handler | Pass | Step marked recovered, workflow continues |
+| Yes (or no retry configured) | Run Error Handler | Fail | Workflow fails |
+| N/A (before retry) | Remediate Before Retry | Pass (no retry configured) | Step marked recovered, workflow continues |
+| N/A (before retry) | Remediate Before Retry | Pass (retry configured) | Step retried per retry policy |
+| N/A (before retry) | Remediate Before Retry | Fail | Workflow fails, no retries attempted |
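The interaction table above can be encoded as a small decision function; the identifiers are illustrative, not the platform's actual API.

```python
def step_outcome(strategy, handler_passed=None, retry_configured=False):
    """Resolve a failed step's outcome per the retry/error-handler rules.

    For strategies other than 'remediate_before_retry', this is evaluated
    after retries are exhausted (or immediately if none are configured).
    """
    if strategy == "fail_workflow":
        return "workflow_failed"
    if strategy == "skip":
        return "step_skipped"
    if strategy == "continue":
        return "step_failed_continue"   # continue to non-dependent steps
    if strategy == "run_error_handler":
        return "step_recovered" if handler_passed else "workflow_failed"
    if strategy == "remediate_before_retry":
        if not handler_passed:
            return "workflow_failed"    # no retries attempted
        return "retry_step" if retry_configured else "step_recovered"
    raise ValueError(f"unknown strategy: {strategy}")
```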
+
+**Remediate Before Retry Behavior**
+
+When a step uses the "Remediate Before Retry" strategy, the error handler acts as a remediation step that runs before each retry attempt. The error handler executes on every failure — both the initial failure and each subsequent retry failure. This enables remediation patterns such as clearing lock files, resetting connections, or restoring prerequisites before each re-execution.
+
+If the error handler fails at any point, the workflow fails immediately and no further retries are attempted.
+
+**Step State During Error Handling**
+
+During error handler execution and retry cycles, the step's state remains `Running`. The `Failed` terminal state is assigned only after all error handling and retry logic has been exhausted. This prevents dependency modes (e.g., Any Failed) from triggering prematurely while error recovery is still in progress.
+
+### Error Handler Steps
+
+Error handler steps are regular tasks designated by reference from the owning step's configuration:
+
+- **Visibility** — error handler steps are associated with their owning step and revealed in the DAG when the owning step is selected. They are visually distinguished from normal execution path steps.
+- **No dependencies** — error handler steps have no DAG dependencies other than the failure of their owning step. They may consume pre-configured environment values and workflow variables.
+- **Output** — error handler steps produce a simple pass/fail result. The pass/fail output determines whether the owning step is marked as recovered (pass) or the workflow fails (fail).
+- **Failure behavior** — if the error handler step itself fails, the workflow fails.
+- **Variable access** — error handler steps can read workflow variables (including outputs of previously completed steps) but do not produce outputs consumed by downstream steps.
+- **No nesting** — error handler steps cannot have their own error handlers. Error handling configuration is defined at the step level, not on the error handler task itself.
+
+### Retry Policies
+
+| Property | Description |
+|----------|-------------|
+| Retry Count | Maximum number of retry attempts (default: 0). |
+| Backoff Strategy | Fixed, Linear, or Exponential backoff. |
+| Initial Delay | Time before first retry. |
+| Maximum Delay | Cap on backoff delay. |
+| Retry Conditions | Optional conditions for selective retry evaluated against the step's exit code, error output summary, and current attempt number. |
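A sketch of delay computation under the three backoff strategies. The exact growth formulas (linear multiple of the initial delay, powers of two) are assumptions, since the specification names the strategies but not their formulas.

```python
def backoff_delay(strategy, attempt, initial_delay, maximum_delay):
    """Compute the delay (seconds) before retry attempt `attempt` (1-based).

    Illustrative sketch: Fixed repeats the initial delay, Linear grows as
    a multiple of it, Exponential doubles each attempt; Maximum Delay
    caps all of them.
    """
    if strategy == "fixed":
        delay = initial_delay
    elif strategy == "linear":
        delay = initial_delay * attempt
    elif strategy == "exponential":
        delay = initial_delay * (2 ** (attempt - 1))
    else:
        raise ValueError(f"unknown backoff strategy: {strategy}")
    return min(delay, maximum_delay)  # Maximum Delay caps the backoff
```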
+
+---
+
+## 4. Scheduling & Triggers
+
+Werkr uses a unified trigger registry. All trigger types share a common definition, configuration, and management interface. Trigger *evaluation* occurs at different system layers depending on type: schedule-based and file-based triggers are evaluated on the agent; API, manual, and workflow-completion triggers are evaluated at the API.
+
+### Trigger Types
+
+| Trigger Type | Description |
+|-------------|-------------|
+| **DateTime** | Execute at a specific date and time. |
+| **Interval / Cyclical** | Daily, weekly, and monthly recurrence with configurable intervals and repeat windows. |
+| **Cron Expression** | Standard cron expression syntax for schedule definition. Both cron expressions and interval triggers can be used to define recurring schedules; they are independent trigger types and conversion between them is not supported. |
+| **File Monitor** **(persistent trigger)** | A persistent trigger that watches a directory and initiates a new workflow run when files matching a pattern are created or modified. Active regardless of whether any workflow is currently running. |
+| **API** | Trigger execution via an authenticated REST API call with payload parameters injected as workflow variables. |
+| **Workflow Completion** | Trigger execution when a specified workflow reaches a terminal state. Configurable to fire on success, failure, or any completion. The triggering workflow's run metadata is available as trigger context. |
+| **Manual** | Execute on demand from the user interface or API. |
+
+Trigger types are registered independently. The registry design supports adding new types without modifying existing implementations.
+
+### Trigger Context
+
+When a trigger fires, context data from the trigger source is injected into the workflow run as input variables:
+
+- **File monitor triggers** receive the file path and event type.
+- **API triggers** receive the supplied parameters.
+- **Workflow completion triggers** receive the source workflow ID, run ID, terminal status, and any published output variables from the completed run.
+- **Cron/schedule triggers** receive the scheduled time and calendar metadata.
+- **Interval/Cyclical triggers** receive the interval period, occurrence metadata, and related scheduling information.
+- **DateTime triggers** receive the scheduled execution time.
+- **Manual triggers** receive the invoking user's identity and any user-supplied or pre-defined parameters.
+
+Referencing a trigger context variable not provided by the current trigger type resolves to null.
+
+### Schedule Configuration
+
+- **Time zone awareness** — all schedules are time zone-aware with configurable start and expiration dates. Daylight Saving Time transitions are handled correctly.
+- **Calendars** — named calendar configurations that define working days and holidays. Calendars and Holiday Rules are top-level, independently managed entities with CRUD operations. Schedules and triggers reference calendars by ID. Each calendar specifies a working-day pattern (which days of the week are working days; default: Monday through Friday) and references zero or more holiday rules. A **business day** is any day that matches the working-day pattern and is not a holiday. Multiple calendars may be defined for different organizational units or regions.
+- **Holiday rules** — fixed dates and recurring date patterns within a calendar.
+- **Schedule suppression** — schedule occurrences falling on a non-business day (as defined by the referenced calendar) are suppressed or shifted to an adjacent business day. The shift direction is configurable per schedule: next business day, previous business day, or nearest business day. For nearest business day, if the holiday is equidistant from two business days, the shift direction defined on the applicable holiday rule is used as the tiebreaker. Suppressed occurrences are audit-logged.
+- **Calendar distribution** — calendar and holiday data is synchronized to agents alongside schedule definitions via the schedule synchronization gRPC service.
+- **Multi-agent trigger evaluation** — tags are designed for multi-agent targeting. When a workflow's target tags match multiple agents, each matched agent evaluates schedule-based and file-based triggers independently. There is no cross-agent deduplication; multiple agent executions are the expected behavior of multi-agent targeting. For single-agent targeting, each agent is assigned a system-generated unique agent tag (e.g., `agent:{agent-id}`) at registration time. System-generated agent tags are non-editable and non-deletable by users. To target a specific agent, reference its unique agent tag.
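The business-day shift could be sketched as below. This is a simplification: it breaks nearest-day ties toward the earlier day, whereas the platform uses the applicable holiday rule's direction as the tiebreaker.

```python
from datetime import date, timedelta

def shift_to_business_day(day, working_weekdays, holidays, direction="next"):
    """Shift a schedule occurrence off a non-business day.

    Illustrative sketch of the three documented shift directions.
    `working_weekdays` is a set of weekday numbers (Monday=0);
    `holidays` is a set of dates.
    """
    def is_business_day(d):
        return d.weekday() in working_weekdays and d not in holidays

    if is_business_day(day):
        return day
    if direction in ("next", "previous"):
        step = timedelta(days=1 if direction == "next" else -1)
        while not is_business_day(day):
            day += step
        return day
    if direction == "nearest":
        for offset in range(1, 366):  # widen the search in both directions
            for candidate in (day - timedelta(days=offset),
                              day + timedelta(days=offset)):
                if is_business_day(candidate):
                    return candidate
    raise ValueError(f"unknown shift direction: {direction}")
```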
+
+### Schedule and Trigger Versioning
+
+- Immutable schedule and trigger versions created on each save.
+- Version history browsable in the UI.
+- Workflow versions reference the schedule/trigger version in effect at the time of workflow version creation.
+- **Trigger-workflow version binding** — triggers have a version binding mode that determines which workflow version executes when the trigger fires:
+
+  - **Latest** (default) — always executes the latest workflow version. From the point of attachment onward, the trigger automatically tracks the newest (default/non-draft) workflow version as new versions are created.
+ - **Pinned** — executes a specific associated workflow version. A version mismatch warning is displayed in the UI when the associated version is not the latest. When a workflow version update would orphan an existing pinned trigger (i.e., the trigger still references a previous workflow version), the user is prompted to re-associate the trigger with the new workflow version.
+
+ On workflow save, the UI prompts to update pinned trigger bindings if triggers reference an older version.
+
+### File Monitoring Security
+
+- Monitored paths must fall within the agent's configured path allowlist.
+- Canonical path resolution prevents symbolic link and directory traversal attacks.
+- Configurable debounce window (default: 500 ms) prevents trigger flooding from rapid file system events.
+- Circuit breaker for excessive trigger rates.
+- Configurable maximum watch count per agent (default: 50) prevents resource exhaustion.
+- Trigger configuration requires elevated permissions and is audit-logged.
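The canonical-path check can be sketched with `os.path.realpath`, which collapses symlinks and `..` segments before the allowlist comparison; `is_path_allowed` is a hypothetical helper, not the agent's implementation.

```python
import os.path

def is_path_allowed(candidate, allowlist):
    """Check a watch path against the agent's path allowlist.

    Illustrative sketch: canonical resolution happens first, so traversal
    tricks like '/data/watch/../../etc' cannot escape the allowlist, and
    the comparison is component-wise (no '/data/watchdog' false match).
    """
    real = os.path.realpath(candidate)
    return any(
        os.path.commonpath([real, os.path.realpath(root)]) == os.path.realpath(root)
        for root in allowlist
    )
```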
+
+### API Trigger Security
+
+- Authentication via API key or bearer token. The workflow ID is specified in the request body or as a URL parameter.
+- Configurable rate limiting per workflow. API trigger rate limits apply independently of API key rate limits; both limits are evaluated and the most restrictive applies.
+- Rate-limited callers receive an HTTP 429 response with a `Retry-After` header indicating when the next request will be accepted.
+- Request validation (optional JSON schema).
+- Payload injection as workflow input variables.
+- **Cycle detection** — the trigger registry detects circular workflow-completion chains at configuration time and surfaces a **prominent warning** in the workflow list and workflow editor UI. Circular chains are not blocked — users may intentionally create cyclical workflows. Workflow-completion trigger chains have a configurable maximum chain depth (default: 5). Each trigger-initiated run carries a chain depth counter. When max depth is reached, the trigger is suppressed with an audit log entry. Manual triggers reset the counter to 0.
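The chain-depth guard could be sketched as follows; the tuple-returning helper is an assumption about how the counter propagates, not the platform's API.

```python
def should_fire_completion_trigger(parent_chain_depth, max_chain_depth=5):
    """Decide whether a workflow-completion trigger may start another run.

    Illustrative sketch: each trigger-initiated run carries its parent's
    chain depth plus one; past max depth the trigger is suppressed (the
    platform also audit-logs the suppression). Manual runs start at 0.
    Returns (fire?, depth carried by the new run).
    """
    next_depth = parent_chain_depth + 1
    if next_depth > max_chain_depth:
        return (False, parent_chain_depth)  # suppressed at max depth
    return (True, next_depth)
```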
+
+### API Trigger Response Contract
+
+When an API trigger fires successfully, the response includes the newly created run ID and the current run status. The response body conforms to the standard API response envelope (see §13). Callers can use the run ID to query run status via the `GET /api/v1/workflows/{workflowId}/runs/{runId}` endpoint. A dedicated run status page is available in the UI at a stable URL derived from the run ID.
+
+---
+
+## 5. Workflow Engine
+
+The workflow engine orchestrates multi-step automation as directed acyclic graphs (DAGs).
+
+### DAG Model
+
+- Workflows are directed acyclic graphs with topological ordering.
+- Steps declare dependencies on other steps.
+- Cycle detection at save time and runtime.
+- Maximum workflow step count enforcement.
+- **Per-workflow concurrent run limit** — configurable maximum concurrent runs per workflow (default: unlimited). When the limit is reached, new trigger events are queued until a running instance completes. Queued trigger events are processed in **FIFO order** and persisted to the database. Queue depth is configurable per workflow (default: 100). When the queue depth is exceeded, overflow trigger events are persisted to a dead-letter queue (DLQ) for administrative review. Overflow events are not automatically processed. Administrators can inspect, replay, or discard DLQ entries via the UI and REST API. DLQ entries are audit-logged. For API-triggered runs, the API response indicates that the event was enqueued to the DLQ rather than the primary queue. Queued triggers are visible in the UI with a wait reason.
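Save-time cycle detection and topological ordering can be sketched with the standard-library `graphlib`; the engine's own implementation may of course differ.

```python
from graphlib import TopologicalSorter, CycleError

def validate_dag(steps):
    """Validate a workflow's dependency graph at save time.

    `steps` maps step name -> set of upstream step names. Returns a valid
    topological execution order, or raises CycleError if a cycle exists.
    """
    return list(TopologicalSorter(steps).static_order())
```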
+
+### Step and Run State Model
+
+**Step States**
+
+| State | Description |
+|-------|-------------|
+| **Pending** | Step is waiting for upstream dependencies to complete. |
+| **Queued** | Dependencies satisfied; step is queued for agent dispatch. |
+| **Waiting for Approval** | Step requires manual approval before execution proceeds. |
+| **Running** | Step is actively executing on an agent. Also the active state during error handler execution and retry cycles (see §3 Step-Level Error Handling). |
+| **Succeeded** | Step completed successfully. |
+| **Failed** | Step execution failed after exhausting all error handling and retry logic. |
+| **Skipped** | Step was skipped due to dependency mode or error handling strategy. |
+| **Cancelled** | Step was cancelled by user action or workflow-level timeout. |
+| **Recovered** | Step failed but was recovered by its error handler. |
+| **Upstream Failed** | Step was not executed because an upstream step or parallel sibling failed with the Fail Workflow strategy. Steps in this state did not begin execution. |
+
+**Step State Transitions**
+
+The following transitions are valid:
+
+| From | To | Condition |
+|------|----|-----------|
+| Pending | Queued | All upstream dependencies satisfied. |
+| Pending | Skipped | Upstream dependency mode not met (e.g., All Success with a failed upstream). |
+| Pending | Upstream Failed | Upstream step or parallel sibling failed with Fail Workflow strategy. |
+| Pending | Cancelled | User cancellation or workflow-level timeout. |
+| Queued | Running | Step dispatched to an agent for execution. |
+| Queued | Waiting for Approval | Step has an approval gate configured. |
+| Queued | Upstream Failed | Upstream step or parallel sibling failed with Fail Workflow strategy. |
+| Queued | Cancelled | User cancellation or workflow-level timeout. |
+| Waiting for Approval | Running | Approval granted; step proceeds to execution. |
+| Waiting for Approval | Failed | Approval rejected. Rejection does not trigger the step's error handler. |
+| Waiting for Approval | Upstream Failed | Parallel sibling failed with Fail Workflow strategy while step awaits approval. |
+| Waiting for Approval | Cancelled | User cancellation or workflow-level timeout. |
+| Running | Succeeded | Execution completed successfully. |
+| Running | Failed | Execution failed after exhausting all error handling and retry logic. |
+| Running | Recovered | Execution failed but error handler succeeded (after retries exhausted or no retry configured). |
+| Running | Cancelled | User cancellation or workflow-level timeout. |
+
+During error handler execution or retry cycles, the step remains in the `Running` state. The `Failed` terminal state is assigned only after all error handling and retry logic has been exhausted. Any non-terminal state may transition directly to `Cancelled`.
+
+Terminal states: Succeeded, Failed, Skipped, Cancelled, Recovered, Upstream Failed. Terminal states have no outgoing transitions.
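The transition table can be encoded as a lookup; this is an illustrative encoding of the table above, not the engine's implementation. Terminal states simply have no entry, so every transition out of them is rejected.

```python
# Valid step state transitions, per the transition table.
STEP_TRANSITIONS = {
    "Pending": {"Queued", "Skipped", "Upstream Failed", "Cancelled"},
    "Queued": {"Running", "Waiting for Approval", "Upstream Failed", "Cancelled"},
    "Waiting for Approval": {"Running", "Failed", "Upstream Failed", "Cancelled"},
    "Running": {"Succeeded", "Failed", "Recovered", "Cancelled"},
}

def can_transition(from_state, to_state):
    """Return True if a step may move from `from_state` to `to_state`.
    Terminal states have no outgoing transitions."""
    return to_state in STEP_TRANSITIONS.get(from_state, set())
```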
+
+**Run States**
+
+| State | Description |
+|-------|-------------|
+| **Pending** | Run is created but has not started executing. |
+| **Queued** | Run is waiting due to per-workflow concurrent run limit. |
+| **Running** | Run is actively executing steps. |
+| **Paused** | Run execution is paused by user action. In-progress steps complete their current execution but no further steps are dispatched. Approval gate timeouts are suspended while the run is paused. |
+| **Succeeded** | All steps reached a terminal state and no step triggered the Fail Workflow strategy or an unrecovered error handler failure. |
+| **Failed** | One or more steps failed with the Fail Workflow strategy or an unrecovered error handler failure, or the workflow-level timeout was exceeded. |
+| **Cancelled** | Run was cancelled by user action. |
+
+**Run State Transitions**
+
+The following transitions are valid:
+
+| From | To | Condition |
+|------|----|-----------|
+| Pending | Queued | Per-workflow concurrent run limit reached; run enters queue. |
+| Pending | Running | Run begins executing steps (no queue wait). |
+| Pending | Cancelled | User cancellation before execution starts. |
+| Queued | Running | Concurrent run slot becomes available. |
+| Queued | Cancelled | User cancellation while queued. |
+| Running | Succeeded | All steps reached a terminal state and no step triggered the Fail Workflow strategy or an unrecovered error handler failure. |
+| Running | Failed | One or more steps failed with the Fail Workflow strategy or an unrecovered error handler failure, or workflow-level timeout exceeded. |
+| Running | Paused | User-initiated pause. |
+| Running | Cancelled | User cancellation. |
+| Paused | Running | User-initiated resume. |
+| Paused | Cancelled | User cancellation while paused. |
+
+Terminal states: Succeeded, Failed, Cancelled. Terminal states have no outgoing transitions.
+
+Terminal states are final: once a step or run reaches one (Succeeded, Failed, Skipped, Cancelled, Recovered, or Upstream Failed for steps; Succeeded, Failed, or Cancelled for runs), its state does not change. A failed run may, however, be re-executed from the point of failure (see §5 Re-Execution).
+
+### Dependency Modes
+
+| Dependency Mode | Behavior |
+|----------------|----------|
+| **All Success** | Proceed only if all upstream steps succeeded. A Recovered step satisfies All Success. Skipped and Upstream Failed steps do not satisfy this mode. |
+| **Any Success** | Proceed if at least one upstream step succeeded. Recovered steps count as success. Skipped and Upstream Failed steps do not count as success. |
+| **All Complete** | Proceed when all upstream steps ran to completion (Succeeded, Failed, or Recovered), regardless of success or failure. Skipped, Upstream Failed, and Cancelled steps do not satisfy this mode — they did not run to completion. |
+| **Any Complete** | Proceed when any upstream step ran to completion (Succeeded, Failed, or Recovered), regardless of success or failure. Skipped, Upstream Failed, and Cancelled steps do not satisfy this mode — they did not run to completion. |
+| **Any Failed** | Proceeds when any upstream step reaches a **terminal** Failed state (after error handling and retries are exhausted). Steps whose failure is recovered do not trigger this mode. Skipped and Upstream Failed do not trigger this mode. Remaining in-progress upstream steps continue to completion independently. |
+
+The default dependency mode is **All Success**.
+
+A step marked Failed via the Continue error handling strategy triggers the Any Failed dependency mode on downstream steps.
+
+A Recovered step is semantically equivalent to a successful step for dependency evaluation purposes: Recovered satisfies the All Success and Any Success modes. Note that if a retry policy is configured, a successful retry re-evaluates the step through the normal Running → Succeeded flow and transitions it to Succeeded; only a step recovered *without* a subsequent successful retry remains in the Recovered terminal state.
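The dependency-mode table can be sketched as a predicate over upstream terminal states (illustrative; state names follow the step state model in §5).

```python
SUCCESS_LIKE = {"Succeeded", "Recovered"}        # Recovered counts as success
RAN_TO_COMPLETION = {"Succeeded", "Failed", "Recovered"}

def dependency_satisfied(mode, upstream_states):
    """Evaluate a step's dependency mode against upstream terminal states.

    Skipped, Upstream Failed, and Cancelled never satisfy the
    completion-based modes, and recovered failures never trigger Any Failed.
    """
    if mode == "All Success":
        return all(s in SUCCESS_LIKE for s in upstream_states)
    if mode == "Any Success":
        return any(s in SUCCESS_LIKE for s in upstream_states)
    if mode == "All Complete":
        return all(s in RAN_TO_COMPLETION for s in upstream_states)
    if mode == "Any Complete":
        return any(s in RAN_TO_COMPLETION for s in upstream_states)
    if mode == "Any Failed":
        return any(s == "Failed" for s in upstream_states)
    raise ValueError(f"unknown dependency mode: {mode}")
```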
+
+### Branching & Conditionals
+
+- Control statements: Switch, While, Do, ForEach — all implemented as Composite Node types (see §5 Composite Node Execution Model).
+- Condition evaluator with expression language.
+- Visual condition builder with structured and raw/advanced expression modes.
+- Boolean operators: AND, OR, NOT.
+- Comparison operators: equals, not-equals, contains, matches, greater-than, less-than, greater-than-or-equal, less-than-or-equal.
+- Variable references in conditions.
+
+### Conditional Branching
+
+All conditional branching is handled by the Switch composite node. There are no conditional edges in the outer DAG — branching logic is encapsulated within Switch nodes, keeping the outer DAG unconditional and straightforward.
+
+### Switch Composite Node
+
+A Switch node is a composite node that evaluates an expression and routes execution to exactly one named case branch. Switch handles all conditional branching scenarios, from simple binary decisions (if/else) to multi-way routing (else-if chains):
+
+- **Case evaluation** — the Switch node evaluates a single expression against an ordered list of case conditions. The first case whose condition matches activates that case's branch. If no case matches and a default branch is defined, the default branch activates. If no case matches and no default branch is defined, the Switch node completes as a no-op — no child workflow executes, and the composite node succeeds with no output. A default branch is optional.
+- **Case branches** — each case contains a child workflow that executes independently. Only the activated case's child workflow runs; non-activated cases do not instantiate run records.
+- **Simple if/else** — a Switch with one named case (the "if" condition) and a default case (the "else" branch) is functionally equivalent to if/else. A Switch with a single case and no meaningful default body is functionally equivalent to a standalone if.
+- **Single-node encapsulation** — like other composite nodes, the outer DAG sees the Switch as a single node. The activated case's child workflow executes within the composite node boundary.
+- **Execution model** — the Switch node step remains in the `Running` state while the activated case's child workflow executes. When the child workflow completes, the Switch node transitions to a terminal state based on the child's outcome.
+- **Output variable mapping** — the Switch node declares output variables that are promoted from the activated case's child workflow to the parent workflow scope. All cases must declare the same output variable schema so downstream steps can consume outputs regardless of which case executed.
+- **Error handling** — if the activated case's child workflow fails, the Switch node fails. The Switch node's own error handling strategy (configured on the composite node as a step in the outer DAG) then determines the workflow-level outcome.
+- **Visualization** — the Switch node renders as a single expandable node in the DAG editor. Expanding it reveals the list of cases with their conditions. Selecting a case navigates to that case's child workflow DAG view.
+- **Nesting** — Switch nodes follow the same nesting rules as other composite nodes.
+
+### Expression Language
+
+Condition expressions used in branching, While/Do loops, and retry conditions use a typed expression language:
+
+- **Types** — string, number, boolean, null. No implicit type coercion; operands of mismatched types produce an evaluation error.
+- **Null handling** — null equals null. Any comparison between null and a non-null value evaluates to false.
+- **Operator precedence** — NOT > comparison > AND > OR. Parentheses override default precedence.
+- **`matches` operator** — performs a full regex match against the operand (anchored; the entire value must match the pattern).
+- **String comparison** — case-insensitive by default. A per-expression flag enables case-sensitive comparison.
+- **Error behavior** — a malformed or invalid expression produces an evaluation error that fails the step, subject to the step's configured error handling strategy.
+- **Complexity limit** — configurable maximum expression depth (default: 10) prevents excessively nested expressions.
+- **Formal grammar** — expressions are composed of literal values (strings in double quotes, numbers, booleans `true`/`false`, `null`), variable references (`{{namespace.name}}`), comparison expressions, logical expressions (`AND`/`OR`/`NOT`), and parenthesized groups, with the operator precedence given above.
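
Three of the rules above (anchored `matches`, case-insensitive string comparison, and null handling) are easy to get wrong; this sketch pins down the intended semantics. The helper names are invented for illustration, not Werkr's API:

```python
import re

def matches(value: str, pattern: str) -> bool:
    # `matches` is anchored: the entire value must match the pattern.
    return re.fullmatch(pattern, value) is not None

def equals(a, b, case_sensitive: bool = False) -> bool:
    # null equals null; any null vs non-null comparison is false.
    if a is None or b is None:
        return a is None and b is None
    if type(a) is not type(b):
        raise TypeError("no implicit coercion: operand types must match")
    if isinstance(a, str) and not case_sensitive:
        return a.lower() == b.lower()   # case-insensitive by default
    return a == b

print(matches("build-42", r"build-\d+"))    # True: whole string matches
print(matches("build-42x", r"build-\d+"))   # False: anchored match fails
print(equals("OK", "ok"))                   # True: case-insensitive default
print(equals(None, "x"))                    # False: null vs non-null
```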
+
+### Composite Node Execution Model
+
+While, Do, ForEach, and Switch control-flow constructs are implemented as **composite nodes** — each is a single node in the outer DAG that encapsulates a nested child workflow:
+
+- **Single-node encapsulation** — the outer workflow DAG sees one node. Cycle detection operates on the outer graph and is not violated by internal repetition or branching within the composite node.
+- **Child workflow implementation** — the composite node's body is a separate child workflow, visible in the UI only in the context of its parent workflow. The parent workflow step references the child workflow by ID. Execution is initiated via an internal workflow trigger; an event fires when the child workflow completes, allowing the parent workflow to continue. The composite node step remains in the `Running` state while the child workflow executes.
+- **Iteration execution** — each iteration produces a separate execution record for traceability and output inspection.
+- **MaxIterations guard** — a configurable maximum iteration count (default: 100) prevents runaway loops. Exceeding the limit fails the composite node.
+- **Input variable mapping** — composite nodes declare input variables mapped from the parent workflow's variable scope. The child workflow receives these as its initial workflow variables. Loop iteration variables (current item in ForEach, iteration index) are injected as input variables per iteration. Switch nodes map parent variables into the activated case's child workflow.
+- **Output variable mapping** — composite nodes declare output variables that are promoted back to the parent workflow scope on successful completion of the child workflow. For ForEach nodes, output variables from all iterations are accumulated into a collection — each iteration contributes its output to the collection rather than overwriting previous iterations. The output collection is ordered by iteration index (original collection order), regardless of completion order. For Switch nodes, all cases must declare the same output variable schema so downstream steps can consume outputs regardless of which case executed.
+- **Variable scoping** — child workflow steps have read access to mapped parent workflow variables via the input variable mapping. Loop iteration variables are scoped to the iteration and do not leak to the outer workflow. Each parent→child composite node relationship has independent variable mappings. Nested composite nodes (grandchildren) cannot access grandparent workflow variables unless those variables are explicitly mapped through the intermediate child workflow.
+- **ForEach** — iterates over a collection variable, executing the body child workflow once per element. Supports **sequential** (default) and **parallel** execution modes, configurable per composite node. Parallel mode has a configurable maximum parallelism per node (default: 5) that bounds the number of concurrent iterations.
+- **While / Do** — evaluate a condition expression per iteration. While evaluates before each iteration; Do evaluates after.
+- **Switch** — evaluates an expression against case conditions and executes the matching case's child workflow. See §5 Switch Composite Node for full semantics.
+- **Downstream dependencies** — steps that depend on a composite node wait for the composite node to complete before proceeding (all iterations for loop nodes, or the activated case for Switch nodes).
+- **Visualization** — composite nodes render as a single expandable node in the DAG editor and read-only views. The UI supports navigation between the outer DAG and the inner child workflow's DAG view.
+- **Error atomicity** — for loop nodes (ForEach, While, Do): if any iteration fails (after exhausting the iteration's error handling), the composite node fails. In parallel mode, currently in-flight iteration tasks complete their execution; iterations not yet started are cancelled. In sequential mode, remaining iterations are cancelled. For Switch nodes: if the activated case's child workflow fails, the Switch node fails. The composite node's own error handling strategy is then evaluated.
+- **Partial output on failure** — when a ForEach node fails during parallel execution, output variables accumulated from completed iterations are discarded. The composite node's error handler (if configured) does not receive partial iteration outputs. Only fully completed composite node executions produce output variable collections.
+- **Parallel variable isolation** — when ForEach executes iterations in parallel, each iteration receives an atomic copy of workflow variables at iteration start. Cross-iteration variable mutation is not supported; iterations are isolated.
+- **Nesting** — composite nodes may be nested (a composite node's body may contain other composite nodes). There is no hard depth limit. The UI displays a warning at nesting depths greater than 2 to discourage excessive complexity.
+- **Timeout inheritance** — composite nodes share their parent workflow's timeout. The workflow-level timeout clock runs continuously across all composite node iterations and child workflow executions. For example, if a composite node begins execution 15 minutes into a 24-hour workflow timeout, the composite node and all its iterations or cases have the remaining 23 hours and 45 minutes to complete. Composite nodes do not have independent timeout configurations. Approval gates within composite node child workflows are subject to the parent workflow's timeout.
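
The timeout-inheritance rule is plain clock arithmetic: the composite node gets whatever remains of the parent's budget. A hypothetical helper (not part of Werkr) makes the worked example above concrete:

```python
from datetime import datetime, timedelta

def remaining_budget(run_started: datetime, workflow_timeout: timedelta,
                     now: datetime) -> timedelta:
    """Time left on the parent workflow's timeout clock at `now`."""
    return workflow_timeout - (now - run_started)

start = datetime(2025, 1, 1, 0, 0)
# A composite node beginning 15 minutes into a 24-hour workflow timeout
# has 23h45m left for all of its iterations or cases.
left = remaining_budget(start, timedelta(hours=24), start + timedelta(minutes=15))
print(left)  # 23:45:00
```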
+
+### Composite Node Error Handling
+
+Each step within a composite node's body has its own error handling strategy and retry policy, evaluated independently per iteration (for loop nodes) or per case execution (for Switch nodes). Only explicit failures fail the composite node: if a body step's error handling ultimately produces a failure (e.g., the Fail Workflow strategy, or an error handler that itself fails), the composite node fails. If a body step uses the Continue error handling strategy (the step is marked failed but the workflow continues to non-dependent steps) and the body eventually completes, the execution is considered successful from the composite node's perspective — the step-level failure does not propagate upward. The composite node's own error handling strategy (configured on the composite node as a step in the outer DAG) then determines the workflow-level outcome.
+
+### Composite Node Cancellation and Pause
+
+- **User-initiated pause** — pause propagates in both directions: pausing the inner (child) workflow's steps pauses the outer (parent) workflow, and pausing the parent workflow pauses the inner workflow.
+- **User-initiated cancellation** — if the outer workflow is cancelled, the inner workflow also receives the cancellation signal. Currently running tasks within the inner workflow complete their execution, but no further steps are dispatched. Remaining steps within the inner workflow enter the Cancelled state. Downstream steps from the composite node in the parent workflow also enter the Cancelled state.
+
+### Composite Node Serialization
+
+Composite nodes serialize as references to their child workflows. For Switch nodes, each case's child workflow is serialized as a separate reference. JSON export (see §6) includes parent and all child workflows as individual JSON objects within the export document. Import resolves child workflow references during referential integrity validation.
+
+### Composite Node Re-Execution
+
+Re-execution of a failed composite node re-executes from the beginning (iteration 1 for ForEach, first evaluation for While/Do, case re-evaluation for Switch). Completed outputs from the prior execution are not preserved. The composite node is treated as a single step for re-execution purposes.
+
+### Composite Node Child Workflow Versioning
+
+Child workflows are version-bound to the parent. When a parent version is created, child workflow state is included in the snapshot (including all case child workflows for Switch nodes). Rolling back restores child definitions from that snapshot. Child workflows are not independently versioned or accessible outside their parent context.
+
+### Parallel Execution
+
+- Independent steps at the same topological level execute concurrently.
+- True parallelism, not sequential simulation.
+- Agent capacity respected for concurrent task dispatch.
+- **Fail Workflow during parallel execution** — when a step fails with the Fail Workflow strategy during parallel execution, steps already in progress (including any currently executing error handlers) are allowed to finish their current execution to prevent unrecoverable state errors. In-flight steps may transition to Succeeded, Recovered, or Failed based on their own execution outcome, but no additional retry attempts are initiated regardless of retry policy configuration. Steps not yet started (including steps waiting for approval in parallel branches) enter the `Upstream Failed` state, and no further downstream steps are dispatched. The workflow enters a failed state once all in-progress steps finish.
+
+### Workflow Variables
+
+- Inter-step data passing via named variables.
+- Variable scopes: workflow, step, trigger, system.
+- Output capture from completed steps.
+- Variable reference syntax: `{{namespace.path}}`, where namespace is one of `step`, `workflow`, `trigger`, or `system`. Examples: `{{step.StepName.output}}`, `{{workflow.input.paramName}}`, `{{trigger.file_path}}`, `{{system.timestamp}}`.
+- **Explicit namespacing** — every variable reference must name its namespace; there is no implicit resolution order.
+- **Reserved namespace words** — `step`, `workflow`, `trigger`, and `system` are reserved and cannot be used as step or variable names. Variable syntax uses explicit namespace prefixes, avoiding collisions between user-defined names and platform namespaces.
+- **Circular reference detection** — the resolver detects circular variable references and produces an evaluation error.
+- **Maximum resolution depth** — configurable (default: 10 nested references).
+- **Escape syntax** — `\{{` outputs a literal `{{` without variable resolution.
+- The variable resolution system uses a provider-based chain. Built-in providers resolve step outputs, workflow inputs, trigger context, and system values. Providers are registered during application startup via dependency injection (DI).
+- Configurable maximum variable value size (default: 1 MB) to prevent unbounded growth.
+- **Log-redaction flag** — workflow variables can be flagged as "redact from logs." Variables with this flag have their resolved values automatically replaced with `[REDACTED]` in all execution output, output previews, and real-time log streaming. This complements regex-based redaction by proactively redacting flagged variables regardless of pattern matching.
+- **Resolution timing** — variables available at workflow start (workflow inputs, trigger context, system variables) are resolved eagerly at run initialization. Step output variables are resolved lazily at the point of consumption — they become available when the producing step completes. Parallel steps that start simultaneously cannot consume each other's outputs.
+- **Variable write isolation** — each step writes to its own step output namespace (`step.{NamedStep}.*`). Steps cannot write to another step's namespace. Workflow input variables are immutable after run initialization. Steps additionally declare which workflow variables they produce (see Workflow Output Parameters below); the engine maps designated step outputs to workflow-level variables on step completion.
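
A toy resolver illustrates the `{{namespace.path}}` syntax, the `\{{` escape, and the depth limit. Provider lookup is reduced to a flat dict here; this is not the provider-chain implementation, and in this sketch the depth limit also happens to catch circular references, whereas the real resolver detects cycles separately:

```python
import re

# Group 1: escaped "\{{" -> literal "{{". Group 2: a namespaced reference.
TOKEN = re.compile(r"(\\\{\{)|\{\{([a-zA-Z_][\w.]*)\}\}")

def resolve(text: str, providers: dict, depth: int = 10) -> str:
    if depth <= 0:
        raise ValueError("maximum variable resolution depth exceeded")
    def sub(m):
        if m.group(1):            # escaped: \{{ -> literal {{
            return "{{"
        value = providers[m.group(2)]   # e.g. "step.Build.output"
        # Resolved values may themselves contain references.
        return resolve(value, providers, depth - 1)
    return TOKEN.sub(sub, text)

providers = {
    "step.Build.output": "artifact-{{system.timestamp}}",
    "system.timestamp": "20250101T000000",
}
print(resolve("out={{step.Build.output}}", providers))
# out=artifact-20250101T000000
print(resolve(r"literal \{{ braces", providers))   # literal {{ braces
```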
+
+### Workflow Input Parameters
+
+Workflows declare input parameters: name, data type (string, number, boolean), required/optional, and optional default value. Manual triggers prompt for inputs. API triggers validate against declared inputs. Trigger context maps to inputs by name. Undeclared trigger context is available in the `trigger` namespace only. Missing required inputs with no default produce a validation error at run initialization.
+
+### Workflow Output Parameters
+
+Workflows declare workflow-level variables alongside input parameters during workflow creation or editing. Each workflow variable specifies a name and data type (value or collection). A variable can be declared as an input parameter, an output parameter, or both — the same named variable can be initialized from trigger context (input) and published on workflow completion (output).
+
+Steps declare which workflow variables they **produce** (write to) and which they **consume** (read from). These declarations create an explicit data flow contract at the workflow level:
+
+- **Producers** — any number of steps may declare that they write to a given workflow variable. When a producing step completes, the engine maps the step's designated output to the workflow variable. If multiple steps produce the same variable, the last writer by execution order determines the value. Step execution ordering via DAG dependencies ensures deterministic resolution.
+- **Consumers** — steps declare which workflow variables they read, referencing them via `{{workflow.output.paramName}}`. A workflow variable must be populated by at least one producing step before a consuming step executes. If the variable is not yet populated at consumption time, the consumer receives null.
+- **Inter-step data passing** — workflow variables serve as the intentional data-passing mechanism between steps. DAG dependencies ensure producers complete before consumers execute.
+- **Inter-workflow data passing** — variables marked as output parameters define the workflow's external output contract. On workflow completion, output parameter values are published as trigger context. Workflow completion triggers (§4) receive these published values, enabling data transfer between chained workflows.
+- **Step outputs remain separate** — step-level outputs (`{{step.StepName.output}}`, `{{step.StepName.exitCode}}`, `{{step.StepName.stdout}}`) remain available for conditional evaluation in step dependencies and for debugging/diagnostic purposes. Workflow variables are the mechanism for intentional data passing; step outputs are the mechanism for control flow and observability.
+
+If no producing step executed for an output parameter (e.g., all producers were skipped or cancelled), the output parameter resolves to null.
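
Last-writer-wins production and the null fallback can be sketched as follows (illustrative only; the function and parameter names are invented):

```python
def publish_outputs(declared: list, writes: list) -> dict:
    """Map producing steps' writes onto declared workflow variables.

    `writes` is a list of (variable_name, value) pairs in execution order,
    so a later producer overwrites an earlier one. Variables no producer
    wrote resolve to null (None).
    """
    variables = {name: None for name in declared}
    for name, value in writes:
        if name in variables:               # only declared variables are published
            variables[name] = value
    return variables

writes = [("buildId", "a1"), ("buildId", "a2")]     # second producer wins
print(publish_outputs(["buildId", "reportPath"], writes))
# {'buildId': 'a2', 'reportPath': None}
```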
+
+### Variable Type System
+
+Workflow variables store values as: string, number, boolean, null, or collection (ordered list). Collection values are used by ForEach. Variables are serialized as JSON for storage and transport. Type mismatches in comparisons produce evaluation errors.
+
+### Collection Variables
+
+Collection variables are ordered lists (JSON arrays). Collections are produced by step outputs, input parameter declaration, or trigger context. Elements may be string, number, boolean, or null. Nested collections are not supported in 1.0. Maximum size is governed by the configurable maximum variable value size.
+
+### Step Output Capture
+
+Action handlers define named output parameters; outputs are captured automatically. Script and command tasks produce outputs via structured markers in stdout (`##werkr[setOutput name=value]`). Exit codes are captured as `{{step.StepName.exitCode}}`. Full stdout is available as `{{step.StepName.stdout}}`.
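
Marker capture amounts to scanning stdout line by line. This sketch assumes a simple `name=value` grammar with no quoting or escaping, which the spec does not pin down:

```python
import re

# One marker per line: ##werkr[setOutput name=value]
MARKER = re.compile(r"^##werkr\[setOutput (\w+)=(.*)\]$")

def capture_outputs(stdout: str) -> dict:
    """Collect named outputs emitted via structured markers in stdout."""
    outputs = {}
    for line in stdout.splitlines():
        m = MARKER.match(line.strip())
        if m:
            outputs[m.group(1)] = m.group(2)
    return outputs

stdout = """Copying files...
##werkr[setOutput copied=42]
##werkr[setOutput target=C:\\staging]
Done."""
print(capture_outputs(stdout))   # {'copied': '42', 'target': 'C:\\staging'}
```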
+
+### Workflow-Triggered Execution
+
+Workflows can be initiated by the completion of other workflows through the Workflow Completion trigger type (see §4). This provides:
+
+- Event-based workflow chaining without tight coupling.
+- Trigger context includes the source workflow's run ID, workflow ID, and terminal status.
+- **Variable passing** — the triggering workflow can publish output variables that are injected as input variables into the triggered workflow's run via trigger context. This enables data transfer between chained workflows without tight coupling.
+- Triggered workflows are independent runs — no visual parent-child hierarchy in the workflow list, no fan-out. Workflow completion triggers between top-level workflows do not render cross-workflow DAG connections.
+- Composite nodes within a workflow support navigation between inner and outer DAG views in the UI (see §5 Composite Node Execution Model). This is distinct from workflow-completion chaining between top-level workflows.
+- Workflow completion triggers are listed alongside other triggers in the workflow's trigger configuration.
+
+Cross-task dependencies between workflows (a step in Workflow A depending on a step in Workflow B) are explicitly excluded from the 1.0 scope.
+
+### Workflow Versioning
+
+- Immutable versions created on each workflow save.
+- Version history browsable in the UI.
+- Each version stores a complete snapshot of the workflow definition. Version diffs are computed on-demand for the comparison UI.
+- Version comparison with side-by-side visual diff showing added, removed, and changed steps and connections.
+- Rollback to any previous version (creates a new version with the restored content).
+- Multiple versions can have active runs concurrently.
+- Each workflow run records the version that was executed.
+- **Concurrent editing** — concurrent editing uses optimistic concurrency with conflict detection. The second save detects the version conflict; the user may overwrite, reload and re-apply, or export their version as JSON for comparison. Automatic merging is not performed.
+
+### Workflow Enabled/Disabled State
+
+Workflows have an enabled/disabled flag (default: enabled). Disabling a workflow:
+
+- Prevents new trigger-initiated run executions. Trigger evaluation and schedule definitions remain active — schedules associated with multiple workflows continue operating for their other associations.
+- Pauses new steps in active runs: in-progress steps complete their current execution, but no new steps are dispatched.
+- Prevents manual execution.
+
+Re-enabling resumes paused active runs and permits new trigger-initiated and manual executions. Triggers that occurred while the workflow was disabled are discarded. Approval gate timeouts are suspended while the workflow is disabled, consistent with pause behavior.
+
+### Workflow Tags
+
+- Assign tags for organization and filtering.
+- Tags serve as the primary agent targeting mechanism.
+- Tag-based notification subscriptions.
+
+### Workflow Targeting
+
+Workflows and individual steps specify which agents should execute them:
+
+- **Tag-based targeting** — workflows and steps declare target tags. Agents with matching tags are eligible for execution. Tag matching uses case-insensitive set intersection.
+- **Capacity awareness** — when all matched agents are at capacity or offline, queued work waits. The wait reason is visible in the UI.
+- **Targeting inheritance** — steps without explicit targeting configuration inherit the workflow-level targeting. Step-level targeting overrides workflow-level targeting entirely (no merge).
+
+The targeting system uses a strategy pattern. Tag-based resolution is the 1.0 implementation. The targeting specification is stored as a typed JSON structure with a type discriminator, enabling additional resolution strategies to be introduced without modifying existing workflow definitions.
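
The matching and inheritance rules above can be sketched in a few lines (helper names are illustrative, not Werkr's API):

```python
from typing import Optional

def eligible(agent_tags: set, target_tags: set) -> bool:
    # Case-insensitive set intersection: any shared tag makes the agent eligible.
    return bool({t.lower() for t in agent_tags} & {t.lower() for t in target_tags})

def effective_targets(workflow_tags: set, step_tags: Optional[set]) -> set:
    # Step-level targeting, when configured, replaces workflow-level
    # targeting entirely (no merge); otherwise the step inherits it.
    return step_tags if step_tags is not None else workflow_tags

print(eligible({"Windows", "Prod"}, {"prod"}))   # True
print(effective_targets({"prod"}, {"gpu"}))      # {'gpu'}
print(effective_targets({"prod"}, None))         # {'prod'}
```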
+
+### Manual Approval Gates
+
+Steps may be configured as approval gates — the workflow pauses at the step and waits for explicit human approval before proceeding:
+
+- Designated approver roles configured per gate.
+- Approval and rejection with required comments on rejection.
+- Configurable approval timeout with automatic action (approve, reject, or fail) on expiration.
+- A centralized **Pending Actions** view aggregates all workflows awaiting approval across the platform.
+- Approval and rejection actions via both the UI and REST API.
+- All approval decisions are audit-logged with the approving user's identity and timestamp.
+- Approval notification via configured notification channels.
+- **Approval lifecycle** — when a step configured as an approval gate becomes eligible for execution, it enters the Waiting for Approval state; the step is not dispatched to an agent. Approval or rejection is submitted via the UI or REST API. On approval, the step proceeds to agent dispatch and execution. On rejection, the step enters the Failed state without invoking its error handler, and the rejected gate triggers the Any Failed dependency mode on downstream steps that declare it. On timeout, the configured automatic action (approve, reject, or fail) is applied. Approval state is tracked in the application database; agent restart does not affect pending approvals.
+
+### Workflow State Durability
+
+Running workflow state is persisted to the database:
+
+- Incomplete workflow runs are recovered on service startup.
+- Completed steps are not re-executed.
+- The step that was in-flight at the time of interruption is re-evaluated according to its error handling configuration.
+- Recovery semantics are documented and deterministic.
+
+### Execution Semantics and Idempotency
+
+The platform provides **at-least-once** execution semantics for steps interrupted during execution:
+
+- Steps that completed on the agent but whose results were not reported before interruption may re-execute on recovery.
+- Built-in action handlers document their idempotency characteristics (e.g., file create is not idempotent; file copy with overwrite is idempotent). Idempotency information is surfaced in the UI when configuring actions.
+- Users are responsible for designing custom scripts (PowerShell, shell) to be safe for re-execution where retry or recovery is configured.
+
+### Workflow-Level Timeout
+
+- Maximum total duration for a workflow run (default: 24 hours, configurable), distinct from per-task timeouts. Workflows exceeding this timeout are transitioned to Failed. The workflow-level timeout clock is suspended while the run is paused.
+- When a workflow-level timeout is reached, all in-progress steps are cancelled regardless of their individual timeout configurations. Steps waiting for approval are cancelled. The workflow enters a failed state.
+
+### Timeout Activation Rules
+
+Task maximum run duration timers and workflow-level timeout clocks are activated only when the associated entity enters the `Running` state. Time spent in `Pending`, `Queued`, or `Waiting for Approval` states does not count toward any timeout. Workflow-level timeout begins when the run transitions from `Pending` or `Queued` to `Running`. Task-level timeout begins when the step transitions from `Queued` to `Running`.
+
+### Control Precedence Rules
+
+When multiple control mechanisms interact, the following precedence rules apply:
+
+**Timeout precedence (highest to lowest):**
+1. User-initiated cancellation — immediate, overrides all timeouts.
+2. Workflow-level timeout — cancels all in-progress and queued steps.
+3. Task maximum run duration — terminates the individual step.
+4. Approval gate timeout — applies the configured automatic action (approve, reject, or fail).
+
+**Failure precedence:**
+1. User-initiated cancellation — overrides all failure handling; no error handlers execute.
+2. Workflow-level timeout — overrides step-level error handling; no error handlers execute.
+3. Fail Workflow strategy (parallel context) — in-flight steps (including in-flight error handlers) complete but do not retry; no new error handlers are initiated.
+4. Step-level error handling strategy — evaluated for the individual step's failure.
+
+**Cancellation, timeout, and pause behavior:**
+Timeouts, cancellations, and pauses are not caused by execution errors and therefore do not invoke step-level error handlers. Error handlers execute only in response to task execution failures.
+
+**Queue precedence:**
+1. Per-workflow concurrent run limit queue — evaluated first for incoming trigger events.
+2. Agent capacity queue — evaluated at step dispatch time within a running workflow.
+3. Trigger suppression (schedule suppression, rate limiting) — evaluated at trigger evaluation time before run creation.
+
+### Version Binding Precedence
+
+The following table defines which workflow version executes under each scenario:
+
+| Scenario | Workflow Version Used |
+|----------|---------------------|
+| Trigger fires (Latest mode) | Latest workflow version at time of trigger fire |
+| Trigger fires (Pinned mode) | Workflow version associated with the trigger version |
+| Manual execution | Latest workflow version (or user-selected version) |
+| Re-execution (retry from failed step) | Same version as original run (unless entity definition modified during setup, creating a new version) |
+| Replay | Same version as original run (unless entity definition modified during setup, creating a new version) |
+| Re-run with modified inputs | Same version as original run |
+| Workflow completion trigger | Determined by the triggered workflow's trigger binding mode (Latest or Pinned) |
+| Composite node child workflow | Version-bound to parent; child version is part of the parent workflow version snapshot |
+
+### Correlation IDs
+
+- Workflow runs accept a user-defined correlation ID (e.g., ticket number, order ID, deployment ID) at trigger time.
+- Correlation IDs are searchable and filterable in the run history UI.
+- Correlation IDs are exposed in the REST API and JSON export.
+
+### Re-Execution
+
+- **Retry from failed step** — resume a failed workflow run from the point of failure. Completed step outputs are preserved; the failed step and all downstream steps re-execute. The failed step re-executes as if it had never run — all error handling and retry logic applies fresh, and previously executed error handlers for the failed step are not pre-loaded. The run uses the same workflow version as the original unless the user modifies an entity definition during re-execution setup, in which case a new workflow version is created that matches the original except for the changed entity reference. Downstream invalidation from structural changes (e.g., removed variables) is detected and surfaced as an error during the save process. The input variable configuration can optionally be modified before re-execution.
+- **Re-run with modified inputs** — create a new run of the same workflow version, retaining the original input variable values by default; the input variable configuration can optionally be modified before the run starts. All steps execute from the beginning.
+
+### Replay Mode
+
+Select a completed or failed workflow run and create a replay run:
+
+- The replay run uses the workflow version recorded in the original run.
+- All steps re-execute from the beginning. No step outputs are pre-loaded or pinned from the original run.
+- The original run's input variables and trigger context are used as the starting configuration for the replay run.
+- Replay runs are flagged in the run history for traceability and linked to the original run.
+- The input variable configuration can optionally be modified before the replay starts. If only input variable **values** are modified, the existing workflow version is used. If a referenced entity **definition** is modified during replay configuration (e.g., a task argument is changed), a new workflow version is created that copies the original and replaces the changed entity reference with the new version. Downstream variable invalidation from structural changes is detected and surfaced as an error during the save process.
+
+### Execution Operations
+
+- **Step-level I/O inspection** — view inputs (parameters, variables) and outputs (exit code, artifacts, stdout) for any step in a running or completed workflow.
+- **Bulk operations** — pause, resume, restart, or terminate multiple workflow runs simultaneously.
+- **Run-on-demand** — execute any workflow immediately from the UI or API.
+- **Step cancellation** — cancel an individual queued or running step within an active run. The step enters the Cancelled state (no error handling is performed). Step cancellation halts the entire workflow: all remaining non-terminal steps enter the Cancelled state. This is equivalent to cancelling the workflow from a specific step. Independent branch cancellation is not supported in 1.0.
+- **Run export** — export a completed run's execution data (step inputs, outputs, timing, status) as JSON for external analysis.
+
+### Task and Schedule Association
+
+- Link workflows to triggers/schedules and tasks directly from the workflow editor.
+
+### Inline Task and Schedule Creation
+
+- Create new tasks and triggers/schedules from within the workflow editor without navigating away.
+
+### Workflow Deletion
+
+- Workflows must be disabled before deletion. Workflows with active or running runs cannot be deleted. When a workflow is disabled, any queued trigger events for that workflow are discarded with an audit log entry.
+- Deletion is a hard delete of the workflow definition.
+- Historical run data, job output, and audit log entries associated with the deleted workflow are retained and subject to the configured retention policy. Retained run records store a snapshot of the workflow name and version at execution time. References to deleted workflows resolve to the snapshot data. Retained run data and audit log entries for deleted workflows are accessible via the REST API and audit log; they are not surfaced in the UI workflow list.
+
+---
+
+## 6. JSON Import/Export
+
+Portable JSON documents for tasks, workflows, schedules, and full environment configurations.
+
+### Export Capabilities
+
+**Entity Export**
+- Individual tasks, workflows (with steps and variable definitions), and schedules.
+- Bulk export with selection.
+- Export preserves all configuration except credentials.
+
+**Full Environment Export**
+- Tasks, workflows, schedules, roles and permission assignments, agent configuration profiles, holiday calendars, retention policies, and notification channel configurations.
+- Full environment exports include a manifest listing which sections are present and their entity counts.
+
+### Schema Design
+
+- **Schema version header** — every export document includes a version identifier for forward/backward compatibility.
+- **Typed sections** — the export format uses a section-per-entity-type structure. Each section declares its entity type.
+- **Additive extensibility** — importers ignore unrecognized section types, allowing additive entity types in future schema versions without breaking existing import flows.
+- **Schema version validation** — incompatible schema versions are rejected with a descriptive error message.
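+
+The version check and additive-extensibility rules above can be sketched as follows. This is an illustrative Python sketch, not the platform's .NET implementation; the field names (`schemaVersion`, `sections`, `type`) are assumptions about the document shape.
+
```python
# Hypothetical importer sketch: reject unsupported schema versions with a
# descriptive error, walk typed sections, and silently skip section types
# this importer does not recognize (additive extensibility).
SUPPORTED_SCHEMA_VERSIONS = {"1.0"}
KNOWN_SECTION_TYPES = {"tasks", "workflows", "schedules"}

def load_export(document: dict) -> dict:
    version = document.get("schemaVersion")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"Unsupported schema version: {version!r}")
    imported = {}
    for section in document.get("sections", []):
        section_type = section.get("type")
        if section_type not in KNOWN_SECTION_TYPES:
            continue  # future entity types are ignored, not errors
        imported[section_type] = section.get("entities", [])
    return imported
```
+
+Ignoring unknown sections (rather than failing) is what lets future schema versions add entity types without breaking existing import flows.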
+
+### Import Capabilities
+
+- **Referential integrity validation** — unknown entity references are rejected.
+- **Import preview** — diff-style preview UI before committing. Shows entities to be created, updated, or skipped.
+- **Conflict resolution** — skip, overwrite, or rename conflicting entities.
+- **Admin-only import** — import operations require elevated permissions. All imports are audit-logged.
+- **Unknown type discriminators** — definitions containing unknown type discriminators (e.g., unrecognized targeting strategy types) are rejected during import with a descriptive error identifying the unknown type.
+- **Missing entity version references** — if an imported workflow references a task version, trigger version, or schedule version that does not exist in the target environment, the import fails with a validation error listing all unresolved references. The import preview surfaces these errors before commit. Partial imports are not supported — all entity references must be resolvable for the import to succeed.
+- **Configurable size limits** — maximum payload size, maximum workflow step count, and maximum nesting depth.
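+
+The all-or-nothing reference check above can be sketched as collecting every unresolved reference before failing, so the validation error (and the import preview) can list them all at once. A hedged Python sketch with illustrative field names:
+
```python
# Hypothetical referential-integrity check: a workflow import is valid only
# if every referenced (type, id, version) exists in the target environment.
# Returns the full list of unresolved references rather than stopping at the
# first, so the error message can enumerate them all.
def validate_references(workflows: list, available_versions: set) -> list:
    unresolved = []
    for wf in workflows:
        for ref in wf.get("references", []):
            if (ref["type"], ref["id"], ref["version"]) not in available_versions:
                unresolved.append(ref)
    return unresolved  # empty list means the import may proceed
```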
+
+### Security
+
+- **Credential stripping** — exports never include credential values. Credentials require manual re-entry or a separate secure import process.
+- **Redacted variable stripping** — variables flagged with `redact from logs` have their default values and resolved values stripped from exports. Redact-flagged variables are treated as sensitive data for export purposes.
+
+### Use Cases
+
+- **CI/CD integration** — store workflow definitions in source control; deploy across environments via API.
+- **Environment replication** — export a full environment and import elsewhere for configuration-level recovery or environment promotion (dev → staging → prod).
+- **Configuration as code** — manage Werkr configuration alongside application infrastructure.
+
+---
+
+## 7. User Interface
+
+The UI is a Blazor Server web application with a workflow-centric experience.
+
+### Navigation & Layout
+
+- **Workflows as default landing page** — the root route displays a workflow dashboard.
+- **Workflow dashboard** — tags, next scheduled run, last run status, step count, search, and filters for at-a-glance operational awareness.
+- **Navigation hierarchy** — workflows occupy top-level navigation. Tasks, Schedules, and Administration are grouped as secondary.
+- **Global search** — keyboard-accessible command palette (Cmd+K / Ctrl+K) for fast navigation to any workflow, task, agent, or setting from any page.
+- **Saved filter views** — named filter combinations persisted per user for repeated monitoring scenarios. Users can share saved views, which are then persisted to the server and available to other users. Creation and use of shared views are governed by permissions assignable to roles; by default, Operators and Admins can create shared views, and all users can use them.
+- **Adaptive navigation** — the navigation structure adapts to the active platform configuration.
+
+### Run Monitoring
+
+**Multi-View Run Detail**
+
+| View | Description |
+|------|-------------|
+| **Compact** | Summary with status, duration, step counts. |
+| **Timeline** | Gantt-style chronological step execution. |
+| **Grid** | Step × run matrix for cross-run pattern recognition. |
+| **Log** | Raw output stream. |
+
+**Real-Time Updates**
+- SignalR push for step status changes, log output, and progress indicators. Updates arrive live without polling.
+- Update latency target: < 500ms from event to UI.
+- Graceful degradation when SignalR connection drops — the UI continues to function with a visible reconnection indicator.
+
+**Status Visualization**
+- Color-coded DAG nodes showing per-run execution status.
+- Waiting state categorization: waiting for event, resource, host availability, approval.
+- Event grouping for repeated events (e.g., retry loops) collapsed into grouped summaries.
+
+**Grid View**
+- 2D matrix: rows = steps (topological order), columns = runs (chronological). Each cell represents one step's execution in one run.
+- Color-coded cells for step status.
+- Click cell to view step details.
+- Filter by step name, status, time range.
+- Paging for run columns with configurable defaults to manage large run histories.
+
+**Sparkline Run History**
+- Mini bar charts on the workflow list.
+- Bar height = duration (relative to workflow's history).
+- Bar color = status.
+- Quick pattern recognition for anomalies.
+
+### DAG Visualization (Read-Only)
+
+- **High-performance graph renderer** — JavaScript graph library with Blazor interop for rendering, pan, zoom, and layout.
+- **Custom styled nodes** — rich HTML nodes displaying step name, type, status, and error handling indicator.
+- **Automatic hierarchical layout** — automatic DAG layout with manual position override. Positions are persisted per workflow.
+- **Viewport virtualization** — only nodes and edges within the visible viewport are rendered. Edge culling for off-screen elements.
+- **Layout caching** — computed layouts are cached client-side to eliminate redundant layout computations on subsequent views.
+- **Navigation aids** — minimap, zoom/pan, fit-to-content.
+- **Parallel grouping** — visual background lanes for concurrent steps, making parallelism immediately apparent.
+
+### Interactive DAG Editor
+
+**Canvas Interactions**
+- Categorized step palette with drag-to-canvas.
+- Port-based connections (input top, output bottom) with visual feedback during drawing.
+- Smart insertion — drop a connection onto empty canvas space to create a new node pre-connected; drop a node onto an existing edge to insert it inline.
+- Grid snapping and alignment guides.
+- Zoom, pan, fit-to-content.
+
+**Editing Operations**
+- Undo/redo with transaction-based history. Undo history retains at least 100 operations per editing session. Grouped operations (e.g., delete node + its connections) undo as a single step.
+- Cut, copy, paste nodes and edges.
+- Multi-select with Shift+click or marquee.
+- Delete with confirmation for connected nodes.
+- Keyboard shortcuts: Delete, Ctrl+Z, Ctrl+Y, Ctrl+C, Ctrl+V, Ctrl+A.
+
+**Validation**
+- Client-side cycle detection on every connection prevents invalid DAG structures.
+- Server-side validation as defense in depth.
+- Visual feedback for invalid operations.
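+
+The client-side cycle check amounts to a reachability test: a new connection from source to target is invalid if target can already reach source. A minimal sketch (illustrative only; the production check runs client-side in the JavaScript editor):
+
```python
# Would adding the edge source -> target close a cycle? Yes, exactly when
# source is already reachable from target. Iterative DFS over the adjacency
# map; edges: dict mapping node -> set of downstream nodes.
def would_create_cycle(edges: dict, source: str, target: str) -> bool:
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node == source:
            return True  # target reaches source: the new edge closes a loop
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return False
```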
+
+**Configuration**
+- Inline configuration panel for selected node — full step property editing (task reference, variable bindings, error handling, targeting) without leaving the editor.
+- Dirtiness tracking — visual indicator on nodes modified since the last execution.
+
+**Architecture**
+- The DAG editor maintains a complete client-side interaction model. Drag, connect, snap, and zoom operations are handled entirely client-side for immediate responsiveness. The server is notified on state-change commits (save, undo checkpoints), not on every interaction.
+
+**Editor Modes**
+- Visual DAG editor (default).
+- JSON view (read-write) — displays and allows editing of the underlying workflow definition as JSON. The editor validates JSON structure and workflow schema on save. Switching between visual and JSON modes triggers a parse-and-validate cycle. Only one mode is active at a time (visual OR JSON, not simultaneous).
+
+### Workflow Version Diff View
+
+- Side-by-side comparison of any two versions of a workflow.
+- Step additions, deletions, and modifications highlighted.
+- Connection changes highlighted.
+- Navigation between changes (next change / previous change).
+
+### Timeline View
+
+- **Gantt-style visualization** — horizontal bars per step showing duration, status, and timing relative to other steps in the run.
+- **Real-time liveness** — bars grow during active runs.
+- **Zoom and pan** — horizontal zoom/pan on the time axis.
+- **Saved filter views** — see §7 Navigation & Layout for saved filter view details.
+
+### Pending Actions View
+
+- **Centralized approval queue** — all workflows currently paused at manual approval gates, aggregated across the platform.
+- **Approval context** — shows the workflow name, run ID, step name, requester, and time waiting.
+- **One-click approve/reject** — with required comment on rejection.
+
+### Color Standard
+
+Unified semantic status colors across every surface (dashboards, badges, DAG nodes, grid cells, timeline bars) for both dark and light themes:
+
+| Status | Color |
+|--------|-------|
+| Succeeded | Green |
+| Failed | Red |
+| Running | Yellow / Amber |
+| Pending | Purple |
+| Queued | Indigo |
+| Skipped | Gray |
+| Waiting for Approval | Blue |
+| Recovered | Light Green |
+| Upstream Failed | Orange |
+| Cancelled | Gray / Dark Gray |
+| Paused | Teal |
+
+- CSS custom property system — all colors defined as CSS variables, enabling theme customization.
+
+### Audit Log UI
+
+- **Admin-visible audit trail** — searchable record of who performed what action and when.
+- **Covered actions** — workflow creates/edits/deletes, task execution, user management, role and permission changes, configuration changes, agent registration/deregistration, credential access, trigger configuration, approval gate decisions, data retention operations, import/export operations, authentication events (successful and failed login attempts, account lockouts, 2FA failures).
+- **Search and filter** — filter by user, action type, entity type, entity ID, event category, module, and time range.
+- **Export** — audit log entries are exportable for compliance and external analysis.
+- **Dynamic event types** — event types are registered by system components at startup. Event categories appear in filter options automatically.
+
+---
+
+## 8. Notifications
+
+Platform-level notification channels that alert operators to workflow and system events.
+
+### Channel Architecture
+
+Notifications are delivered through a channel-based abstraction. Each channel type implements a common delivery interface:
+
+| Channel | Description |
+|---------|-------------|
+| **Email** | SMTP-based email delivery with configurable sender, subject templates, and HTML body. |
+| **Webhook (HTTP Callback)** | HTTP POST to a configured URL with a JSON payload describing the event. Supports configurable authentication: header-based (custom header with secret value) and HMAC-SHA-512 signature verification (the request body is signed with a shared secret and the signature is sent in a header). This notification channel is distinct from the Send Webhook action handler (§3 Built-in Action Handlers). |
+| **In-App** | SignalR-based browser notifications for users currently in the UI. In-App notifications are persisted to the database. Users receive queued notifications upon next login. |
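+
+The HMAC-SHA-512 option can be sketched as follows. The header name and hex encoding are assumptions for illustration; only the body-signed-with-shared-secret scheme comes from the table above.
+
```python
import hmac
import hashlib

# Sender side: sign the raw request body with the shared secret and place
# the hex digest in a header (e.g., a hypothetical "X-Werkr-Signature").
def sign_body(secret: bytes, body: bytes) -> str:
    return hmac.new(secret, body, hashlib.sha512).hexdigest()

# Receiver side: recompute the signature over the received body and compare
# in constant time to resist timing attacks.
def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    return hmac.compare_digest(sign_body(secret, body), header_value)
```
+
+Signing the raw body (rather than a parsed form) matters: any re-serialization on either side would change the bytes and invalidate the signature.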
+
+### Channel Management
+
+- Channels are configured once at the platform level. Channels are shared platform infrastructure — any platform component may deliver notifications through configured channels.
+- **Channel delivery interface** — the channel delivery interface is a standalone service that accepts delivery requests (recipient, channel, template, payload). The subscription model is one routing layer that produces delivery requests; other system components may produce delivery requests through the same interface.
+- **Channel test operations** — administrators can send a test notification to verify channel configuration before relying on it in production.
+
+### Notification Templates
+
+- **Templated message payloads** — customizable message templates per channel and event type.
+- Default templates ship with the platform for all event types and channels.
+- Template variables include (but are not limited to): workflow name, run ID, step name, error summary, timestamp, and direct links to run detail pages. The complete list is documented per event type in the platform documentation.
+
+### Subscription Model
+
+- **Per-workflow opt-in** — each workflow can opt into notifications for failure, success, or completion events. Defaults to off.
+- **Tag-based subscriptions** — subscribe to events for all workflows matching a tag, enabling team-level alerting without per-workflow configuration.
+- **Per-user notification preferences** — users configure their personal notification preferences including opted-in event types and preferred delivery channels.
+- **Event-type subscriptions** — subscriptions can target the following event categories:
+ - **Workflow execution** — run started, run completed, run failed.
+ - **Approval** — approval requested, approved, rejected, timed out.
+ - **Schedule** — trigger fired, trigger suppressed (holiday).
+ - **Security** — authentication failure, authorization failure, key rotation.
+ - **System** — agent online, agent offline, configuration change.
+
+Event categories are registered at application startup. Each category specifies a unique ID, display name, and default subscription behavior. 1.0 categories: Workflow execution, Approval, Schedule, Security, System.
+
+### Delivery Behavior
+
+- **Failure context** — failure notifications include the failed step name, error summary, run ID, and a direct link to the run detail page.
+- **Delivery tracking** — each notification records its delivery status (sent, failed) with timestamps.
+- **Retry** — failed deliveries are retried with configurable retry count and backoff. The retry queue is persisted (surviving service restart). Deliveries that exhaust all retry attempts are recorded as permanently failed with a configurable dead-letter retention window.
+- **Audit logging** — channel configuration changes, subscription changes, and delivery outcomes are audit-logged.
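+
+The retry-then-dead-letter behavior can be sketched as below. Exponential backoff is an assumption (the spec says only "configurable retry count and backoff"); delays are returned rather than slept, and persistence is omitted, for clarity.
+
```python
# Hypothetical delivery retry sketch: attempt delivery up to max_retries
# additional times after the first failure, recording the backoff delay that
# would precede each retry. Deliveries that exhaust all attempts are marked
# permanently failed (dead-lettered).
def plan_delivery(send, max_retries: int = 3, base_delay: float = 1.0):
    """Call send() until success or exhaustion; return (status, delays_used)."""
    delays = []
    for attempt in range(max_retries + 1):
        if send():
            return "sent", delays
        if attempt < max_retries:
            delays.append(base_delay * (2 ** attempt))  # backoff before retry
    return "dead-lettered", delays
```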
+
+---
+
+## 9. Security
+
+### Transport & Communication
+
+- **TLS mandatory** — all connections (browser → Server, Server → API, API → Agent) require HTTPS/TLS. URL scheme validation is enforced at registration, channel creation, and gRPC channel construction. HTTP URLs are explicitly rejected.
+- **Encrypted gRPC payloads** — all API ↔ Agent gRPC payloads are wrapped in an AES-256-GCM encrypted envelope. Payload content remains encrypted independent of transport layer. The envelope supports arbitrary inner payload types, enabling new gRPC services to use the same encryption without modifying the envelope contract.
+- **User-scoped API forwarding** — API calls originating from the UI carry the authenticated user's identity, role, and permissions. UI actions are authorized at the user's permission level, not an elevated service account. Background server-initiated operations (health monitoring) use a separate administrative channel.
+- **System service identity** — trigger-initiated workflow execution (schedule, file monitor, workflow completion, API triggers) uses a system service identity. Trigger configuration requires elevated permissions, which gates what workflows can be auto-triggered.
+
+### Authentication
+
+- **ASP.NET Identity** — user management with password hashing, account lockout, and email confirmation.
+- **TOTP two-factor authentication** — built-in time-based one-time password 2FA with recovery codes. TOTP enrollment generates 10 single-use recovery codes, which users may regenerate at any time (invalidating all previous codes).
+- **Passkey support** — WebAuthn/FIDO2 passkeys supported as both a primary authentication method (passwordless) and as an optional second-factor method. Users can register one or more passkeys alongside or instead of TOTP.
+- **Password policy** — minimum length (≥ 12 characters), no character-class composition rules, and password history enforcement (5 previous passwords, configurable), aligned with NIST SP 800-63B §5.1.1.2.
+- **Login rate limiting** — per-IP rate limits on authentication endpoints to mitigate credential stuffing and brute-force attacks, complementing per-account lockout.
+- **2FA enforcement** — administrators can require 2FA enrollment for all users or specific roles.
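+
+The password policy rules can be sketched as a simple gate. This is illustrative Python only; the plain SHA-256 stand-in is not how passwords should be hashed (ASP.NET Identity uses a salted, slow hash), and the function names are assumptions.
+
```python
import hashlib

# Placeholder hash for the sketch only; production code must use a salted,
# deliberately slow password hash, never bare SHA-256.
def hash_password(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()

# Policy per the spec: length >= 12, no character-class composition rules,
# and rejection of any of the last `history` passwords.
def check_password(candidate: str, previous_hashes: list,
                   history: int = 5, min_length: int = 12) -> str:
    if len(candidate) < min_length:
        return "too short"
    if hash_password(candidate) in previous_hashes[-history:]:
        return "reused"  # password history enforcement
    return "ok"  # note: no uppercase/digit/symbol requirements, per NIST
```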
+
+### Authorization
+
+- **Custom roles with granular permissions** — administrators create custom roles and assign fine-grained permissions. The permission model uses a hierarchical `resource:action` naming convention (e.g., `workflows:execute`, `agents:manage`, `settings:write`, `views:create`, `views:share`).
+- **Permission registration** — permissions are registered at application startup.
+- **Policy-based authorization** — every API endpoint and UI page is protected by permission-based policies rather than fixed role checks.
+- **Built-in roles** — Admin, Operator, and Viewer ship as non-deletable default roles with predefined permission sets.
+ - **Admin** — all permissions.
+ - **Operator** — create, read, update, execute operations.
+ - **Viewer** — read-only access.
+- **Per-workflow execution permissions** — roles may be granted or denied execution permission on specific workflows.
+- **Role management UI** — create roles, assign permissions via a matrix interface, and map users to roles.
+- **Scoped permissions** — permissions are organized under their owning domain namespace. All registered permissions appear in the role management UI.
+
+### API Keys
+
+Programmatic access for CI/CD pipelines, external integrations, and automation:
+
+- **Key lifecycle** — create, revoke, and rotate API keys via the UI and API. Keys are displayed once at creation and stored hashed.
+- **Expiration** — configurable expiration dates. Last-used timestamp tracking.
+- **Permission scoping** — at creation, the user selects which of their permissions the key carries; all permissions are selected by default. If the creator's permissions are subsequently **reduced** (role demotion or permission removal), all active API keys for that user are fully revoked. The user must create new API keys after their permissions change. Permission **additions** to the creator's role do not retroactively expand existing keys or require key recreation. Keys cannot exceed the creator's current permissions.
+- **Rate limiting** — per-key rate limits. There are no concurrency limits on simultaneous use of the same API key from multiple clients.
+- **Audit logging** — all key creation, revocation, rotation, and usage events are recorded.
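+
+The permission-scoping rules above reduce to two set operations: at creation, the requested scope must be a subset of the creator's permissions; on any later reduction of the creator's permissions, all of their keys are revoked (pure additions change nothing). A hedged sketch with illustrative names:
+
```python
# Key creation: the key's scope cannot exceed the creator's permissions.
def create_key(creator_permissions: set, requested: set) -> set:
    if not requested <= creator_permissions:
        raise PermissionError("key cannot exceed creator's permissions")
    return set(requested)

# Permission change: if any permission was removed (old is not a subset of
# new), every active key for that user is fully revoked; additions leave
# existing keys untouched.
def keys_after_permission_change(old: set, new: set, active_keys: list) -> list:
    if not old <= new:
        return []           # reduction: revoke all keys
    return active_keys      # pure addition: keys unchanged
```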
+
+### Agent Registration
+
+- **Encrypted bundle exchange** — agents register using a password-encrypted bundle containing the API's public key and a correlation token. Administrator-created bundles with configurable expiration.
+- **Encrypted envelope registration** — all registration fields (agent URL, name, bundle ID, public key) are protected in a single encrypted envelope. A non-secret hash-based lookup prevents leaking registration data.
+- **RSA + AES hybrid encryption** — the agent's public key is hybrid-encrypted with the API's public key during registration.
+- **Shared key establishment** — a shared symmetric key is established during registration for all subsequent encrypted communication.
+- **Shared key rotation** — periodic key rotation initiated by the API. During rotation, both the current and previous keys are valid for a configurable grace period (default: 5 minutes) to avoid disrupting in-flight messages. After the grace period, the previous key is invalidated. Key rotation events are audit-logged.
+
+**Key rotation failure modes** — if an agent is unreachable during key rotation, the API retains the current key and retries rotation on the next successful heartbeat. If an agent presents an expired key after the grace period, the API rejects the request and the agent must re-register. Envelope version mismatches (e.g., agent using an older envelope format) are rejected with a descriptive error; the agent logs the failure and attempts reconnection with the current envelope version. All key rotation failures are audit-logged.
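+
+The grace-period rule can be sketched as a key-acceptance check: the current key is always valid, the previous key is valid only within the grace window (default 5 minutes = 300 seconds), and anything else forces re-registration. Illustrative names; timestamps are plain seconds for clarity.
+
```python
GRACE_SECONDS = 300  # default 5-minute grace period after rotation

# Accept a message's key id if it is the current key, or the previous key
# while still inside the grace window. An expired previous key is rejected
# and the agent must re-register.
def is_key_accepted(key_id: str, current_key_id: str, previous_key_id: str,
                    rotated_at: float, now: float) -> bool:
    if key_id == current_key_id:
        return True
    if key_id == previous_key_id and (now - rotated_at) <= GRACE_SECONDS:
        return True
    return False
```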
+
+### Agent Management
+
+- **Agent dashboard** — status overview showing each agent's heartbeat state (online, offline), last-seen timestamp, OS, platform, version, and reported capabilities.
+- **Heartbeat configuration** — agent heartbeat interval: 30 seconds (configurable). An agent is considered offline after 3 consecutive missed heartbeats (90 seconds, configurable).
+- **Agent deregistration** — decommission an agent by revoking its keys and cleaning up references. Audit-logged.
+- **Capacity configuration** — maximum concurrent tasks per agent. When all matched agents are at capacity or offline, queued work waits with visibility into the wait reason.
+- **Agent offline mid-job** — when an agent becomes unreachable mid-job, the API considers the agent's in-flight jobs as still running. Jobs transition to failed when the first of the following thresholds is exceeded: (1) task maximum run duration, (2) agent heartbeat timeout, (3) workflow-level timeout.
+- **System-generated agent tags** — each agent receives a unique, system-generated tag (`agent:{agent-id}`) at registration time. This tag is non-editable and non-deletable. It enables precise single-agent targeting when workflows or steps need to execute on a specific host.
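+
+The heartbeat rule above (offline after 3 consecutive missed 30-second heartbeats) reduces to a threshold on elapsed time since the last heartbeat. A minimal sketch; treating exactly 90 seconds as still online is an assumption about boundary handling:
+
```python
# An agent is offline once more than missed_threshold * interval seconds
# (default 3 * 30s = 90s) have elapsed since its last heartbeat. Both
# values are configurable, per the spec.
def agent_status(last_seen: float, now: float,
                 interval: float = 30, missed_threshold: int = 3) -> str:
    return "offline" if (now - last_seen) > interval * missed_threshold else "online"
```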
+
+### User Management
+
+- **User invitation** — administrators create user accounts with initial role assignments.
+- **Password reset** — self-service forgot-password flow via email.
+- **User deactivation** — suspend a user without deleting their account or audit history.
+- **Session management** — administrators can view and revoke active user sessions. Revoked sessions are invalidated immediately; the affected user is required to re-authenticate. Default maximum session count per user: 5. When exceeded, the oldest session is automatically revoked.
+- **User activity audit logging** — user lifecycle events and session events are audit-logged.
+
+### Data Protection
+
+- **Database encryption at rest** — transparent column-level AES-256-GCM encryption for sensitive data (credentials, variable values, connection strings, API key hashes). Platform-appropriate key management (DPAPI on Windows, Keychain on macOS, protected file on Linux). Key rotation with zero-downtime re-encryption. Migration tool for encrypting existing data on upgrade.
+- **Path allowlisting** — agents validate all file paths against a configured allowlist before execution. The allowlist supports standard glob patterns (`*` as a multi-character wildcard and `?` as a single-character wildcard) and serves as a guardrail against accidental or malicious access to unauthorized filesystem locations. Canonical path resolution, 8.3 short-path expansion (Windows), symlink resolution, and rejection of traversal sequences and dangerous path prefixes prevent allowlist bypasses.
+
+Path allowlists are configured per-agent through the agent settings UI. Each agent's allowlist defines which filesystem paths the agent is permitted to access during task execution. The default posture is deny-all — agents with an empty allowlist cannot access any filesystem paths. Administrators configure allowlists individually per agent. Allowlist changes are audit-logged and distributed to the agent via the encrypted gRPC configuration synchronization channel.
+
+- **Platform-native secret storage** — bootstrap credentials (database connection strings, Kestrel bindings) stored in OS-native secret stores rather than plaintext configuration files. On Linux: a file with restricted permissions (owner-only read, mode 0600) in a platform-standard directory (e.g., `/etc/werkr/keys/`).
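+
+The path allowlisting check described above can be sketched as canonicalize-then-match. This Python sketch uses POSIX paths and `fnmatch` as a stand-in for the agent's matcher; it is illustrative only and omits 8.3 short-path expansion and dangerous-prefix rejection.
+
```python
import fnmatch
import os

# Deny-all posture: an empty allowlist permits nothing. The candidate path is
# canonicalized first (resolving symlinks and ".." traversal sequences), so a
# path like /var/data/../secrets/key is matched as /var/secrets/key, not as
# something under /var/data.
def is_path_allowed(candidate: str, allowlist: list) -> bool:
    resolved = os.path.realpath(candidate)
    return any(fnmatch.fnmatch(resolved, pattern) for pattern in allowlist)
```
+
+Matching the resolved path, never the raw input, is the whole point: traversal and symlink tricks must not let a path escape the allowed roots.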
+
+### Outbound Request Controls
+
+- **URL allowlisting** — the HTTP Request, Send Webhook, and File Download/Upload action handlers validate target URLs against a configurable allowlist. Requests to URLs not on the allowlist are rejected.
+- **Private network protection** — requests to private/internal IP ranges (RFC 1918, link-local, loopback) are blocked by default. An explicit override is required to permit internal network targets.
+- **DNS rebinding protection** — resolved IP addresses are validated against the allowlist after DNS resolution to prevent DNS rebinding attacks.
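+
+The private-network guard can be sketched with the standard-library `ipaddress` module: each resolved address is checked against blocked ranges (RFC 1918, link-local, loopback) unless the explicit override is set. Function and parameter names are illustrative.
+
```python
import ipaddress

# Run this check against each IP *after* DNS resolution (per the DNS
# rebinding protection above), not against the hostname in the URL.
def is_address_blocked(ip: str, allow_private: bool = False) -> bool:
    if allow_private:
        return False  # explicit override to permit internal targets
    addr = ipaddress.ip_address(ip)
    return addr.is_private or addr.is_loopback or addr.is_link_local
```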
+
+### Compliance Alignment
+
+The security architecture aligns with OWASP Top 10 mitigations and NIST SP 800-63B authentication guidelines. Specific compliance mapping is maintained in the security documentation.
+
+### Content Security Policy
+
+The Blazor Server UI enforces Content Security Policy (CSP) headers with directives appropriate for Blazor Server rendering, SignalR connections, and JavaScript interop for the DAG editor.
+
+---
+
+## 10. Agent Architecture
+
+### Three-Tier Topology
+
+| Component | Role | Database |
+|-----------|------|----------|
+| **Werkr.Server** | Blazor Server UI, identity provider, user authentication | Identity DB (PostgreSQL or SQLite) |
+| **Werkr.Api** | REST API, gRPC host, application logic | Application DB (PostgreSQL or SQLite) |
+| **Werkr.Agent** | Task execution, schedule evaluation, gRPC services | Local DB (PostgreSQL or SQLite) |
+
+The Server has no direct communication with the Agent. All agent management flows through the API.
+
+### Communication Model
+
+| Path | Protocol | Purpose |
+|------|----------|---------|
+| User → Server | HTTPS | Browser sessions (Blazor Server + SignalR) |
+| User → API | HTTPS/REST | Direct REST API access |
+| Server → API | HTTPS/REST | Server calls API endpoints; Server is not aware of agents |
+| API ↔ Agent | gRPC over TLS | All agent interaction — registration, schedule sync, job reporting, command dispatch, configuration push |
+
+After agent registration and the initial heartbeat, the primary communication pattern is **agent-initiated**: agents establish and maintain persistent gRPC connections to the API. The API pushes notifications and commands through these agent-initiated connections. Administrators must configure and enable network access from agents to the API on the configured gRPC port.
+
+**Agent-hosted services** (listed below as "API → Agent") operate over the agent-initiated persistent connection — the API sends requests through the existing agent-maintained channel. These do not require inbound network connectivity to the agent. A limited set of features (e.g., server address rebroadcast after API address change) may require true API-initiated connections to the agent; if the agent is behind a NAT or firewall without inbound connectivity, these operations will fail and may require manual resolution on the agent.
+
+### Capability Registration
+
+- Agents report their capabilities (supported task types, installed action handlers, OS platform, architecture, agent version) to the API during registration and via periodic heartbeat.
+- The API uses reported capabilities for routing decisions and validates that a target agent supports the required capabilities before dispatching work.
+- Capabilities are displayed on the agent dashboard.
+- Capability versioning for compatibility tracking.
+
+### Module Architecture
+
+The agent supports a modular architecture:
+
+- **Module lifecycle** — modules register through a defined lifecycle (initialization, startup, shutdown). Modules register their own gRPC services, background tasks, configuration handlers, and local database tables.
+- **Module contract** — modules implement a standard lifecycle interface with `Initialize()`, `Configure()`, `Start()`, and `Stop()` methods. `Initialize()` is called first for dependency and service registration. `Configure()` is called next to apply configuration. `Start()` begins the module's runtime operations. `Stop()` is called during shutdown for resource cleanup — it does not imply deactivation. Modules register gRPC services through the agent's service registration mechanism during initialization. Module database tables use a module-specific schema prefix. Module activation state is managed via the centralized configuration system.
+- **Module isolation** — each module manages its own lifecycle without affecting other modules or core agent functionality. Module-specific database tables do not conflict with core agent schema or other modules.
+- **Module activation** — configuration-driven activation of extension modules.
+- **Module configuration** — modules receive configuration from the centralized configuration system via the existing encrypted gRPC channel.
+- **Installer layout** — the agent installer uses a modular directory layout.
+- **Core independence** — the core agent runtime operates independently of extension modules. Built-in modules are foundational and always loaded.
+- **1.0 modules** — two built-in modules ship with 1.0; module names use PascalCase.
+ - **TaskExecution** — the core task execution engine. All script and command execution (PowerShell Script, PowerShell Command, Shell Script, Shell Command) is handled by this module. Always active; cannot be deactivated.
+ - **DefaultActions** — built-in action handlers for non-script/command task types; action handler execution routes through it. Active by default; administrators can deactivate it via the agent configuration UI or during installation, leaving only script and command task types available on that agent.
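+
+The module contract's call order (Initialize, then Configure, then Start, with Stop at shutdown) can be sketched as below. The real contract is a .NET interface; this Python shape is illustrative only.
+
```python
# Minimal sketch of the module lifecycle contract. Each method records its
# call so the order can be observed.
class Module:
    def __init__(self):
        self.calls = []

    def initialize(self):
        self.calls.append("Initialize")  # dependency and service registration

    def configure(self):
        self.calls.append("Configure")   # apply centralized configuration

    def start(self):
        self.calls.append("Start")       # begin runtime operations

    def stop(self):
        self.calls.append("Stop")        # shutdown cleanup, not deactivation

# The host drives the startup phases in the contract's fixed order.
def run_lifecycle(module: Module) -> Module:
    module.initialize()
    module.configure()
    module.start()
    return module
```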
+
+### Module Database Migration
+
+Each module provides its own `DbContext` with an independent migration history. Module-specific database tables use a module-specific schema prefix (e.g., `modulename_*`) to avoid conflicts with core agent tables or other modules. Module uninstallation does not automatically drop tables — a separate administrative cleanup and migration tool is provided.
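+
+As an illustration of the prefix convention, a hypothetical module's `DbContext` might map its tables as follows (the module and entity names are invented for this example):
+
+```csharp
+public class InventoryModuleDbContext : DbContext
+{
+    public DbSet<ScanResult> ScanResults => Set<ScanResult>();
+
+    protected override void OnModelCreating(ModelBuilder builder)
+    {
+        // The module-specific prefix keeps this table clear of core agent
+        // tables and of tables owned by other modules.
+        builder.Entity<ScanResult>().ToTable("inventorymodule_scan_results");
+    }
+}
+```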
+
+### Action Handler Discovery
+
+- Action handlers implementing the handler interface are automatically discovered and registered at startup via assembly scanning.
+- Handlers are organized into categories for the step palette and the API.
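+
+A sketch of the startup scan, assuming discovered handlers are registered into the DI container (the registration call is an assumption; `IActionHandler` is the handler interface named in the architecture documentation):
+
+```csharp
+// Discover concrete IActionHandler implementations in loaded assemblies.
+var handlerTypes = AppDomain.CurrentDomain.GetAssemblies()
+    .SelectMany(assembly => assembly.GetTypes())
+    .Where(type => typeof(IActionHandler).IsAssignableFrom(type)
+                   && type is { IsAbstract: false, IsInterface: false });
+
+foreach (var type in handlerTypes)
+{
+    services.AddSingleton(typeof(IActionHandler), type);
+}
+```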
+
+### gRPC Services
+
+All gRPC communication between the API and Agent uses the encrypted envelope pattern after initial registration.
+
+**API-hosted services** (Agent → API):
+- Agent registration handshake.
+- Schedule synchronization.
+- Job result reporting.
+- Workflow execution acknowledgment.
+- Trigger-fired notification (agent reports that a schedule or file trigger has fired).
+
+**Agent-hosted services** (API → Agent):
+- Connection management (heartbeat with pending-approval state sync, key rotation).
+- Schedule invalidation push notifications.
+- Job output retrieval on demand.
+- Action execution streaming.
+- Shell/PowerShell execution streaming.
+- Configuration synchronization.
+- Approval decision push notifications.
+
+gRPC services are independently registered. Adding new services does not require modifying existing registrations. Proto file organization follows domain-based namespace conventions.
+
+All API ↔ Agent communication is push-based. Neither the API nor the agent polls the other for state changes.
+
+### gRPC Flow Control
+
+- All gRPC services share a standard response pattern for backpressure signaling (throttle status, retry-after hints).
+- Bounded ingestion for high-frequency gRPC services (status reporting, job result submission) using an accept-queue-process pattern with configurable queue depths.
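+
+The shared backpressure fields could look something like the proto fragment below. Field names and numbers are illustrative only; the definitive definitions live with the rest of the proto files:
+
+```protobuf
+message FlowControlStatus {
+  bool throttled = 1;         // receiver is at or near its bounded queue limit
+  uint32 retry_after_ms = 2;  // hint: how long the sender should wait before retrying
+}
+```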
+
+### Agent Version Compatibility
+
+- Agents report their version during registration and via heartbeat.
+- The API tracks a minimum compatible agent version.
+- Agents below the minimum compatible version are rejected at registration with a descriptive error message.
+- During rolling upgrades, agents running the previous minor version remain compatible with the current API version.
+- Capability reporting (see §10 Capability Registration) serves as the feature-level compatibility mechanism — the API verifies that a target agent supports the required capabilities before dispatching work.
+- Agent updates are managed manually by administrators.
+
+### Resource Management
+
+- A **capacity unit** represents one actively executing workflow task. Background operations (e.g., configuration synchronization, schedule evaluation) do not consume capacity units.
+- Maximum concurrent task enforcement per agent.
+- Task queuing when at capacity.
+- Resource cleanup on task completion.
+- Graceful shutdown with task completion.
+- **Output size limits** — configurable maximum output size per task execution on the agent. Output exceeding the limit is truncated with a marker indicating truncation.
+
+---
+
+## 11. Centralized Configuration
+
+### Database-Backed Settings
+
+- Runtime configuration stored in the application database for all non-startup settings.
+- Minimal bootstrap settings remain file-based: database connection string, Kestrel binding, and log level. Startup secrets are stored in the OS's default credential storage.
+- All other settings are managed centrally through the UI and API.
+
+### Settings Management UI
+
+- View and edit configuration values organized by category (server, agent, workflow, security, network).
+- Input validation per setting type with immediate feedback.
+- Change preview before commit.
+
+### Encrypted Credential Storage
+
+- Credentials (SMTP passwords, API keys for integrations, connection strings) are encrypted at rest in the configuration database using the platform's column-level AES-256-GCM encryption (see §9 Data Protection).
+- Distribution to agents via encrypted gRPC on demand.
+- Per-agent credential scoping — agents only receive credentials assigned to them.
+
+### Credential Management
+
+Credentials are named, encrypted entities managed through the Settings UI and REST API. Each credential has: name, type (password, API key, connection string, certificate), encrypted value, and agent scope assignments. Credentials are referenced by name in task configurations. Values are never exposed in UI or API responses after creation — only masked placeholders are displayed. Credential changes are audit-logged.
+
+**Credential reference integrity** — when a credential is renamed, all task configurations referencing the credential by its previous name are automatically updated to reflect the new name within the same transaction. Credential deletion is blocked while active task configurations reference the credential; administrators must remove or reassign credential references before deletion. Referential integrity is enforced at the application level.
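+
+A sketch of the rename path under the stated transactional guarantee, using EF Core. Entity and property names (`Credentials`, `TaskConfigurations`, `CredentialName`) are assumptions for illustration:
+
+```csharp
+await using var tx = await db.Database.BeginTransactionAsync();
+
+var credential = await db.Credentials.SingleAsync(c => c.Name == oldName);
+credential.Name = newName;
+
+// Update every task configuration that references the credential by name,
+// inside the same transaction as the rename itself.
+foreach (var taskConfig in db.TaskConfigurations
+             .Where(t => t.CredentialName == oldName))
+{
+    taskConfig.CredentialName = newName;
+}
+
+await db.SaveChangesAsync();
+await tx.CommitAsync();
+```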
+
+### Per-Agent Configuration
+
+- Hierarchical configuration with ordered scope levels. In 1.0, two scope levels are active: global defaults and per-agent overrides. The configuration data model stores scope-level metadata per entry, supporting additional intermediate scope levels without schema changes.
+- Override inheritance and merge semantics.
+- Clear indication of overridden values in the UI.
+- Configuration supports typed policy documents (structured JSON payloads) in addition to simple key-value settings.
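+
+The merge of the two active scope levels behaves per key: a per-agent override wins where set, and unset keys inherit the global default. Setting names in this example are hypothetical:
+
+```json
+{
+  "globalDefaults": { "maxConcurrentTasks": 50, "heartbeatSeconds": 30 },
+  "agentOverrides": { "maxConcurrentTasks": 10 },
+  "effective":      { "maxConcurrentTasks": 10, "heartbeatSeconds": 30 }
+}
+```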
+
+### Configuration Versioning
+
+- All changes tracked: who changed what setting and when.
+- Change history browsable in the UI.
+- Configuration changes feed into the audit log.
+
+### Hot Reload
+
+- Configuration changes take effect without restart where feasible. Action handler configuration updates (enable/disable, parameter defaults) are hot-reloaded. New handler binaries require a restart for assembly discovery.
+- Agent notification via gRPC push.
+- Agents cache configuration locally for offline operation.
+- Version-based delta synchronization on agent reconnect.
+
+A complete configuration reference documenting all parameters, defaults, valid ranges, and descriptions is published in the platform documentation.
+
+---
+
+## 12. Data Management & Retention
+
+### Retention Policies
+
+Configurable retention policies control database growth:
+
+- **Per-entity-type retention rules** — separate retention windows for workflow runs, job output, variable versions, and audit logs. Each entity type is configured independently with time-based thresholds.
+- **Default retention periods** — workflow runs: 180 days; audit logs: 365 days. All defaults are configurable. Retention periods accept any non-negative value, with no upper bound. A retention period of 0 deletes eligible records on the next retention sweep cycle (minimum sweep interval: 15 minutes). The UI displays a confirmation warning when a retention period is set below 7 days.
+- **Separate audit log retention** — audit logs have a distinct retention window, defaulting to longer than operational data, for compliance requirements.
+- **Retention registry** — retention policies are registered per entity type. The cleanup pipeline evaluates all registered policies independently.
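+
+A sketch of what a per-entity-type retention policy document could look like. The shape and key names are illustrative; the day values shown are the documented defaults for workflow runs, audit logs, and the DLQ:
+
+```json
+{
+  "retention": {
+    "workflowRuns": { "days": 180 },
+    "auditLogs": { "days": 365 },
+    "deadLetterQueue": { "days": 30 }
+  }
+}
+```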
+
+### Retention Execution
+
+- **Background cleanup** — a hosted service performs periodic retention sweeps at a configurable interval. The sweep interval has a hard minimum of 15 minutes to prevent excessive deletion cycles.
+- **Manual trigger** — administrators may trigger an immediate retention sweep, which returns a summary of what was deleted.
+- **Dry-run mode** — preview what would be deleted without committing.
+- **Audit-logged** — all retention deletions are recorded in the audit log. When audit log entries are deleted by a retention sweep, the deletion event is recorded as a new audit entry summarizing the count and date range of deleted records.
+- **Active work exemption** — runs in non-terminal states (`Pending`, `Queued`, `Running`, `Paused`) are exempt from retention sweeps regardless of age. These runs become eligible for retention only after reaching a terminal state. Trigger events in the dead-letter queue (DLQ) are subject to a separate configurable DLQ retention period (default: 30 days).
+
+### Database Strategy
+
+- **PostgreSQL and SQLite** — dual-provider EF Core architecture with a shared entity model. API and Server default to PostgreSQL; Agent defaults to SQLite.
+- **Separate migration paths** — EF Core migrations maintained separately per database provider.
+- **Feature parity** — both providers pass the full test suite. Concurrency and performance characteristics differ by provider.
+- **Additive schema evolution** — the schema supports additive entity types without requiring modifications to existing entity configurations or migration histories.
+
+---
+
+## 13. REST API
+
+### Versioned API
+
+- All REST endpoints are served under `/api/v1/`. The version prefix is part of the public contract.
+- Existing endpoint contracts remain stable within a major version. New optional fields on existing response DTOs do not require a version increment.
+- Removal or modification of existing fields requires a new API version.
+
+### Endpoint Organization
+
+Endpoints are organized by resource domain:
+
+| Domain | Description |
+|--------|-------------|
+| **Workflows** | CRUD, steps, dependencies, versioning, run management, variable management, approval management, dashboard |
+| **Tasks** | CRUD, cloning, execution |
+| **Schedules** | CRUD, trigger association, holiday calendar management |
+| **Calendars** | Calendar CRUD, holiday rule management |
+| **Agents** | Registration, status, configuration, key rotation, capabilities |
+| **Jobs** | Run listing, output retrieval, bulk operations, status |
+| **Settings** | Configuration CRUD, notification channel management, credential lifecycle (create, read-masked, update, delete, scope-to-agent) |
+| **Users** | User management, role assignment, session management |
+| **Audit** | Audit log query and export |
+| **Triggers** | API trigger endpoints, trigger management |
+| **Dead Letter Queue** | DLQ entry listing, inspection, replay, discard, retention configuration |
+| **Diagnostics** | Health, status, capabilities |
+| **Notifications** | Channel configuration, subscription management |
+| **Retention** | Retention policy management, manual sweep trigger |
+| **Auth** | Authentication, API key management |
+| **Import/Export** | Entity and environment import/export |
+
+The endpoint organization supports additional resource domains, and platform capabilities may register additional endpoints, all under the same versioned API prefix.
+
+### Capabilities Discovery Endpoint
+
+- `GET /api/v1/capabilities` — returns the server version, active feature flags, registered permissions, and system configuration summary.
+- Clients use this endpoint for feature detection and conditional behavior.
+- The UI uses capabilities to conditionally render navigation and features.
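+
+An illustrative response, wrapped in the standard response envelope. Field names and values here are assumptions, not a contract:
+
+```json
+{
+  "data": {
+    "serverVersion": "1.0.0",
+    "featureFlags": ["workflowApprovals"],
+    "permissions": ["workflows.read", "workflows.execute"],
+    "configurationSummary": { "authMode": "jwt" }
+  },
+  "metadata": { "requestId": "…", "apiVersion": "v1" }
+}
+```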
+
+### OpenAPI Documentation
+
+- Auto-generated OpenAPI (Swagger) specification published for all versioned endpoints.
+- Interactive API documentation available at a configurable URL for development and integration use.
+
+### Authentication
+
+- **Bearer token (JWT)** — the Server issues short-lived JWTs carrying the user's identity and role claims. JWT lifetime: 15 minutes (configurable). Werkr.Server manages token renewal for browser sessions via sliding expiration. The API validates JWTs on every request. JWTs are used for browser-session-originated requests forwarded by the Server.
+- **API key** — header-based authentication for programmatic access (see §9 API Keys). API key authentication does not use JWTs.
+- Per-endpoint permission requirements enforced via policy-based authorization.
+
+### Rate Limiting
+
+- Per-key rate limits configurable.
+- Per-IP rate limits for unauthenticated endpoints.
+- Rate limit headers in responses.
+- Rate limiting infrastructure extends to gRPC endpoints.
+
+### CORS Policy
+
+- **Same-origin only** — cross-origin requests are rejected. The API does not serve cross-origin responses in 1.0. Server → API calls are server-side HTTP calls (not browser-originated) and are therefore not subject to CORS restrictions; the Server is configured with the API's address for this purpose. Users accessing the API directly (e.g., via the interactive Swagger UI) must target the API's origin directly.
+
+### Pagination & Filtering
+
+- Cursor-based pagination for list endpoints.
+- Consistent filtering and sorting parameters across all list endpoints.
+- Standard response envelope with consistent structure across all endpoints:
+
+```json
+{
+ "data": { },
+ "error": { "code": "", "message": "", "details": [] },
+ "pagination": { "cursor": "", "hasMore": true },
+ "metadata": { "requestId": "", "apiVersion": "" }
+}
+```
+
+`data` and `error` are mutually exclusive. Successful responses populate `data`; error responses populate `error`.
+
+**Pagination behavior** — the first page is requested by omitting the `cursor` parameter (or passing an empty value). The response includes a `cursor` value for the next page and `hasMore: true` if additional results exist. The final page is identified by `hasMore: false`; the `cursor` value in the final page response is null.
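+
+A client-side consumption loop following these rules, sketched in C# with assumed endpoint and DTO names (`/api/v1/jobs`, `PagedResponse`):
+
+```csharp
+string? cursor = null;
+do
+{
+    var url = "/api/v1/jobs" + (cursor is null
+        ? ""
+        : "?cursor=" + Uri.EscapeDataString(cursor));
+
+    // PagedResponse mirrors the response envelope; the DTO name is an assumption.
+    var page = await client.GetFromJsonAsync<PagedResponse>(url);
+
+    Process(page!.Data);
+
+    // hasMore: false marks the final page; its cursor is null.
+    cursor = page.Pagination.HasMore ? page.Pagination.Cursor : null;
+} while (cursor is not null);
+```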
+
+---
+
+## 14. Real-Time Communication
+
+### SignalR Architecture
+
+- The Blazor Server UI uses SignalR for real-time push updates (workflow run status, step progress, log streaming, in-app notifications).
+- The SignalR architecture uses independent hubs. Real-time communication patterns are isolated per hub.
+- Per-hub and per-message-type permission checks control access to real-time data.
+- Hub authorization aligns with the hierarchical permission model.
+
+---
+
+## 15. Observability
+
+- **Structured logging** — Serilog with console, file, and OpenTelemetry sinks. Structured log format with correlation IDs.
+- **OpenTelemetry** — metrics, distributed traces, and log export for integration with observability platforms.
+- **Health checks** — `/health` and `/alive` endpoints on every component for load balancer and orchestrator integration.
+- **.NET Aspire** — local development orchestration wiring up all components, databases, and observability with service discovery and resilience.
+
+---
+
+## 16. Platform & Deployment
+
+### Operating Systems
+
+| Component | Windows 11+ | Linux | macOS (Apple Silicon) |
+|-----------|:-----------:|:-----:|:--------------------:|
+| Server | x64, arm64 | x64, arm64 | arm64 |
+| API | x64, arm64 | x64, arm64 | arm64 |
+| Agent | x64, arm64 | x64, arm64 | arm64 |
+
+### Installers & Packaging
+
+| Format | Platforms | Notes |
+|--------|----------|-------|
+| **MSI** | Windows | WiX Toolset-based installers for Server, API, and Agent. |
+| **.pkg** | macOS | Platform-native installer. |
+| **.deb** | Debian / Ubuntu | Linux package distribution. |
+| **Portable archive** | All | Self-contained archive, no installer required. |
+| **Docker** | All | Container images with certificate provisioning. |
+
+Installer layouts support a module directory for optional agent modules delivered as separate packages or updates.
+
+### Database
+
+| Provider | Use Case |
+|----------|----------|
+| **PostgreSQL** | Recommended for API and Server in production. |
+| **SQLite** | Recommended for Agent. Suitable for single-machine deployments. |
+
+- Dual-provider EF Core architecture with configuration-selected provider at deployment time.
+- Feature parity across providers. Concurrency and performance characteristics differ by provider.
+- Full test suite runs on both providers.
+- **Backup and restore** of PostgreSQL and SQLite databases is outside the scope of the 1.0 platform. Deployment documentation covers database backup strategies.
+
+### Docker Support
+
+- Container images for Server, API, and Agent.
+- Certificate provisioning support.
+- Compose file for multi-container deployment.
+- Environment variable configuration.
+
+---
+
+## 17. Audit System
+
+### Audit Log Model
+
+- A unified audit log records all security-relevant and operational events: workflow edits, task execution, user management, configuration changes, agent registration, credential access, trigger configuration, approval decisions, retention operations, notification delivery, import/export operations, and authentication events (successful and failed login attempts, account lockouts, 2FA failures).
+- The audit log uses a typed event model — event types are registered by system components without schema changes.
+- Events carry a structured JSON details payload alongside a typed event identifier and source module tag.
+- All audit entries include the acting user (or system identity), timestamp, affected entity, and action performed. All audit log timestamps are stored in UTC.
+
+### Audit Log Retention
+
+- Audit logs have a separate retention window (default: 365 days, configurable), longer than operational data retention, for compliance requirements.
+
+---
+
+## 18. Performance Targets
+
+The 1.0 platform meets the following baseline performance characteristics:
+
+| Metric | Target |
+|--------|--------|
+| Workflow steps per DAG | ≥ 200 without UI degradation |
+| Concurrent steps per agent | ≥ 50 (baseline: low-overhead operations such as Delay; actual capacity varies by task workload) |
+| DAG render time (100 nodes) | < 2 seconds |
+| Real-time update latency (agent event → UI) | < 500 ms |
+| JSON import (max payload) | ≤ 10 seconds for 10 MB / 500 steps |
+| API response time (CRUD operations) | p95 < 200 ms |
+| Memory at idle | Server < 512 MB, API < 512 MB, Agent < 256 MB (with TaskExecution and DefaultActions active) |
+| Memory under baseline load (50 concurrent low-overhead steps per agent) | Agent < 512 MB. Peak memory under real workloads depends on task type and payload size; this target establishes a platform overhead baseline only. |
+
+---
+
+## 19. Testing Strategy
+
+- **Frameworks** — MSTest for .NET unit and integration tests, Vitest for frontend tests, bUnit for Blazor component tests.
+- **Coverage expectations** — comprehensive unit tests, full integration tests, and end-to-end tests across all major workflow features.
+- **Pre-release requirement** — the platform is validated with fuzz testing before the 1.0 release.
+- Both database providers (PostgreSQL and SQLite) pass the full test suite.
+
+---
+
+## 20. Browser Compatibility
+
+| Browser | Minimum Version |
+|---------|----------------|
+| Chrome / Chromium | 120+ |
+| Firefox | 120+ |
+| Edge | 120+ |
+| Safari | 17+ |
+
+Internet Explorer is not supported. Mobile browsers are not a primary target but should support read-only monitoring.
+
+### Accessibility
+
+- **WCAG 2.1 AA** compliance target.
+- Full keyboard navigation for all interactive elements, including authoring and monitoring workflows.
+ - Some areas of the site may not achieve full WCAG 2.1 compliance; each such area must provide a same-function alternative for every affected interaction.
+- Visible focus indicators.
+- Semantic HTML structure and accessible labeling for interactive elements.
+- Accessible status communication for live updates, approvals, and errors.
+- Sufficient color contrast (minimum 4.5:1 per WCAG 2.1 AA).
+- Non-color status indicators (icons/labels alongside color).
+- Reduced-motion-safe interaction patterns (respects `prefers-reduced-motion`).
+- Screen reader compatibility.
+- Focus management for modals and dialogs.
+
+---
+
+## 21. Licensing & Community
+
+- **MIT License** — the 1.0 Werkr Workflow Orchestration platform is open-source under the MIT license with zero licensing cost.
+- **GitHub-hosted** — issue templates for bugs, feature requests, and documentation improvements.
+- **Contributor License Agreement** — CLA process for external contributors.
+- **Security vulnerability reporting** — published responsible disclosure process.
+- **Documentation** — architecture guide, security overview, feature reference, development setup guide, user setup guides, and versioned API documentation published via DocFX.
diff --git a/docs/Architecture.md b/docs/Architecture.md
new file mode 100644
index 0000000..f4a221f
--- /dev/null
+++ b/docs/Architecture.md
@@ -0,0 +1,646 @@
+# Architecture
+
+This document describes the stable architectural boundaries of the Werkr project. It covers the system topology, communication model, and key design decisions at a conceptual level. For the definitive 1.0 featureset specification, see [1.0-Target-Featureset.md](1.0-Target-Featureset.md). For vulnerability reporting, see [SECURITY.md](SECURITY.md). For encryption, key management, and secret storage details, see the [Security Architecture](articles/SecurityOverview.md). For build, test, and run instructions, see [Development.md](Development.md). For class-level detail, see the [API documentation](https://docs.werkr.app/api/index.html).
+
+---
+
+## System Overview
+
+Werkr is a self-hosted workflow orchestration platform built on three primary components. The Server and API expose HTTPS endpoints; the API and Agent communicate over encrypted gRPC. After registration and initial heartbeat, agents maintain persistent gRPC connections to the API; the API pushes notifications and commands through these agent-initiated connections. No application-level polling is used for state synchronization.
+
+```mermaid
+flowchart TB
+ User["User (Browser)"]
+ Server["Werkr.Server
+(Blazor UI + Identity)"]
+ Api["Werkr.Api
+(REST API + gRPC Host)"]
+ Agent["Werkr.Agent
+(Task Execution)"]
+ DB_App[("Application DB
+(PostgreSQL or SQLite)")]
+ DB_Id[("Identity DB
+(PostgreSQL or SQLite)")]
+ DB_Agent[("Agent DB
+(PostgreSQL or SQLite)")]
+
+ User -- HTTPS --> Server
+ User -- HTTPS/REST --> Api
+ Server -- REST --> Api
+ Agent -- "gRPC (agent-initiated)" --> Api
+ Api --- DB_App
+ Server --- DB_Id
+ Agent --- DB_Agent
+```
+
+- **Werkr.Server** — The Blazor Server UI and identity provider. Handles user authentication (ASP.NET Identity with RBAC, TOTP 2FA, and WebAuthn passkeys), renders the management interface via Blazor Server and SignalR, and calls the API over REST. Owns the identity database. Has no direct communication with the Agent.
+- **Werkr.Api** — The central application API and workflow orchestrator. Owns the primary application database (tasks, schedules, workflows, triggers, job results, audit logs). Exposes versioned REST endpoints under `/api/v1/` to both end users and the Server. Hosts gRPC services for push-based communication with agents (schedule sync, job reporting, command dispatch, configuration push).
+- **Werkr.Agent** — The worker process that executes tasks on remote hosts. Uses a modular architecture with two built-in modules: **TaskExecution** (PowerShell and shell execution) and **DefaultActions** (built-in action handlers). Reports capabilities to the API during registration and via heartbeat. Maintains its own local database for cached state.
+
+---
+
+## Project Map
+
+| Project | Role |
+|---------|------|
+| `src/Werkr.Server/` | Blazor Server UI, ASP.NET Identity, SignalR hubs, graph-ui TypeScript DAG editor, user authentication and authorization |
+| `src/Werkr.Api/` | Versioned REST API, gRPC service host, workflow orchestration, trigger management, connection management |
+| `src/Werkr.Agent/` | Modular task execution engine, PowerShell host, shell executor, built-in action handlers, capability registration |
+| `src/Werkr.Core/` | Shared business logic — scheduling, workflows, trigger registry, condition evaluator, variable resolution, registration, cryptography, security |
+| `src/Werkr.Common/` | Shared models, all protobuf definitions (`Protos/`), auth policies, permission registration, rendering utilities |
+| `src/Werkr.Common.Configuration/` | Strongly-typed configuration classes for Server, Agent, and UI settings |
+| `src/Werkr.Data/` | EF Core database contexts (PostgreSQL + SQLite), entities, migrations, seeding, audit log entities, retention policies |
+| `src/Werkr.Data.Identity/` | ASP.NET Identity EF Core contexts, roles, permissions, API keys, session management, identity entities |
+| `src/Werkr.AppHost/` | .NET Aspire orchestrator for local development — wires up PostgreSQL, API, Agent, and Server |
+| `src/Werkr.ServiceDefaults/` | Aspire service defaults — OpenTelemetry, health checks, service discovery, resilience |
+| `src/Installer/Msi/` | WiX-based MSI installers and custom actions for Windows deployment |
+
+### Project Dependency Graph
+
+```
+Werkr.AppHost (Aspire orchestrator)
+├── Werkr.Server (Blazor UI)
+│ ├── Werkr.Common → Werkr.Common.Configuration
+│ └── Werkr.Data.Identity → Werkr.Data → Werkr.Common
+├── Werkr.Api (REST + gRPC host)
+│ ├── Werkr.Core → Werkr.Data → Werkr.Common → Werkr.Common.Configuration
+│ ├── Werkr.Common
+│ └── Werkr.Data
+└── Werkr.Agent (task execution worker)
+ ├── Werkr.Core
+ ├── Werkr.Common
+ └── Werkr.Data
+```
+
+All three apps also reference `Werkr.ServiceDefaults`.
+
+---
+
+## Communication Model
+
+Werkr uses two distinct communication protocols depending on which components are talking.
+
+| Path | Protocol | Purpose |
+|------|----------|---------|
+| **User → Server** | HTTPS | Browser sessions (Blazor Server + SignalR) |
+| **User → API** | HTTPS/REST | Direct REST API access |
+| **Server → API** | HTTPS/REST | Server calls API endpoints; Server is not aware of agents |
+| **Agent → API** | gRPC over TLS | All agent interaction — registration, schedule sync, job reporting, configuration push, command dispatch |
+| **Server → API** | SSE (`text/event-stream`) | Real-time workflow run event streaming; relayed to browser via SignalR |
+
+```mermaid
+flowchart LR
+ subgraph "HTTPS"
+ direction LR
+ User["User"] -- HTTPS --> Server
+ User -- "HTTPS/REST" --> Api["API"]
+ Server -- REST --> Api
+ end
+ subgraph "gRPC over TLS"
+ direction LR
+ Agent -- "persistent connection
+(agent-initiated)" --> Api2["API"]
+ Api2 -. "push: commands,
+config, schedules" .-> Agent
+ end
+```
+
+The Server has **no direct communication** with the Agent. All agent management flows through the API.
+
+### Push-Based Communication
+
+After agent registration and initial heartbeat, the primary communication pattern is **agent-initiated**: agents establish and maintain persistent gRPC connections to the API. The API pushes notifications and commands through these agent-initiated connections. Agent-hosted gRPC services (listed below as "API → Agent") operate over these persistent connections — they do not require inbound network connectivity to the agent.
+
+A limited set of features (e.g., server address rebroadcast after API address change) may require true API-initiated connections to the agent; if the agent is behind a NAT or firewall without inbound connectivity, these operations will fail and may require manual resolution.
+
+### Server-Sent Events (SSE)
+
+The API exposes SSE endpoints for real-time event streaming (e.g., workflow run job events at `/api/v1/workflows/runs/{runId}/stream`). The Server's `JobEventRelayService` subscribes to these SSE streams and relays events to connected browsers via SignalR hubs. This creates a three-hop real-time pipeline: Agent → gRPC → API → SSE → Server → SignalR → Browser.
+
+### HTTPS Endpoints
+
+**Server** hosts Blazor Server pages, ASP.NET Identity endpoints (login, 2FA, passkey management, user management), and SignalR hubs for real-time UI updates.
+
+**API** exposes versioned REST endpoints under `/api/v1/` organized by resource domain: Workflows, Tasks, Schedules, Calendars, Agents, Jobs, Settings, Users, Audit, Triggers, Dead Letter Queue, Diagnostics, Notifications, Retention, Auth, and Import/Export. Auto-generated OpenAPI (Swagger) documentation is published for all endpoints. See [1.0-Target-Featureset.md §13](1.0-Target-Featureset.md) for full REST API detail.
+
+### gRPC Services
+
+All gRPC communication is between the API and Agent only. After initial registration, every gRPC payload is wrapped in an `EncryptedEnvelope` — the inner protobuf message is serialized and encrypted with a shared symmetric key established during registration. A `key_id` field supports key rotation so the receiver can accept either the current or previous key during a configurable grace period. See [Security Architecture — Encrypted Envelope](articles/SecurityOverview.md#encrypted-envelope-grpc-payload-encryption) for detail.
+
+All protobuf definitions are in `src/Werkr.Common/Protos/`.
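+
+Conceptually, the envelope carries a key identifier plus opaque ciphertext — something like the fragment below. The `key_id` field is described above; the remaining field names and numbers are assumptions, and the definitive definition lives in `src/Werkr.Common/Protos/`:
+
+```protobuf
+message EncryptedEnvelope {
+  string key_id = 1;     // lets the receiver select the current or previous
+                         // shared key during the rotation grace period
+  bytes nonce = 2;       // per-message nonce for the symmetric cipher
+  bytes ciphertext = 3;  // the encrypted, serialized inner protobuf message
+}
+```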
+
+**API-hosted services** (Agent → API):
+- **AgentRegistration** — One-time agent registration handshake (see [Registration Flow](#registration-flow) below). Defined in `Registration.proto`.
+- **ScheduleSync** — Agent pulls assigned schedules and holiday dates. Defined in `ScheduleSync.proto`.
+- **JobReporting** — Agent reports completed job results with output previews. Defined in `JobReport.proto`.
+- **VariableService** — Variable management. Defined in `VariableService.proto`.
+- **WorkflowExecution** — Agent acknowledges workflow execution and reports trigger-fired notifications. *(Planned for 1.0)*
+
+**Agent-hosted services** (API → Agent, via agent-initiated persistent connection):
+- **ConnectionManagement** — Heartbeat with pending-approval state sync, server URL change notifications, shared key rotation. Defined in `ConnectionManagement.proto`.
+- **ScheduleInvalidation** — Push notifications when a schedule is modified or deleted, and `NotifyWorkflowDisabled` notifications when a workflow is disabled. Defined in `ScheduleInvalidation.proto`.
+- **OutputFetch** — Retrieves full job output logs from the agent on demand. Defined in `OutputFetch.proto`.
+- **OutputStreaming** — Streams action execution and shell/PowerShell execution logs in real time. Defined in `OutputStreaming.proto`.
+- **Configuration Synchronization** — Pushes configuration updates to agents. *(Planned for 1.0)*
+- **Approval Decision Push** — Notifies agents of approval gate decisions. *(Planned for 1.0)*
+
+gRPC services are independently registered. Adding new services does not require modifying existing registrations. Proto file organization follows domain-based namespace conventions.
+
+### gRPC Flow Control
+
+All gRPC services share a standard response pattern for backpressure signaling (throttle status, retry-after hints). High-frequency services (status reporting, job result submission) use bounded ingestion with an accept-queue-process pattern and configurable queue depths.
+
+---
+
+## Real-Time Communication
+
+The Blazor Server UI uses SignalR for real-time push updates — workflow run status, step progress, log streaming, and in-app notifications. Updates arrive live without polling. Update latency target: < 500 ms from event to UI.
+
+The SignalR architecture uses independent hubs with per-hub and per-message-type permission checks aligned with the hierarchical permission model. The UI degrades gracefully when the SignalR connection drops, with a visible reconnection indicator.
+
+See [1.0-Target-Featureset.md §14](1.0-Target-Featureset.md) for full detail.
+
+---
+
+## Agent Architecture
+
+### Module Architecture
+
+The agent supports a modular architecture with a defined lifecycle contract:
+
+- **Module lifecycle** — modules implement a standard interface with `Initialize()`, `Configure()`, `Start()`, and `Stop()` methods. `Initialize()` registers dependencies and services. `Configure()` applies configuration. `Start()` begins runtime operations. `Stop()` performs resource cleanup during shutdown.
+- **Module isolation** — each module manages its own lifecycle without affecting other modules or core agent functionality. Module-specific database tables use a schema prefix (e.g., `modulename_*`) to avoid conflicts.
+- **Module activation** — configuration-driven activation of extension modules. Modules receive configuration from the centralized configuration system via the encrypted gRPC channel.
+- **Core independence** — the core agent runtime operates independently of extension modules. Built-in modules are foundational and always loaded.
+
+```mermaid
+flowchart TB
+ subgraph Agent["Werkr.Agent"]
+ Core["Core Runtime
+(lifecycle, gRPC, config)"]
+ subgraph Modules["Modules"]
+ TE["TaskExecution
+(always active)
+PowerShell + Shell"]
+ DA["DefaultActions
+(active by default)
+Built-in Action Handlers"]
+ end
+ Core --> TE
+ Core --> DA
+ end
+ Api["Werkr.Api"] -- "gRPC push
+(encrypted)" --> Core
+ Core -- "gRPC report" --> Api
+```
+
+### 1.0 Modules
+
+| Module | Classification | Description |
+|--------|---------------|-------------|
+| **TaskExecution** | Built-in, always active | Core task execution engine for PowerShell Script, PowerShell Command, Shell Script, and Shell Command task types. Cannot be deactivated. |
+| **DefaultActions** | Built-in, active by default | Built-in action handlers for non-script/command task types (file operations, HTTP requests, process management, etc.). Can be deactivated by administrators — when deactivated, only script and command task types are available. |
+
+### Action Handler Discovery
+
+Action handlers implementing the `IActionHandler` interface (in `Werkr.Core.Operators`) are automatically discovered and registered at startup via assembly scanning. Handlers are organized into categories for the step palette and the API. See [1.0-Target-Featureset.md §3](1.0-Target-Featureset.md) for the full action handler list.
+
+### Capability Registration
+
+Agents report their capabilities (supported task types, installed action handlers, OS platform, architecture, agent version) to the API during registration and via periodic heartbeat. The API uses reported capabilities for routing decisions and validates that a target agent supports the required capabilities before dispatching work. Capabilities are displayed on the agent dashboard.
+
+### Agent Version Compatibility
+
+The API tracks a minimum compatible agent version. Agents below the minimum are rejected at registration with a descriptive error. During rolling upgrades, agents running the previous minor version remain compatible with the current API version. Agent updates are managed manually by administrators.
+
+### Resource Management
+
+A **capacity unit** represents one actively executing workflow task. Background operations (configuration sync, schedule evaluation) do not consume capacity units.
+
+- Each agent has a configurable maximum concurrent task limit.
+- When all matched agents are at capacity or offline, queued work waits with visibility into the wait reason.
+- A configurable maximum output size per task prevents unbounded output growth.
+
+---
+
+## Registration Flow
+
+Agent registration uses an admin-carried bundle model with hybrid asymmetric + symmetric encryption:
+
+1. An administrator creates a **registration bundle** on the Server. The bundle contains a correlation token and the API's public key, encrypted with an admin-supplied password. Bundles have a configurable expiration window.
+2. The administrator transfers the bundle to the Agent (out-of-band).
+3. The Agent decrypts the bundle, generates its own RSA-4096 key pair, and calls the `RegisterAgent` RPC on the API. All registration fields (agent URL, name, bundle ID, public key) are protected in a single encrypted envelope. A non-secret hash-based lookup prevents leaking registration data.
+4. The API validates the bundle correlation token, decrypts the Agent's public key, generates a shared symmetric key, and returns it hybrid-encrypted with the Agent's public key.
+5. Both sides store the shared key. All subsequent gRPC payloads use `EncryptedEnvelope` with this shared key.
+
+Key rotation is supported via the `RotateSharedKey` RPC with a configurable grace period (default: 5 minutes) during which both current and previous keys are valid.
+
+For implementation detail, see `src/Werkr.Core/Registration/` and the protobuf definitions in `src/Werkr.Common/Protos/Registration.proto`. For the full registration protocol, see [Security Architecture — Agent Registration](articles/SecurityOverview.md#agent-registration).
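+
+The five steps above can be sketched as a sequence diagram:
+
+```mermaid
+sequenceDiagram
+    participant Admin
+    participant Server as Werkr.Server
+    participant Api as Werkr.Api
+    participant Agent as Werkr.Agent
+    Admin->>Server: Create registration bundle (password-encrypted)
+    Admin->>Agent: Transfer bundle out-of-band
+    Agent->>Agent: Decrypt bundle, generate RSA-4096 key pair
+    Agent->>Api: RegisterAgent RPC (encrypted envelope)
+    Api->>Api: Validate correlation token, generate shared symmetric key
+    Api-->>Agent: Shared key (hybrid-encrypted with Agent public key)
+    Note over Api,Agent: All subsequent gRPC payloads use EncryptedEnvelope
+```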
+
+---
+
+## Task Engine
+
+The task engine defines, stores, validates, and executes individual units of work on agents.
+
+- **Five task types** — Action (built-in handlers, no scripting required), PowerShell Script, PowerShell Command, Shell Script, Shell Command. *Script* types reference an executable file on disk. *Command* types are file-less inline executions.
+- **Task versioning** — immutable task versions created on each save. Steps in a workflow reference a specific task version (snapshot binding).
+- **Embedded PowerShell host** — full capture of all PowerShell output streams (output, error, warning, verbose, debug, information), script-level parameter passing, exit code capture.
+- **Native shell execution** — configurable shell per agent (default: `cmd.exe` on Windows, `/bin/sh` on Linux/macOS). Variable escaping per target shell's quoting rules.
+- **Maximum run duration** — tasks exceeding their configured time limit (default: 1 hour) are terminated.
+
+### Step-Level Error Handling
+
+Each workflow step supports a configurable error handling strategy:
+
+| Strategy | Behavior |
+|----------|----------|
+| **Fail Workflow** | Step failure fails the entire workflow (default). |
+| **Skip** | Mark step as skipped; continue to the next step. |
+| **Continue** | Mark step as failed; continue workflow execution to non-dependent downstream steps. |
+| **Run Error Handler** | Exhaust retry attempts, then execute a designated error handler. If handler succeeds, step is recovered. |
+| **Remediate Before Retry** | Execute error handler immediately on failure, before retry attempts begin. |
+
+Retry policies support configurable retry count, backoff strategy (fixed, linear, exponential), initial delay, maximum delay, and optional retry conditions.
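+
+As an illustrative sketch only — the property names below are hypothetical, not the actual schema — an exponential policy with a 5-second initial delay and a 60-second cap would yield delays of 5 s, 10 s, 20 s, 40 s, then 60 s:
+
+```json
+{
+  "retryCount": 5,
+  "backoffStrategy": "exponential",
+  "initialDelaySeconds": 5,
+  "maxDelaySeconds": 60
+}
+```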
+
+See [1.0-Target-Featureset.md §3](1.0-Target-Featureset.md) for full task engine detail.
+
+---
+
+## Scheduling & Triggers
+
+### Unified Trigger Registry
+
+Werkr uses a unified trigger registry. All trigger types share a common definition, configuration, and management interface. Trigger *evaluation* occurs at different system layers depending on type.
+
+| Trigger Type | Evaluation Layer | Description |
+|-------------|-----------------|-------------|
+| **DateTime** | Agent | Execute at a specific date and time. |
+| **Interval / Cyclical** | Agent | Daily, weekly, monthly recurrence with configurable intervals. |
+| **Cron Expression** | Agent | Standard cron expression syntax. |
+| **File Monitor** | Agent | Persistent trigger — watches a directory for file events. |
+| **API** | API | Trigger via authenticated REST API call with payload injection. |
+| **Workflow Completion** | API | Trigger when a specified workflow reaches a terminal state. |
+| **Manual** | API | Execute on demand from the UI or API. |
+
+Trigger types are registered independently. The registry design supports adding new types without modifying existing implementations. When a trigger fires, context data from the trigger source is injected into the workflow run as input variables.
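+
+For example, a standard five-field cron expression such as `0 2 * * 1-5` (minute, hour, day-of-month, month, day-of-week) fires at 02:00 on Monday through Friday. A trigger definition might be sketched as follows — the field names are illustrative, not the actual schema:
+
+```json
+{
+  "type": "CronExpression",
+  "expression": "0 2 * * 1-5",
+  "workflowId": "...",
+  "versionBinding": "Latest"
+}
+```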
+
+### Trigger-Workflow Version Binding
+
+Triggers have a version binding mode: **Latest** (default, always executes the latest workflow version) or **Pinned** (executes a specific workflow version).
+
+### Schedule Configuration
+
+Schedules support daily, weekly, and monthly recurrence patterns with repeat intervals, start/end time windows, and time zone awareness. A **Holiday Calendar** system allows schedules to skip or shift occurrences on configured holidays, with audit logging for suppressed occurrences. Calendar and holiday data is synchronized to agents via the gRPC schedule synchronization service.
+
+See `src/Werkr.Core/Scheduling/` for the schedule calculator, holiday date service, and occurrence result types. See [1.0-Target-Featureset.md §4](1.0-Target-Featureset.md) for full scheduling and trigger detail.
+
+---
+
+## Workflow Engine
+
+The workflow engine orchestrates multi-step automation as directed acyclic graphs (DAGs).
+
+### DAG Model
+
+Workflows are directed acyclic graphs with topological ordering. Steps declare dependencies on other steps. Cycle detection occurs at save time and at runtime. A maximum workflow step count is enforced; the performance target is ≥ 200 steps without UI degradation.
+
+Per-workflow concurrent run limits are configurable (default: unlimited). When the limit is reached, new trigger events are queued in FIFO order. Overflow events beyond the configurable queue depth are persisted to a dead-letter queue (DLQ) for administrative review.
+
+### State Machines
+
+**Step states** — Pending, Queued, Waiting for Approval, Running, Succeeded, Failed, Skipped, Cancelled, Recovered, Upstream Failed. Terminal states: Succeeded, Failed, Skipped, Cancelled, Recovered, Upstream Failed.
+
+**Run states** — Pending, Queued, Running, Paused, Succeeded, Failed, Cancelled. Terminal states: Succeeded, Failed, Cancelled.
+
+During error handler execution and retry cycles, a step remains in `Running`. The `Failed` terminal state is assigned only after all error handling and retry logic is exhausted. See [1.0-Target-Featureset.md §5](1.0-Target-Featureset.md) for full state transition tables.
+
+### Composite Nodes
+
+Four composite node types provide iteration, looping, and conditional control flow. Each composite node encapsulates a nested **child workflow** — the outer DAG sees a single node:
+
+| Type | Behavior |
+|------|----------|
+| **ForEach** | Iterates over a collection variable. Supports sequential and parallel execution modes. |
+| **While** | Evaluates a condition before each iteration; continues while true. |
+| **Do** | Evaluates a condition after each iteration; always executes at least once. |
+| **Switch** | Evaluates an expression against ordered case conditions; routes to exactly one matching branch. Handles all conditional branching (if/else, else-if, multi-way). |
+
+Child workflows are version-bound to the parent. Variable scoping enforces isolation between parent and child — nested composite nodes cannot access grandparent variables unless explicitly mapped through the intermediate child.
+
+### Workflow Variables
+
+Inter-step data passing uses a provider-based variable resolution chain registered via dependency injection at startup.
+
+- **Four namespaces** — `step`, `workflow`, `trigger`, `system`. All variables must be accessed by namespace explicitly (`{{namespace.path}}`).
+- **Types** — string, number, boolean, null, collection (ordered list).
+- **Producer/consumer contracts** — steps declare which workflow variables they produce (write to) and consume (read from), creating explicit data flow contracts.
+- **Output parameters** — workflow-level output variables are published on completion, enabling data transfer between chained workflows via workflow completion triggers.
+- **Log-redaction flag** — variables flagged as "redact from logs" are automatically replaced with `[REDACTED]` in all execution output.
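+
+Putting the namespace rule together, a step parameter might reference variables like this (the specific variable names are made up for illustration; only the `{{namespace.path}}` syntax and the four namespaces are defined above):
+
+```text
+Copy {{trigger.filePath}} to {{workflow.archiveDir}}/{{system.runId}}.bak
+```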
+
+### Re-Execution
+
+- **Retry from failed step** — resume a failed run from the point of failure. Completed outputs preserved.
+- **Replay** — re-execute all steps from the beginning using the original run's workflow version and inputs.
+- **Re-run with modified inputs** — new run of the same workflow version with optionally modified input variables.
+
+### Workflow State Durability
+
+Running workflow state is persisted to the database. Incomplete runs are recovered on service startup — completed steps are not re-executed. The platform provides **at-least-once** execution semantics for steps interrupted during execution.
+
+See `src/Werkr.Core/Workflows/` for the executor, condition evaluator, and run tracker. See [1.0-Target-Featureset.md §5](1.0-Target-Featureset.md) for full workflow engine detail.
+
+---
+
+## Database Strategy
+
+Werkr supports both **PostgreSQL** and **SQLite** as interchangeable database providers. Any component can use either provider, selected at deployment time via configuration. The default configuration uses PostgreSQL for the API and Server, and SQLite for the Agent.
+
+The data layer is organized into two database contexts:
+
+- **Application database** (`WerkrDbContext`) — Tasks, schedules, workflows, triggers, job results, holiday calendars, audit logs, and retention policies. The API and Agent each use their own instance. The Agent uses a subset of the same schema to cache schedules and local state. Managed by `PostgresWerkrDbContext` or `SqliteWerkrDbContext` in `src/Werkr.Data/`.
+- **Identity database** (`WerkrIdentityDbContext`) — Users, roles, permissions, API keys, and session data (ASP.NET Identity). Used by the Server. Managed by `PostgresWerkrIdentityDbContext` or `SqliteWerkrIdentityDbContext` in `src/Werkr.Data.Identity/`.
+
+Both PostgreSQL and SQLite context classes share a common base class and entity model. Provider-specific subclasses handle migration paths and naming conventions (snake_case for PostgreSQL via `EFCore.NamingConventions`).
+
+EF Core migrations are maintained separately per provider:
+- `src/Werkr.Data/Migrations/Postgres/`
+- `src/Werkr.Data/Migrations/Sqlite/`
+- `src/Werkr.Data.Identity/Migrations/Postgres/`
+- `src/Werkr.Data.Identity/Migrations/Sqlite/`
+
+### Module Database Tables
+
+Each agent module provides its own `DbContext` with an independent migration history. Module-specific database tables use a schema prefix (e.g., `modulename_*`) to avoid conflicts with core agent tables or other modules. Module uninstallation does not automatically drop tables — a separate administrative cleanup tool is provided.
+
+### Schema Evolution
+
+The schema supports additive entity types without requiring modifications to existing entity configurations or migration histories.
+
+### Data Retention
+
+Configurable retention policies control database growth with per-entity-type retention windows (workflow runs: 180 days default, audit logs: 365 days default). A background hosted service performs periodic retention sweeps. Active runs in non-terminal states are exempt from retention regardless of age. See [1.0-Target-Featureset.md §12](1.0-Target-Featureset.md) for full retention detail.
+
+---
+
+## Security Model
+
+Security is layered throughout the system. This section provides an architectural overview of each layer and its role. For vulnerability reporting, see [SECURITY.md](SECURITY.md). For full implementation detail on each layer — cryptographic primitives, registration flow, envelope encryption, authentication schemes, authorization, secret storage, agent-side controls, and compliance alignment — see the [Security Architecture](articles/SecurityOverview.md).
+
+```mermaid
+flowchart TB
+ subgraph Transport["Transport Security"]
+ TLS["TLS on all connections
+ URL scheme validation"]
+ end
+ subgraph AppEncrypt["Application-Layer Encryption"]
+ Envelope["EncryptedEnvelope
+ AES-256-GCM
+ (gRPC payloads)"]
+ DBEncrypt["Column-Level Encryption
+ AES-256-GCM
+ (data at rest)"]
+ end
+ subgraph AuthN["Authentication"]
+ JWT["JWT Bearer Tokens"]
+ Cookie["Cookie Auth
+ (browser sessions)"]
+ Passkey["WebAuthn Passkeys"]
+ TOTP["TOTP 2FA"]
+ APIKey["API Keys"]
+ AgentAuth["gRPC Agent Auth
+ (shared key)"]
+ end
+ subgraph AuthZ["Authorization"]
+ RBAC["RBAC
+resource:action permissions
+Policy-based enforcement"]
+ end
+ subgraph DataProt["Data Protection"]
+ SecretStore["Platform-Native Secret Storage"]
+ Redaction["Sensitive Data Redaction"]
+ VarEscape["Variable Escaping"]
+ end
+ subgraph AgentCtrl["Agent-Side Controls"]
+ PathAllow["Path Allowlisting"]
+ URLAllow["Outbound URL Allowlisting"]
+ PrivNet["Private Network Protection"]
+ FileMon["File Monitor Security"]
+ end
+
+ Transport --> AppEncrypt
+ Transport --> AuthN
+ AuthN --> AuthZ
+ AuthZ --> DataProt
+ DataProt --> AgentCtrl
+```
+
+### Transport
+
+All connections (browser → Server, Server → API, API → Agent) require HTTPS/TLS. URL scheme validation is enforced at registration, channel creation, and gRPC channel construction. HTTP URLs are explicitly rejected. See [Security Architecture — Transport Security](articles/SecurityOverview.md#transport-security).
+
+### Payload Encryption
+
+Every gRPC payload after registration is encrypted inside an `EncryptedEnvelope` using AES-256-GCM with a shared symmetric key. The envelope supports arbitrary inner payload types, enabling new gRPC services to use the same encryption without modifying the envelope contract. See [Security Architecture — Encrypted Envelope](articles/SecurityOverview.md#encrypted-envelope-grpc-payload-encryption).
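+
+Conceptually — the actual message lives in the project's proto definitions, and the field names below are guesses for illustration only — such an envelope carries the AES-GCM inputs alongside the opaque ciphertext:
+
+```protobuf
+// Illustrative sketch — field names are guesses, not the real contract.
+message EncryptedEnvelope {
+  string payload_type = 1; // identifies the inner message type
+  bytes nonce = 2;         // 96-bit AES-GCM nonce, unique per message
+  bytes ciphertext = 3;    // inner payload encrypted with the shared key
+  bytes tag = 4;           // 128-bit GCM authentication tag
+}
+```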
+
+### Authentication
+
+Multiple authentication schemes depending on caller and context:
+
+| Scheme | Use Case |
+|--------|----------|
+| **JWT bearer tokens** | Browser-session-originated requests forwarded by the Server. 15-minute lifetime with sliding expiration. |
+| **Cookie authentication** | Interactive browser sessions with sliding expiration. |
+| **WebAuthn/FIDO2 passkeys** | Primary (passwordless) or second-factor authentication. |
+| **TOTP 2FA** | Time-based one-time passwords with recovery codes. Enrollment enforceable by administrators. |
+| **API keys** | Programmatic access for CI/CD and integrations. Permission-scoped, rate-limited. |
+| **gRPC agent auth** | Agents authenticate via registered shared keys with constant-time comparison. |
+
+The password policy aligns with NIST SP 800-63B (≥ 12 characters, no composition rules, password history enforcement), and login attempts are rate-limited per IP. See [Security Architecture — Authentication](articles/SecurityOverview.md#authentication), [Password Policy](articles/SecurityOverview.md#password-policy), [Two-Factor Authentication](articles/SecurityOverview.md#two-factor-authentication), [API Keys](articles/SecurityOverview.md#api-keys), and [gRPC Agent Authentication](articles/SecurityOverview.md#grpc-agent-authentication).
+
+### Authorization
+
+Permission-based policy authorization enforced on every API endpoint and UI page. Permissions use a hierarchical `resource:action` naming convention (e.g., `workflows:execute`, `agents:manage`). Permissions are registered at application startup.
+
+Three non-deletable built-in roles: **Admin** (all permissions), **Operator** (create, read, update, execute), **Viewer** (read-only). Administrators create custom roles with fine-grained permissions via the role management UI. Per-workflow execution permissions enable granular control over who can trigger specific automations.
+
+See [Security Architecture — Authorization (RBAC)](articles/SecurityOverview.md#authorization-rbac).
+
+### Auth Forwarding & Service Identity
+
+UI-originated API calls carry the authenticated user's identity and are authorized at the user's permission level. Trigger-initiated workflow execution uses a system service identity. See [Security Architecture — Auth Forwarding & Service Identity](articles/SecurityOverview.md#auth-forwarding--service-identity).
+
+### Agent Security
+
+- **Registration** — admin-bundle model with hybrid RSA-4096 + AES-256-GCM encryption. See [Registration Flow](#registration-flow) and [Security Architecture — Agent Registration](articles/SecurityOverview.md#agent-registration).
+- **Key rotation** — periodic shared key rotation with grace period. See [Security Architecture — Key Rotation](articles/SecurityOverview.md#key-rotation).
+- **Path allowlisting** — deny-all default posture; agents validate all file paths against a configured allowlist with canonical path resolution, symlink resolution, and traversal prevention. See [Security Architecture — Path Allowlisting](articles/SecurityOverview.md#path-allowlisting-agent).
+- **Outbound request controls** — URL allowlisting, private network protection (RFC 1918/link-local/loopback blocked by default), DNS rebinding protection. See [Security Architecture — Outbound Request Controls](articles/SecurityOverview.md#outbound-request-controls).
+- **File monitor security** — path validation, debounce, circuit breaker, watch limits. See [Security Architecture — File Monitoring Security](articles/SecurityOverview.md#file-monitoring-security).
+- **API trigger security** — per-workflow rate limiting, request validation, cycle detection. See [Security Architecture — API Trigger Security](articles/SecurityOverview.md#api-trigger-security).
+
+### Data Protection
+
+- **Database encryption at rest** — column-level AES-256-GCM for credentials, variable values, connection strings, API key hashes. Platform-native key management. Zero-downtime key rotation. See [Security Architecture — Database Encryption at Rest](articles/SecurityOverview.md#database-encryption-at-rest).
+- **Secret storage** — OS-native stores per platform (DPAPI on Windows, Keychain on macOS, protected file on Linux). See [Security Architecture — Secret Storage](articles/SecurityOverview.md#secret-storage).
+- **Sensitive data redaction** — variable-level redaction flags and configurable regex patterns mask sensitive data in execution output. See [Security Architecture — Sensitive Data Redaction](articles/SecurityOverview.md#sensitive-data-redaction).
+- **Variable escaping** — workflow variables are escaped per target execution context to prevent injection. See [Security Architecture — Variable Escaping](articles/SecurityOverview.md#variable-escaping).
+
+### Session Management & Content Security Policy
+
+Administrators can view and revoke active user sessions. Default maximum session count per user: 5. The Blazor Server UI enforces Content Security Policy (CSP) headers. See [Security Architecture — User Management](articles/SecurityOverview.md#user-management) and [Content Security Policy](articles/SecurityOverview.md#content-security-policy).
+
+### Compliance
+
+The security architecture aligns with OWASP Top 10 mitigations and NIST SP 800-63B authentication guidelines. See [Security Architecture — Compliance Alignment](articles/SecurityOverview.md#compliance-alignment). See [1.0-Target-Featureset.md §9](1.0-Target-Featureset.md) for the feature-level security requirements.
+
+---
+
+## REST API
+
+All REST endpoints are served under `/api/v1/`. The version prefix is part of the public contract. Existing endpoint contracts remain stable within a major version.
+
+### Endpoint Organization
+
+> This table describes the target v1.0 API surface. Some domains are fully implemented; others are planned and will be added before the 1.0 release. See [1.0-Target-Featureset.md](1.0-Target-Featureset.md) for the full specification.
+
+| Domain | Description |
+|--------|-------------|
+| **Workflows** | CRUD, steps, dependencies, versioning, run management, variable management, approval management |
+| **Tasks** | CRUD, cloning, execution |
+| **Schedules** | CRUD, trigger association, holiday calendar management |
+| **Calendars** | Calendar CRUD, holiday rule management |
+| **Agents** | Registration, status, configuration, key rotation, capabilities |
+| **Jobs** | Run listing, output retrieval, bulk operations |
+| **Settings** | Configuration CRUD, notification channels, credential lifecycle |
+| **Users** | User management, role assignment, session management |
+| **Audit** | Audit log query and export |
+| **Triggers** | API trigger endpoints, trigger management |
+| **Dead Letter Queue** | DLQ entry listing, inspection, replay, discard |
+| **Diagnostics** | Health, status, capabilities |
+| **Notifications** | Channel configuration, subscription management |
+| **Retention** | Retention policy management, manual sweep trigger |
+| **Auth** | Authentication, API key management |
+| **Import/Export** | Entity and environment import/export |
+
+### API Design
+
+- **Standard response envelope** — consistent structure across all endpoints:
+ ```json
+ {
+ "data": { },
+ "error": { "code": "", "message": "", "details": [] },
+ "pagination": { "cursor": "", "hasMore": true },
+ "metadata": { "requestId": "", "apiVersion": "" }
+ }
+ ```
+- **Cursor-based pagination** for all list endpoints.
+- **Authentication** — JWT bearer tokens for browser-originated requests, API keys for programmatic access.
+- **Rate limiting** — per-key and per-IP rate limits with standard rate limit headers.
+- **CORS** — same-origin only in 1.0. Server → API calls are server-side HTTP (not browser-originated).
+- **OpenAPI/Swagger** — auto-generated specification published for all versioned endpoints.
+- **Capabilities discovery** — `GET /api/v1/capabilities` returns server version, active feature flags, registered permissions, and system configuration summary.
+
+See [1.0-Target-Featureset.md §13](1.0-Target-Featureset.md) for full REST API detail.
+
+---
+
+## Notification System
+
+Notifications are delivered through a channel-based abstraction. Each channel type implements a common delivery interface:
+
+| Channel | Description |
+|---------|-------------|
+| **Email** | SMTP-based with configurable sender, subject templates, and HTML body. |
+| **Webhook** | HTTP POST with JSON payload. Supports header-based and HMAC-SHA-512 signature authentication. |
+| **In-App** | SignalR-based browser notifications, persisted to database for offline delivery. |
+
+Channels are configured once at the platform level. The channel delivery interface is a standalone service that accepts delivery requests from any system component.
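+
+For the webhook channel, HMAC signing typically works by computing HMAC-SHA-512 over the raw request body with the channel's shared secret and sending the digest in a request header. The header name and payload shape below are illustrative, not the actual contract:
+
+```http
+POST /hooks/werkr HTTP/1.1
+Content-Type: application/json
+X-Signature: hmac-sha512=3f1a... (HMAC-SHA-512 of the body, hex-encoded)
+
+{"event": "workflow.failed", "workflowId": "...", "runId": "..."}
+```
+
+The receiver recomputes the HMAC over the exact bytes of the body and compares it to the header value using a constant-time comparison.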
+
+### Subscription Model
+
+- **Per-workflow opt-in** — each workflow can opt into notifications for failure, success, or completion events.
+- **Tag-based subscriptions** — subscribe to events for all workflows matching a tag.
+- **Per-user preferences** — users configure preferred event types and delivery channels.
+- **Event categories** — Workflow execution, Approval, Schedule, Security, System. Categories are registered at application startup.
+
+Failed deliveries are retried with configurable backoff. The retry queue is persisted, surviving service restart. See [1.0-Target-Featureset.md §8](1.0-Target-Featureset.md) for full detail.
+
+---
+
+## Centralized Configuration
+
+- **Database-backed settings** — runtime configuration stored in the application database for all non-startup settings.
+- **Minimal file-based bootstrap** — database connection string, Kestrel binding, and log level. Startup secrets stored in OS-native credential storage.
+- **Hierarchical configuration** — ordered scope levels: global defaults and per-agent overrides. The data model supports additional intermediate scope levels without schema changes.
+- **Hot reload** — configuration changes take effect without restart where feasible. Agents are notified via gRPC push and cache configuration locally for offline operation.
+- **Encrypted credential storage** — credentials (SMTP passwords, API keys, connection strings) encrypted at rest using column-level AES-256-GCM. Per-agent credential scoping — agents only receive credentials assigned to them.
+- **Configuration versioning** — all changes tracked with who/what/when audit trail.
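+
+A minimal file-based bootstrap for the items above might look like the following sketch — the key names follow common ASP.NET Core and Serilog conventions, but the actual file layout may differ, and per the secret-storage rule no passwords appear in the connection string:
+
+```json
+{
+  "ConnectionStrings": {
+    "werkrdb": "Host=localhost;Database=werkrdb;Username=werkr"
+  },
+  "Kestrel": {
+    "Endpoints": { "Https": { "Url": "https://0.0.0.0:5001" } }
+  },
+  "Serilog": { "MinimumLevel": "Information" }
+}
+```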
+
+See [1.0-Target-Featureset.md §11](1.0-Target-Featureset.md) for full detail.
+
+---
+
+## Audit System
+
+A unified audit log records all security-relevant and operational events: workflow edits, task execution, user management, configuration changes, agent registration, credential access, trigger configuration, approval decisions, retention operations, notification delivery, import/export operations, and authentication events.
+
+- **Typed event model** — event types are registered by system components at startup without schema changes.
+- **Structured payload** — each entry carries a typed event identifier, source module tag, structured JSON details, acting user (or system identity), timestamp, and affected entity.
+- **All timestamps in UTC.**
+- **Separate retention window** — default 365 days, configurable independently from operational data retention.
+
+See [1.0-Target-Featureset.md §17](1.0-Target-Featureset.md) for full detail.
+
+---
+
+## Observability
+
+- **Structured logging** — Serilog with console, file, and OpenTelemetry sinks. Structured log format with correlation IDs.
+- **OpenTelemetry** — metrics, distributed traces, and log export for integration with observability platforms.
+- **Health checks** — `/health` and `/alive` endpoints on every component for load balancer and orchestrator integration.
+
+See [1.0-Target-Featureset.md §15](1.0-Target-Featureset.md) for full detail.
+
+---
+
+## Aspire Integration
+
+For local development, `src/Werkr.AppHost/` provides a .NET Aspire orchestrator that wires up:
+- A PostgreSQL container with two databases (`werkrdb` and `werkridentitydb`)
+- The API service (depends on `werkrdb`)
+- The Agent (depends on `werkrdb`)
+- The Server (depends on API, Agent, and `werkridentitydb`)
+
+`src/Werkr.ServiceDefaults/` adds standard Aspire behaviors to each service: OpenTelemetry (logging, metrics, tracing), health check endpoints (`/health` and `/alive`), service discovery, and HTTP client resilience. See [Observability](#observability) for the observability stack.
+
+---
+
+## Platform & Deployment
+
+### Operating Systems
+
+| Component | Windows 11+ | Linux | macOS (Apple Silicon) |
+|-----------|:-----------:|:-----:|:--------------------:|
+| Server | x64, arm64 | x64, arm64 | arm64 |
+| API | x64, arm64 | x64, arm64 | arm64 |
+| Agent | x64, arm64 | x64, arm64 | arm64 |
+
+### Installers & Packaging
+
+| Format | Platforms | Notes |
+|--------|----------|-------|
+| **MSI** | Windows | WiX Toolset-based installers for Server, API, and Agent. |
+| **.pkg** | macOS | Platform-native installer. |
+| **.deb** | Debian / Ubuntu | Linux package distribution. |
+| **Portable archive** | All | Self-contained archive, no installer required. |
+| **Docker** | All | Container images with certificate provisioning and compose file. |
+
+Installer layouts support a module directory for optional agent modules delivered as separate packages.
+
+### Database
+
+| Provider | Use Case |
+|----------|----------|
+| **PostgreSQL** | Recommended for API and Server in production. |
+| **SQLite** | Recommended for Agent. Suitable for single-machine deployments. |
+
+Both providers pass the full test suite. Backup and restore is outside platform scope — deployment documentation covers database backup strategies.
+
+See [1.0-Target-Featureset.md §16](1.0-Target-Featureset.md) for full detail.
diff --git a/docs/CODE_OF_CONDUCT.md b/docs/CODE_OF_CONDUCT.md
index b07cdc8..957702c 100644
--- a/docs/CODE_OF_CONDUCT.md
+++ b/docs/CODE_OF_CONDUCT.md
@@ -1,128 +1,128 @@
-# Contributor Covenant Code of Conduct
-
-## Our Pledge
-
-We as members, contributors, and leaders pledge to make participation in our
-community a harassment-free experience for everyone, regardless of age, body
-size, visible or invisible disability, ethnicity, sex characteristics, gender
-identity and expression, level of experience, education, socio-economic status,
-nationality, personal appearance, race, religion, or sexual identity
-and orientation.
-
-We pledge to act and interact in ways that contribute to an open, welcoming,
-diverse, inclusive, and healthy community.
-
-## Our Standards
-
-Examples of behavior that contributes to a positive environment for our
-community include:
-
-* Demonstrating empathy and kindness toward other people
-* Being respectful of differing opinions, viewpoints, and experiences
-* Giving and gracefully accepting constructive feedback
-* Accepting responsibility and apologizing to those affected by our mistakes,
- and learning from the experience
-* Focusing on what is best not just for us as individuals, but for the
- overall community
-
-Examples of unacceptable behavior include:
-
-* The use of sexualized language or imagery, and sexual attention or
- advances of any kind
-* Trolling, insulting or derogatory comments, and personal or political attacks
-* Public or private harassment
-* Publishing others' private information, such as a physical or email
- address, without their explicit permission
-* Other conduct which could reasonably be considered inappropriate in a
- professional setting
-
-## Enforcement Responsibilities
-
-Community leaders are responsible for clarifying and enforcing our standards of
-acceptable behavior and will take appropriate and fair corrective action in
-response to any behavior that they deem inappropriate, threatening, offensive,
-or harmful.
-
-Community leaders have the right and responsibility to remove, edit, or reject
-comments, commits, code, wiki edits, issues, and other contributions that are
-not aligned to this Code of Conduct, and will communicate reasons for moderation
-decisions when appropriate.
-
-## Scope
-
-This Code of Conduct applies within all community spaces, and also applies when
-an individual is officially representing the community in public spaces.
-Examples of representing our community include using an official e-mail address,
-posting via an official social media account, or acting as an appointed
-representative at an online or offline event.
-
-## Enforcement
-
-Instances of abusive, harassing, or otherwise unacceptable behavior may be
-reported to the community leaders responsible for enforcement at
-community@darkgrey.dev.
-All complaints will be reviewed and investigated promptly and fairly.
-
-All community leaders are obligated to respect the privacy and security of the
-reporter of any incident.
-
-## Enforcement Guidelines
-
-Community leaders will follow these Community Impact Guidelines in determining
-the consequences for any action they deem in violation of this Code of Conduct:
-
-### 1. Correction
-
-**Community Impact**: Use of inappropriate language or other behavior deemed
-unprofessional or unwelcome in the community.
-
-**Consequence**: A private, written warning from community leaders, providing
-clarity around the nature of the violation and an explanation of why the
-behavior was inappropriate. A public apology may be requested.
-
-### 2. Warning
-
-**Community Impact**: A violation through a single incident or series
-of actions.
-
-**Consequence**: A warning with consequences for continued behavior. No
-interaction with the people involved, including unsolicited interaction with
-those enforcing the Code of Conduct, for a specified period of time. This
-includes avoiding interactions in community spaces as well as external channels
-like social media. Violating these terms may lead to a temporary or
-permanent ban.
-
-### 3. Temporary Ban
-
-**Community Impact**: A serious violation of community standards, including
-sustained inappropriate behavior.
-
-**Consequence**: A temporary ban from any sort of interaction or public
-communication with the community for a specified period of time. No public or
-private interaction with the people involved, including unsolicited interaction
-with those enforcing the Code of Conduct, is allowed during this period.
-Violating these terms may lead to a permanent ban.
-
-### 4. Permanent Ban
-
-**Community Impact**: Demonstrating a pattern of violation of community
-standards, including sustained inappropriate behavior, harassment of an
-individual, or aggression toward or disparagement of classes of individuals.
-
-**Consequence**: A permanent ban from any sort of public interaction within
-the community.
-
-## Attribution
-
-This Code of Conduct is adapted from the [Contributor Covenant][homepage],
-version 2.0, available at
-https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
-
-Community Impact Guidelines were inspired by [Mozilla's code of conduct
-enforcement ladder](https://github.com/mozilla/diversity).
-
-[homepage]: https://www.contributor-covenant.org
-
-For answers to common questions about this code of conduct, see the FAQ at
-https://www.contributor-covenant.org/faq. Translations are available at
-https://www.contributor-covenant.org/translations.
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+We as members, contributors, and leaders pledge to make participation in our
+community a harassment-free experience for everyone, regardless of age, body
+size, visible or invisible disability, ethnicity, sex characteristics, gender
+identity and expression, level of experience, education, socio-economic status,
+nationality, personal appearance, race, religion, or sexual identity
+and orientation.
+
+We pledge to act and interact in ways that contribute to an open, welcoming,
+diverse, inclusive, and healthy community.
+
+## Our Standards
+
+Examples of behavior that contributes to a positive environment for our
+community include:
+
+* Demonstrating empathy and kindness toward other people
+* Being respectful of differing opinions, viewpoints, and experiences
+* Giving and gracefully accepting constructive feedback
+* Accepting responsibility and apologizing to those affected by our mistakes,
+ and learning from the experience
+* Focusing on what is best not just for us as individuals, but for the
+ overall community
+
+Examples of unacceptable behavior include:
+
+* The use of sexualized language or imagery, and sexual attention or
+ advances of any kind
+* Trolling, insulting or derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or email
+ address, without their explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+ professional setting
+
+## Enforcement Responsibilities
+
+Community leaders are responsible for clarifying and enforcing our standards of
+acceptable behavior and will take appropriate and fair corrective action in
+response to any behavior that they deem inappropriate, threatening, offensive,
+or harmful.
+
+Community leaders have the right and responsibility to remove, edit, or reject
+comments, commits, code, wiki edits, issues, and other contributions that are
+not aligned to this Code of Conduct, and will communicate reasons for moderation
+decisions when appropriate.
+
+## Scope
+
+This Code of Conduct applies within all community spaces, and also applies when
+an individual is officially representing the community in public spaces.
+Examples of representing our community include using an official e-mail address,
+posting via an official social media account, or acting as an appointed
+representative at an online or offline event.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported to the community leaders responsible for enforcement at
+community@darkgrey.dev.
+All complaints will be reviewed and investigated promptly and fairly.
+
+All community leaders are obligated to respect the privacy and security of the
+reporter of any incident.
+
+## Enforcement Guidelines
+
+Community leaders will follow these Community Impact Guidelines in determining
+the consequences for any action they deem in violation of this Code of Conduct:
+
+### 1. Correction
+
+**Community Impact**: Use of inappropriate language or other behavior deemed
+unprofessional or unwelcome in the community.
+
+**Consequence**: A private, written warning from community leaders, providing
+clarity around the nature of the violation and an explanation of why the
+behavior was inappropriate. A public apology may be requested.
+
+### 2. Warning
+
+**Community Impact**: A violation through a single incident or series
+of actions.
+
+**Consequence**: A warning with consequences for continued behavior. No
+interaction with the people involved, including unsolicited interaction with
+those enforcing the Code of Conduct, for a specified period of time. This
+includes avoiding interactions in community spaces as well as external channels
+like social media. Violating these terms may lead to a temporary or
+permanent ban.
+
+### 3. Temporary Ban
+
+**Community Impact**: A serious violation of community standards, including
+sustained inappropriate behavior.
+
+**Consequence**: A temporary ban from any sort of interaction or public
+communication with the community for a specified period of time. No public or
+private interaction with the people involved, including unsolicited interaction
+with those enforcing the Code of Conduct, is allowed during this period.
+Violating these terms may lead to a permanent ban.
+
+### 4. Permanent Ban
+
+**Community Impact**: Demonstrating a pattern of violation of community
+standards, including sustained inappropriate behavior, harassment of an
+individual, or aggression toward or disparagement of classes of individuals.
+
+**Consequence**: A permanent ban from any sort of public interaction within
+the community.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage],
+version 2.0, available at
+https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
+
+Community Impact Guidelines were inspired by [Mozilla's code of conduct
+enforcement ladder](https://github.com/mozilla/diversity).
+
+[homepage]: https://www.contributor-covenant.org
+
+For answers to common questions about this code of conduct, see the FAQ at
+https://www.contributor-covenant.org/faq. Translations are available at
+https://www.contributor-covenant.org/translations.
diff --git a/docs/Development.md b/docs/Development.md
new file mode 100644
index 0000000..2cac3fa
--- /dev/null
+++ b/docs/Development.md
@@ -0,0 +1,183 @@
+# Development
+
+This guide covers how to build, run, test, and contribute to the Werkr project. For architectural context, see [Architecture.md](Architecture.md). For the definitive 1.0 featureset specification, see [1.0-Target-Featureset.md](1.0-Target-Featureset.md).
+
+---
+
+## Prerequisites
+
+| Requirement | Details |
+|-------------|---------|
+| **.NET 10 SDK** | See `global.json` for the exact version (`10.0.100` with `latestFeature` roll-forward). |
+| **Docker** | Required for running PostgreSQL locally (via Aspire) and for integration tests (Testcontainers). |
+| **PostgreSQL 17** | Provided automatically by the Aspire AppHost or Docker Compose. No manual install needed if you have Docker. |
+| **PowerShell 7+** | The Agent embeds a PowerShell host; installing PowerShell 7+ is also useful for running the project's scripts. |
+| **Node.js 22+** | Required for building and testing the graph-ui TypeScript project in `src/Werkr.Server/graph-ui/`. |
+| **Git** | Conventional commits are used for versioning via GitVersion. |
+
+---
+
+## Repository Structure
+
+```
+Werkr_Complete/
+├── .config/ # .NET local tools (GitVersion)
+├── .github/workflows/ # CI pipeline (ci.yml)
+├── docs/ # User-facing documentation, DocFX config, images
+├── scripts/ # Build and publish scripts
+├── src/
+│ ├── Werkr.Agent/ # Task execution worker
+│ ├── Werkr.Api/ # Application API
+│ ├── Werkr.AppHost/ # .NET Aspire orchestrator
+│ ├── Werkr.Common/ # Shared models, protos, auth
+│ ├── Werkr.Common.Configuration/ # Shared config classes
+│ ├── Werkr.Core/ # Business logic (scheduling, workflows, crypto)
+│ ├── Werkr.Data/ # EF Core contexts + entities
+│ ├── Werkr.Data.Identity/ # ASP.NET Identity EF Core contexts
+│ ├── Werkr.Server/ # Blazor Server UI + Identity
+│ ├── Werkr.ServiceDefaults/ # Aspire service defaults
+│ ├── Installer/Msi/ # WiX MSI projects + custom actions
+│ └── Test/
+│ ├── Werkr.Tests/ # API integration tests (Testcontainers)
+│ ├── Werkr.Tests.Agent/ # Agent end-to-end tests
+│ ├── Werkr.Tests.Data/ # Data layer unit tests
+│ └── Werkr.Tests.Server/ # Server integration tests (bunit)
+├── Directory.Build.props # Shared build properties (net10.0, nullable, etc.)
+├── Directory.Packages.props # Central package management
+├── GitVersion.yml # Versioning configuration
+├── global.json # SDK version pinning
+├── docker-compose.yml # Docker Compose for local development
+└── Werkr.slnx # Solution file
+```
+
+See [Architecture.md](Architecture.md) for project roles and the communication model.
+
+---
+
+## Building
+
+Build the entire solution:
+
+```shell
+dotnet build Werkr.slnx
+```
+
+> **Note:** The WiX installer projects (`src/Installer/Msi/`) require the WiX Toolset and only build on Windows. They are excluded from the default build on other platforms. If WiX is not installed, you can skip them with `dotnet build Werkr.slnx --no-restore /p:ExcludeWixProjects=true` or simply ignore the warning.
+
+### Central Package Management
+
+All NuGet package versions are managed centrally in `Directory.Packages.props`. Individual project files reference packages without specifying versions. To add or update a dependency, edit `Directory.Packages.props`.
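+
+As a minimal sketch of how this looks (the package name and version below are illustrative, not the repository's actual entries), central package management moves the version out of the project file:
+
+```xml
+<!-- Directory.Packages.props: versions live here (illustrative entry) -->
+<Project>
+  <PropertyGroup>
+    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
+  </PropertyGroup>
+  <ItemGroup>
+    <PackageVersion Include="Serilog" Version="4.0.0" />
+  </ItemGroup>
+</Project>
+```
+
+Individual `.csproj` files then reference the package with a bare `<PackageReference Include="Serilog" />` and no `Version` attribute.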
+
+### Build Properties
+
+`Directory.Build.props` applies to all projects:
+- Target framework: `net10.0`
+- Nullable reference types: enabled
+- Implicit usings: enabled
+- XML documentation generation: enabled
+- Warnings as errors: enabled
+- Deterministic builds with embedded debug symbols
+- Locked-mode package restore (`RestorePackagesWithLockFile`)
+
+---
+
+## Running Locally
+
+The easiest way to run all components locally is with the .NET Aspire AppHost:
+
+```shell
+dotnet run --project src/Werkr.AppHost
+```
+
+This starts PostgreSQL (in a Docker container), creates two databases (`werkrdb` and `werkridentitydb`), and launches the API, Agent, and Server with proper service discovery. The Aspire dashboard opens automatically in your browser.
+
+See `src/Werkr.AppHost/AppHost.cs` for the orchestration configuration.
+
+### Docker Compose
+
+Alternatively, you can use `docker-compose.yml` at the repository root to run the full stack in containers. See `scripts/docker-build.ps1` for the Docker build workflow.
+
+---
+
+## Testing
+
+Werkr has five test surfaces: four .NET test projects under `src/Test/` and a TypeScript test suite in `src/Werkr.Server/graph-ui/`.
+
+Run all .NET tests:
+
+```shell
+dotnet test Werkr.slnx
+```
+
+Run graph-ui tests:
+
+```shell
+npm test --prefix src/Werkr.Server/graph-ui
+```
+
+> **Prerequisites:** Docker must be running for integration tests (Testcontainers). Node.js 22+ is required for graph-ui tests.
+
+For full details — test project scopes, AppHostFixture pattern, bunit component testing, Vitest configuration, CI pipeline steps, VS Code tasks, and test infrastructure — see [Testing.md](articles/Testing.md).
+
+---
+
+## Database Migrations
+
+EF Core migrations are split by database provider.
+
+| Context | Provider | Migration directory |
+|---------|----------|-------------------|
+| `PostgresWerkrDbContext` | PostgreSQL | `src/Werkr.Data/Migrations/Postgres/` |
+| `SqliteWerkrDbContext` | SQLite | `src/Werkr.Data/Migrations/Sqlite/` |
+| `PostgresWerkrIdentityDbContext` | PostgreSQL | `src/Werkr.Data.Identity/Migrations/Postgres/` |
+| `SqliteWerkrIdentityDbContext` | SQLite | `src/Werkr.Data.Identity/Migrations/Sqlite/` |
+
+VS Code tasks are available for generating new migrations — check `.vscode/tasks.json` for the `ef:migrations:postgres`, `ef:migrations:sqlite`, and `ef:migrations:identity` tasks.
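+
+If you prefer to add a migration manually rather than through the tasks, the standard EF Core CLI invocation looks like the following. The migration name is hypothetical, and the exact arguments the VS Code tasks use may differ:
+
+```shell
+# Illustrative only; check .vscode/tasks.json for the authoritative arguments.
+dotnet ef migrations add AddExampleTable \
+  --project src/Werkr.Data \
+  --context PostgresWerkrDbContext \
+  --output-dir Migrations/Postgres
+```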
+
+---
+
+## DocFX Documentation
+
+The project website ([docs.werkr.app](https://docs.werkr.app)) is generated with DocFX from `docs/docfx/`.
+
+To build the documentation locally:
+
+```shell
+# Install DocFX (if not already installed)
+dotnet tool install -g docfx
+
+# Generate API metadata
+docfx metadata docs/docfx/docfx.json
+
+# Build the site
+docfx build docs/docfx/docfx.json
+
+# Serve locally for preview
+docfx serve docs/docfx/_site
+```
+
+See [How To: Local Doc Development](articles/HowTo/LocalDocDev.md) for a more detailed walkthrough.
+
+---
+
+## Coding Conventions
+
+- **Formatting** — Defined in `.editorconfig`. Run `dotnet format Werkr.slnx` to auto-format.
+- **Nullable reference types** — Enabled project-wide. All new code should handle nullability correctly.
+- **Warnings as errors** — All compiler warnings are treated as errors. Fix warnings before committing.
+- **XML documentation** — Required for all public types and members (`GenerateDocumentationFile` is enabled).
+- **Conventional commits** — Use [Conventional Commits](https://www.conventionalcommits.org/) for commit messages. GitVersion derives version numbers from commit history: `feat:` = minor bump, `fix:` = patch bump, breaking changes = major bump.
+- **Package lock files** — `RestorePackagesWithLockFile` is enabled. Run `dotnet restore` to update `packages.lock.json` when dependencies change. CI restores with `--locked-mode`.
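+
+For example (the commit subjects below are hypothetical), the mapping from Conventional Commit subject to version bump looks like:
+
+```shell
+git commit -m "fix: handle empty cron expression"        # patch bump
+git commit -m "feat(api): add workflow export endpoint"  # minor bump
+git commit -m "feat!: remove legacy agent protocol"      # breaking change: major bump
+```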
+
+---
+
+## Contribution Workflow
+
+1. **Fork** the repository and create a feature branch from `develop`.
+2. Make your changes following the coding conventions above.
+3. Run `dotnet format Werkr.slnx` and `dotnet test Werkr.slnx` before pushing.
+4. Submit a **pull request** targeting `develop`.
+5. All tests must pass in CI before the PR can be merged.
+6. You will need to agree to the [Contribution License Agreement](ContributionLicenseAgreement.md) before your PR is merged.
+
+For feedback, feature requests, bug reports, and documentation improvements, please open a [GitHub issue](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new/choose).
diff --git a/docs/OpenSource.md b/docs/OpenSource.md
index 53b9978..4e0d06f 100644
--- a/docs/OpenSource.md
+++ b/docs/OpenSource.md
@@ -1,118 +1,152 @@
-# Open Source Acknowledgements & Thank Yous:
+# Open Source Acknowledgements & Thank Yous
-## Dotnet:
-The Werkr project is primarily C# based and utilizes the open source [dotnet](https://dotnet.microsoft.com) 7 software framework.
+> The authoritative source for exact package versions is `Directory.Packages.props` in the repository root.
-
+## .NET Platform
+
+The Werkr project is built on the open source [.NET 10](https://dotnet.microsoft.com) platform.
+
+### .NET Runtime & Extensions
+Werkr runs on the [.NET runtime](https://github.com/dotnet/runtime) and its associated extensions (hosting, configuration, logging, dependency injection, resilience, service discovery).
+The .NET runtime is licensed under the [MIT License](https://github.com/dotnet/runtime/blob/main/LICENSE.TXT).
+
+### ASP.NET Core
+Werkr Server and API use [ASP.NET Core](https://github.com/dotnet/aspnetcore) and its associated extensions (Identity, SignalR, OpenAPI, JWT Bearer authentication).
+ASP.NET Core is licensed under an [MIT License](https://github.com/dotnet/aspnetcore/blob/main/LICENSE.txt).
+
+### .NET SDK
+Werkr is written, tested, and published using the [.NET SDK](https://github.com/dotnet/sdk).
+The .NET SDK is licensed under an [MIT License](https://github.com/dotnet/sdk/blob/main/LICENSE.TXT).
-### Dotnet Runtime & Dotnet Extensions:
-Werkr runs because of the [dotnet runtime](https://github.com/dotnet/runtime) and its associated extensions.
-The dotnet runtime is licensed under the [MIT License](https://github.com/dotnet/runtime/blob/main/LICENSE.TXT).
+### .NET Aspire
+Werkr uses [.NET Aspire](https://github.com/dotnet/aspire) for local development orchestration and service defaults (hosting, health checks, service discovery, resilience).
+.NET Aspire is licensed under an [MIT License](https://github.com/dotnet/aspire/blob/main/LICENSE.TXT).
-### Dotnet AspNetCore & AspNetCore Extensions:
-Werkr Server utilizes [dotnet aspnetcore](https://github.com/dotnet/aspnetcore) and its associated extensions.
-AspNetCore is licensed under an [MIT License](https://github.com/dotnet/aspnetcore/blob/main/LICENSE.txt)
+## Build Process & Installers
+
+### GitVersion & Conventional Commits
+Werkr uses [GitVersion](https://gitversion.net/) for semver-based versioning and [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/#specification) for commit-driven version bumps.
+GitVersion is licensed under an [MIT License](https://github.com/GitTools/GitVersion/blob/main/LICENSE). The Conventional Commits specification is licensed under [Creative Commons CC BY 3.0](https://creativecommons.org/licenses/by/3.0/).
+
+### WiX Toolset
+The Werkr Server and Agent MSI installers are built with the [WiX Toolset](https://wixtoolset.org/) (including WiX UI and Util extensions). The installer implements a WiX custom action for deploying files and retrieving install parameters.
+The WiX Toolset is licensed under the [Microsoft Reciprocal License (MS-RL)](https://github.com/wixtoolset/wix/blob/develop/LICENSE.TXT).
-### Dotnet SDK:
-Werkr is written, tested, and published using the [dotnet SDK](https://github.com/dotnet/sdk).
-The Dotnet SDK is licensed under an [MIT License](https://github.com/dotnet/sdk/blob/main/LICENSE.TXT).
+## Database
-
+### Entity Framework Core
+Werkr uses [Entity Framework Core](https://github.com/dotnet/efcore) as its database abstraction layer, including the SQLite, Npgsql (PostgreSQL), and InMemory providers. Design-time tooling is used for migrations.
+EF Core is licensed under an [MIT License](https://github.com/dotnet/efcore/blob/main/LICENSE.txt).
-## Build Process & Installers:
+### EFCore.NamingConventions
+Werkr uses [EFCore.NamingConventions](https://github.com/efcore/EFCore.NamingConventions) for snake_case column naming in PostgreSQL.
+Licensed under an [Apache License, version 2.0](https://github.com/efcore/EFCore.NamingConventions/blob/main/LICENSE).
-### GitVersion & Conventional Commits:
-All of the Werkr code repos utilize [GitVersion](https://gitversion.net/) to version releases and artifacts. Semver style versioning occurs upon commit using the gitversion and [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/#specification).
-GitVersion is licensed under an [MIT License](https://github.com/GitTools/GitVersion/blob/main/LICENSE). The conventional commit specification is licensed under an [Creative Commons - CC BY 3.0](https://creativecommons.org/licenses/by/3.0/) license.
+### SQLite
+[SQLite](https://www.sqlite.org) is in the public domain. The Agent stores its data in encrypted, SQLite-compatible databases.
-
+### PostgreSQL
+[PostgreSQL](https://www.postgresql.org) is licensed under the [PostgreSQL License](https://www.postgresql.org/about/licence/).
-### Wix ToolSet:
-The Werkr Server and Agent msi installers are generated utilizing the [Wix Toolset](https://wixtoolset.org/) and the Werkr Installer implements a wix custom action to deploy files and retrieve install parameters during installation.
-The Wix Toolset has been licensed under an [Microsoft Reciprocal License (MS-RL)](https://github.com/wixtoolset/wix/blob/develop/LICENSE.TXT)
+### Npgsql
+Werkr uses [Npgsql](https://github.com/npgsql/npgsql) as the .NET data provider for PostgreSQL, and [Npgsql.EntityFrameworkCore.PostgreSQL](https://github.com/npgsql/efcore.pg) for EF Core integration.
+Npgsql is licensed under the [PostgreSQL License](https://github.com/npgsql/npgsql/blob/main/LICENSE).
-### Dpkg-Deb:
-The Werkr Server and Agent debian installer packages are generated using the Debian package archive [dpkg-deb](https://manpages.ubuntu.com/manpages/trusty/man1/dpkg-deb.1.html) utility.
-Dpkg-Deb is part of the dpkg management system which is release under public-domain-md5, public-domain-s-s-d, BSD-2-clause, GPL-2, and GPL-2+ licenses.
+## Logging & Telemetry
+
+### Serilog
+Werkr uses [Serilog](https://github.com/serilog/serilog) for structured logging, with sinks for console output, file output, and OpenTelemetry export.
+Serilog is licensed under an [Apache License, version 2.0](https://github.com/serilog/serilog/blob/dev/LICENSE).
-
+### OpenTelemetry
+Werkr uses [OpenTelemetry .NET](https://github.com/open-telemetry/opentelemetry-dotnet) for metrics, traces, and logging instrumentation.
+OpenTelemetry .NET is licensed under an [Apache License, version 2.0](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/LICENSE).
+
-## Database:
+## Communication
-### Microsoft.EntityFrameworkCore:
-Werkr utilizes [EntityFrameWorkCore](https://github.com/dotnet/efcore) for its database abstraction layer.
-EFCore is licensed under an [MIT License](https://github.com/dotnet/efcore/blob/main/LICENSE.txt).
+### gRPC
+Werkr uses [gRPC for .NET](https://github.com/grpc/grpc-dotnet) (`Grpc.AspNetCore`, `Grpc.Net.Client`, `Grpc.Net.ClientFactory`) and [gRPC Tools](https://github.com/grpc/grpc) for protobuf code generation.
+gRPC is licensed under an [Apache License, version 2.0](https://github.com/grpc/grpc/blob/master/LICENSE).
-
+### Google Protobuf
+Werkr uses [Google.Protobuf](https://github.com/protocolbuffers/protobuf) for protocol buffer serialization.
+Protobuf is licensed under a [BSD 3-Clause License](https://github.com/protocolbuffers/protobuf/blob/main/LICENSE).
-### Sqlite:
-[Sqlite](https://www.sqlite.org) - has been released under the public domain.
+### SignalR
+Werkr Server uses [ASP.NET Core SignalR](https://github.com/dotnet/aspnetcore) for real-time UI communication.
+Licensed under an [MIT License](https://github.com/dotnet/aspnetcore/blob/main/LICENSE.txt) as part of ASP.NET Core.
-### PostgreSQL:
-[PostgreSQL](https://www.postgresql.org) - PostgreSQL is licensed under the [PostgreSQL license](https://www.postgresql.org/about/licence/).
-
-
+## Agent Packages
-## Logging & Telemetry:
+### PowerShell SDK
+The Werkr Agent hosts PowerShell using the [Microsoft.PowerShell.SDK](https://www.nuget.org/packages/Microsoft.PowerShell.SDK/).
+PowerShell is licensed under an [MIT License](https://github.com/PowerShell/PowerShell/blob/master/LICENSE.txt).
-### OpenTelemetry:
-Werkr utilizes [OpenTelemetry](https://opentelemetry.io) ([dotnet](https://github.com/open-telemetry/opentelemetry-dotnet)) for telemetry/trace logging.
-OpenTelemetry dotnet is licensed under an [Apache License, version 2.0](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/LICENSE).
+
-### Log4Net:
-Werkr utilizes [log4net](https://logging.apache.org/log4net/) for logging purposes.
-Log4net is licensed under the [Apache License, version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+## Security & Identity
-
+### ASP.NET Core Identity
+Werkr Server uses ASP.NET Core Identity for user management, authentication, and role-based authorization.
+Licensed under an [MIT License](https://github.com/dotnet/aspnetcore/blob/main/LICENSE.txt) as part of ASP.NET Core.
-## Server & Agent packages:
+### QRCoder
+Werkr uses [QRCoder](https://github.com/codebude/QRCoder) for generating TOTP 2FA QR codes.
+QRCoder is licensed under an [MIT License](https://github.com/codebude/QRCoder/blob/master/LICENSE.txt).
-### Grpc:
-The Werkr Server and Agent utilize [grpc](https://github.com/grpc/grpc), specifically the [Grpc.Core](https://www.nuget.org/packages/Grpc.Core) and [Grpc.Tools](https://www.nuget.org/packages/Grpc.Tools) packages, for communications.
-Grpc is licensed under an [Apache License, version 2.0](https://github.com/grpc/grpc/blob/master/LICENSE).
+### System.Security.Cryptography.ProtectedData
+Werkr uses the [ProtectedData](https://www.nuget.org/packages/System.Security.Cryptography.ProtectedData) package for Windows DPAPI secret storage.
+Licensed under an [MIT License](https://github.com/dotnet/runtime/blob/main/LICENSE.TXT) as part of the .NET runtime.
-### Grpc Dotnet:
-The Werkr Server and Agent utilize [grpc-dotnet](https://github.com/grpc/grpc-dotnet), in the form of the [Grpc.Net.Client](https://www.nuget.org/packages/Grpc.Net.Client), [Grpc.AspNetCore](https://www.nuget.org/packages/Grpc.AspNetCore), & [Grpc.Core.Api](https://www.nuget.org/packages/Grpc.Core.Api/) packages.
-Grpc-Dotnet is licensed under an [Apache License, version 2.0](https://github.com/grpc/grpc-dotnet/blob/master/LICENSE)
+## Utilities
+
+### TimeZoneNames
+Werkr uses [TimeZoneNames](https://github.com/mattjohnsonpint/TimeZoneNames) for human-readable time zone display names in scheduling.
+TimeZoneNames is licensed under an [MIT License](https://github.com/mattjohnsonpint/TimeZoneNames/blob/main/LICENSE).
-### PowerShell Sdk:
-The Werkr Agent hosts PowerShell utilizing the [Microsoft.PowerShell.SDK](https://www.nuget.org/packages/Microsoft.PowerShell.SDK/).
-PowerShell is licensed under an [MIT License](https://github.com/PowerShell/PowerShell/blob/master/LICENSE.txt)
+## Testing
-
+### MSTest
+Werkr uses [MSTest](https://github.com/microsoft/testfx) as its test framework with the Microsoft.Testing.Platform runner.
+MSTest is licensed under an [MIT License](https://github.com/microsoft/testfx/blob/main/LICENSE).
-## Documentation & Hosting
+### Testcontainers
+Werkr uses [Testcontainers for .NET](https://github.com/testcontainers/testcontainers-dotnet) (specifically the PostgreSQL module) for integration testing with disposable database containers.
+Testcontainers is licensed under an [MIT License](https://github.com/testcontainers/testcontainers-dotnet/blob/develop/LICENSE).
-### DocFX:
-[docs.werkr.app](https://docs.werkr.app) utilizes [DocFX](https://dotnet.github.io/docfx) to generate public documentation.
-DocFX is licensed under the [MIT license](https://github.com/dotnet/docfx/blob/main/LICENSE)
+### ASP.NET Core Mvc.Testing
+Werkr uses `Microsoft.AspNetCore.Mvc.Testing` for in-process API testing via `WebApplicationFactory`.
+Licensed under an [MIT License](https://github.com/dotnet/aspnetcore/blob/main/LICENSE.txt) as part of ASP.NET Core.
-### DocFX Theme - DarkFX:
-[docs.werkr.app](https://docs.werkr.app) utilizes a modified version of the [darkfx](https://github.com/steffen-wilke/darkfx) DocFX theme/template.
-DarkFX is licensed under the [MIT license](https://github.com/steffen-wilke/darkfx/blob/master/LICENSE)
+## Documentation & Hosting
-
+### DocFX
+[docs.werkr.app](https://docs.werkr.app) is generated with [DocFX](https://dotnet.github.io/docfx).
+DocFX is licensed under the [MIT License](https://github.com/dotnet/docfx/blob/main/LICENSE).
-### Cascadia Code Font:
-[docs.werkr.app](https://docs.werkr.app) utilizes the Cascadia Code font.
-Cascadia Code is licensed under the [SIL OPEN FONT LICENSE](https://github.com/microsoft/cascadia-code/blob/main/LICENSE).
+### DarkFX Theme
+[docs.werkr.app](https://docs.werkr.app) uses a modified version of the [DarkFX](https://github.com/steffen-wilke/darkfx) DocFX theme.
+DarkFX is licensed under the [MIT License](https://github.com/steffen-wilke/darkfx/blob/master/LICENSE).
-
+### Cascadia Code Font
+[docs.werkr.app](https://docs.werkr.app) uses the [Cascadia Code](https://github.com/microsoft/cascadia-code) font.
+Cascadia Code is licensed under the [SIL Open Font License](https://github.com/microsoft/cascadia-code/blob/main/LICENSE).
-### GitHub Pages:
-[docs.werkr.app](https://docs.werkr.app) is hosted for free by [github pages](https://pages.github.com/). Thank you [github](https://github.com/)!.
\ No newline at end of file
+### GitHub Pages
+[docs.werkr.app](https://docs.werkr.app) is hosted by [GitHub Pages](https://pages.github.com/). Thank you, GitHub!
diff --git a/docs/SECURITY.md b/docs/SECURITY.md
index 8b4719d..92b4ba0 100644
--- a/docs/SECURITY.md
+++ b/docs/SECURITY.md
@@ -1,17 +1,19 @@
-# Security Policy
-
-## Supported Versions
-| Version | Supported |
-| ------- | ------------------ |
-| 1.x.x | :white_check_mark: |
-| < 1.0 | :x: |
-
-## Reporting a Vulnerability
-
-Please report vulnerabilities to [security@darkgrey.dev](mailto:security@darkgrey.dev).
-Alternatively you may also open up
-[an issue](https://github.com/DarkgreyDevelopment/Werkr.App/issues) on github.
-
-You should receive a response, within a week, through the same channel that you reported the vulnerability.
-
-If you'd like to propose a solution to the vulnerability you are also welcome to [contribute](https://docs.werkr.app/index.html#contributing)!
+# Security Policy
+
+## Supported Versions
+
+| Version | Supported |
+| -------------- | ------------------ |
+| Latest release | :white_check_mark: |
+| Pre-release | :warning: |
+| Previous | :x: |
+
+## Reporting a Vulnerability
+
+Please report vulnerabilities to [security@darkgrey.dev](mailto:security@darkgrey.dev).
+Alternatively, you may also open
+[an issue](https://github.com/DarkgreyDevelopment/Werkr.App/issues) on GitHub.
+
+You should receive a response within a week, through the same channel you used to report the vulnerability.
+
+If you'd like to propose a solution to the vulnerability you are also welcome to [contribute](https://docs.werkr.app/index.html#contributing)!
diff --git a/docs/api/.gitignore b/docs/api/.gitignore
deleted file mode 100644
index f798527..0000000
--- a/docs/api/.gitignore
+++ /dev/null
@@ -1,5 +0,0 @@
-###############
-# temp file #
-###############
-*.yml
-.manifest
diff --git a/docs/articles/FeatureList.md b/docs/articles/FeatureList.md
deleted file mode 100644
index 86a0f2d..0000000
--- a/docs/articles/FeatureList.md
+++ /dev/null
@@ -1,139 +0,0 @@
-Werkr Project 1.0 Intended Feature List. This document is aspirational at this time.
-
-## 1. Streamlined Task Management:
-- Predefine tasks or create one-off/ad-hoc tasks that run immediately or on a schedule.
- - Allow users to set start dates, maximum running time length, and end times, for non-workflow defined tasks
- - Enable users to create multiple tasks and link them together into a workflow with simple DAG visualizations for more comprehensive task scheduling.
-- Create ad-hoc tasks that run immediately or at a prescheduled time or on a time interval.
- - Provides a user interface for creating and executing ad-hoc tasks
-- Create a "workflow" user interface that allows the user to create and link many different tasks together.
-- Limited webhook integration allows for task/workflow completion notification via popular project management or productivity tools (Slack, Discord, etc).
-
-
-
-## 2. The product has two primary components/applications: a Server and an Agent
-- Supported on Windows 10+ and Debian Linux (with systemd) based platforms
- - There are MSI installers for the windows releases of each app
- - There are .deb installers for the debian linux release of each app.
- - There are portable editions of the application available as well.
- - There is no difference between the portable edition of the app and the installed version.
-- Both server and agent applications support x64 and arm64 CPU architectures
-- MacOS support is planned after the .NET 8 release in November 2023
- - Dotnet 8 will provide ALPN support to macos based platforms!
-
-
-
-## 3. Workflow-Centric Design:
-- Directed Acyclic Graph (DAG) visualizations
- - Implements an intuitive UI to display DAGs and current workflow states
- - There is a "workflow" view that only shows a single workflow
- - There is a "system" view that shows all workflows and isolated tasks.
- - Enable users to modify workflows via by modifying the DAG visualization.
- - When in workflow editing mode you can click a button to draw links between tasks and other workflows.
- - Once a link has been created you will receive be prompted on how to handle inputs, outputs, and exceptions.
- - Sensible defaults and intuitive options make workflow configuration simple.
-- The Workflow model allows for expanded and advanced capabilities
- - The software provides built-in branching logic, iteration, and exception handling based on task/workflow outputs and state.
-
-
-
-## 4. Schedulable Tasks:
-- Run tasks inside or outside of a workflow
- - Implements a UI for creating and scheduling tasks.
- - Tasks outside of a workflow can only be triggered "immediately", based on a schedule, or on a time interval.
- - Meaning that tasks outside a workflow cannot be triggered by filewatch, outside task completion, or workflow completion.
-
-
-
-## 5. Flexible Task Triggers:
-- FileWatch (Poll FileWatch, Filesystem Event FileWatch)
- - Implement a FileWatch component for triggering tasks based on file events
- - Available as a workflow trigger, as well as a trigger for tasks within a workflow.
-- DateTime
- - A scheduler component triggers tasks based on specified DateTime
- - Available for all tasks types and workflows.
-- Interval/Cyclical (periodicity)
- - A scheduler component triggers tasks based on starting intervals
- - Available for all tasks types and workflows.
-- Task Completion States
- - Tasks may be triggered based on the completion state of other tasks within the same worfklow.
- - Available for use by tasks within a workflow.
-- Workflow Completion State
- - Tasks and workflows may be triggered based on the operating state of outside workflows.
- - Available for use by tasks within a workflow, as well as initial workflow triggering.
-
-
-
-## 6. Versatile Task Types:
-- System-defined tasks
- - Contains a library of system-defined tasks with required and optional input parameters
- - File/Directory Creation
- - Move/Copy files and/or directories
- - Delete file and/or directories
- - Test file/directory exists
- - Write Content to file
-- User-defined tasks
- - Create a UI for building user-defined tasks.
- - User defined tasks are a linear sequence of system-defined tasks, PowerShell scripts, and native command executions
-- PowerShell Script Execution Tasks
-- PowerShell Command Execution Tasks
-- System Shell Command Execution Tasks
-
-
-
-## 7. Task Outputs:
-- Standard PowerShell Outputs
- - PowerShell Scripts and Command Execution Tasks share the same possible outputs:
- - LastSuccess
- - LastExitCode
- - LastExitCode will be set prior to script execution.
- - Initial LastExitCode can be set to any positive number.
- - Terminating Exception information will be returned (if applicable).
- - Output stream content will be returned as an array of objects.
- - Error stream content will be returned as an array of ErrorRecord objects.
- - Debug, Verbose, and Warning stream content (if any) will be returned as arrays of strings.
-- System Shell Command Execution Tasks
- - Returns the process exit code after command execution.
-- System-defined tasks
- - Returns the task success status as a [boolean?]. Return output will also contain [Exception?] information (if applicable).
-
-
-
-## 8. Server and Agent Configuration:
-- Both the server is a C# Kestrel webservers that can be configured to access an external (Postgres?) or built-in SQLite database
-- The Agent is a C# worker process that hosts PowerShell and can create operating-system shells (predfined by OS).
-- The server and agent use grpc for inter-process communications.
-
-
-
-## 9. Security:
-- Access Control
- - Implements role-based access control and standard authentication mechanisms.
- - Manage users roles and permissions to restrict or allow access to certain features or data.
-- Native 2FA support (TOTP)
-- Implements simple TLS certificate configuration for both the webserver and agent components.
-
-
-
-## 10. Licensing and Support:
-- The applications are offered free of charge under an MIT license.
-- Best effort support and triage is provided via a GitHub issue process
-- Documentation, tutorials, and other resources are available to help users get started and troubleshoot issues
-
-
-
-## 11. Community Contributions:
-- Open collaboration
- - Community membership and collaboration is encouraged! Please feel free to help with documentation, bug fixes, and new features
- - Please help maintain a welcoming and inclusive environment for all contributors.
-- The project has been broken out into multiple separate repositories allowing for focused contributions.
-
-
-
-## 12. Extensibility and Built-in Utility:
-- The werkr project is designed to be extensible but with enough built-in utility to minimize the need for most extensions.
- - Contains a comprehensive library of built-in tasks to cover a wide range of use cases
- - User-defined tasks allow for consistent implementation of commonly performed operations.
- - You have the full utility of PowerShell and the built in system shell at your disposal.
-- This product is aimed at users with moderate computer knowledge.
- - You shouldn't have to be a computer expert to create expert level workflows.
diff --git a/docs/articles/HowTo/LinuxAgentInstall.md b/docs/articles/HowTo/LinuxAgentInstall.md
index a8f356f..70ac249 100644
--- a/docs/articles/HowTo/LinuxAgentInstall.md
+++ b/docs/articles/HowTo/LinuxAgentInstall.md
@@ -1,3 +1,13 @@
-More information will be available soon!
+# Linux Agent Installation
-Step 1: Download the application from the [github releases](https://github.com/DarkgreyDevelopment/Werkr.Agent/releases/tag/latest) page.
\ No newline at end of file
+On Linux, deploy the Werkr Agent using the portable archive. A `.deb` installer is planned for a future release.
+
+To run the Werkr Agent on Linux:
+
+1. Download the latest portable release from the [GitHub releases](https://github.com/DarkgreyDevelopment/Werkr.App/releases/latest) page (select the Linux x64 or arm64 archive).
+2. Extract the archive to your preferred installation directory.
+3. Configure `appsettings.json` with your TLS certificate, working directory, and allowed hosts settings.
+4. Register the application as a systemd service, or run it directly.
+5. Complete the agent registration process by importing the admin bundle from your Server. See [Architecture.md](../../Architecture.md#registration-flow) for details.
+
+For building from source and detailed configuration, see [Development.md](../../Development.md).
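+
+The systemd registration in step 4 can be sketched with a unit file like the following. The install path (`/opt/werkr-agent`), service user, and binary name are assumptions; adjust them to match your installation:
+
+```ini
+# /etc/systemd/system/werkr-agent.service (hypothetical paths and names)
+[Unit]
+Description=Werkr Agent
+After=network-online.target
+Wants=network-online.target
+
+[Service]
+Type=simple
+User=werkr
+WorkingDirectory=/opt/werkr-agent
+ExecStart=/opt/werkr-agent/Werkr.Agent
+Restart=on-failure
+
+[Install]
+WantedBy=multi-user.target
+```
+
+After creating the file, run `sudo systemctl daemon-reload` and `sudo systemctl enable --now werkr-agent` to start the service and enable it at boot.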
diff --git a/docs/articles/HowTo/LinuxServerInstall.md b/docs/articles/HowTo/LinuxServerInstall.md
index 4076f12..5768a56 100644
--- a/docs/articles/HowTo/LinuxServerInstall.md
+++ b/docs/articles/HowTo/LinuxServerInstall.md
@@ -1,3 +1,12 @@
-More information will be available soon!
+# Linux Server Installation
-Step 1: Download the application from the [github releases](https://github.com/DarkgreyDevelopment/Werkr.Server/releases/tag/latest) page.
\ No newline at end of file
+On Linux, deploy the Werkr Server using the portable archive. A `.deb` installer is planned for a future release.
+
+To run the Werkr Server on Linux:
+
+1. Download the latest portable release from the [GitHub releases](https://github.com/DarkgreyDevelopment/Werkr.App/releases/latest) page (select the Linux x64 or arm64 archive).
+2. Extract the archive to your preferred installation directory.
+3. Configure `appsettings.json` with your TLS certificate, database connection, and allowed hosts settings.
+4. Register the application as a systemd service, or run it directly.
+
+For building from source and detailed configuration, see [Development.md](../../Development.md).
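+
+The `appsettings.json` configuration in step 3 follows standard ASP.NET Core conventions; a minimal sketch might look like the following. The endpoint URL, certificate path, and connection string name are assumptions, not the Server's exact schema:
+
+```json
+{
+  "AllowedHosts": "*",
+  "Kestrel": {
+    "Endpoints": {
+      "Https": {
+        "Url": "https://0.0.0.0:5000",
+        "Certificate": {
+          "Path": "certs/server.pfx",
+          "Password": "<certificate password>"
+        }
+      }
+    }
+  },
+  "ConnectionStrings": {
+    "Default": "Data Source=werkr.db"
+  }
+}
+```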
diff --git a/docs/articles/HowTo/LocalDocDev.md b/docs/articles/HowTo/LocalDocDev.md
index eb9f70f..5bb65ae 100644
--- a/docs/articles/HowTo/LocalDocDev.md
+++ b/docs/articles/HowTo/LocalDocDev.md
@@ -1,57 +1,58 @@
# Local Documentation Development
-[docs.werkr.app](https://docs.werkr.app) is hosted on github pages and is generated using [docfx](https://dotnet.github.io/docfx/) from markdown pages housed in the [github repository](https://Werkr.App/tree/main/docs).
-Docfx also generates the API documentation based on the XML documentation in the [code](https://main.cloud-sharesync.com/src) itself.
-You can test what documentation changes will look like locally prior to pushing any commits to github.
-To do so you must emulate the [github action](https://Werkr.App/blob/main/.github/workflows/DocFX_gh-pages.yml) sequence.
+[docs.werkr.app](https://docs.werkr.app) is hosted on GitHub Pages and generated using [DocFX](https://dotnet.github.io/docfx/) from the markdown pages and XML documentation in this repository.
-This process can be done on windows using powershell 7+ by issuing the following commands:
-```powershell
-# 1. Download a copy of DocFX and extract it.
-$IWRParams = @{
- Uri = 'https://github.com/dotnet/docfx/releases/download/v2.59.4/docfx.zip'
- OutFile = './docfx.zip'
- Method = 'Get'
-}
-Invoke-WebRequest @IWRParams
-Expand-Archive -Path './docfx.zip' -DestinationPath './docfx'
+You can preview documentation changes locally before pushing commits.
-# 2. Clone the Werkr.App Repo locally into a Werkr.App directory.
-git clone https://git.werkr.app Werkr.App
+---
-# 3. change directory to the cloned repository
-Set-Location './Werkr.App'
+## Prerequisites
-# 4. Clone the common repo into the Werkr.App/src/Werkr.Common directory.
-git clone https://git.common.werkr.app src/Werkr.Common
+- [.NET 10 SDK](https://dotnet.microsoft.com/download) (already required for the project — see `global.json`)
+- [DocFX](https://dotnet.github.io/docfx/) — install as a global tool:
-# 5. Clone the common configuration repo into the Werkr.App/src/Werkr.Common.Configuration directory.
-git clone https://git.commonconfiguration.werkr.app src/Werkr.Common.Configuration
+```powershell
+dotnet tool install -g docfx
+```
-# 6. Clone the installers repo into the src/Werkr.Installers directory.
-git clone https://git.installers.werkr.app src/Werkr.Installers
+---
-# 7. Clone the Server repo Werkr.App/src/Werkr.Server directory.
-git clone https://server.werkr.app src/Werkr.Server
+## Building and Previewing Docs
-# 8. Clone the Agent repo into the Werkr.App/src/Werkr.Agent directory.
-git clone https://git.agent.werkr.app src/Werkr.Agent
+From the repository root, run the following commands:
-# 9. Manual File Copying.
-$CopyParams = @{
- Verbose = $true
- Force = $true
-}
-Copy-Item -Path './LICENSE' -Destination './docs/LICENSE.md' @CopyParams
-Copy-Item -Path './README.md' -Destination './docs/index.md' @CopyParams
-copy-Item -Path './docs/docfx/*' -Destination 'docs/' -Verbose -Exclude README.md -Recurse
+```powershell
+# 1. Copy required files into the docs directory (emulates the GitHub Actions workflow).
+Copy-Item -Path './LICENSE' -Destination './docs/LICENSE.md' -Force
+Copy-Item -Path './README.md' -Destination './docs/index.md' -Force
+Copy-Item -Path './docs/docfx/*' -Destination './docs/' -Exclude 'README.md' -Recurse -Force
-# 10 Generate API metadata.
-& '../docfx/docfx.exe' 'metadata' './docs/docfx.json'
+# 2. Generate API metadata from the source projects.
+docfx metadata docs/docfx.json
-# 11. Create the docfx site.
-& '../docfx/docfx.exe' './docs/docfx.json'
+# 3. Build the DocFX site.
+docfx build docs/docfx.json
-# 12. Serve the website.
-& '../docfx/docfx.exe' 'docs\docfx.json' -t 'templates/Werkr' --serve
+# 4. Serve the site locally for preview.
+docfx serve docs/_site
```
+
+The site will be available at `http://localhost:8080` by default.
+
+---
+
+## How It Works
+
+- **`docs/docfx/docfx.json`** defines which projects generate API metadata and which markdown files are included in the site build. The `metadata` section points to project files under `src/` and the `build` section pulls content from `docs/articles/`, `docs/api/`, and root markdown files.
+- **`docs/docfx/filterConfig.yml`** controls which types and members are included or excluded from the API documentation.
+- **`docs/docfx/templates/Werkr/`** contains the custom DocFX theme (based on DarkFX).
+- **`docs/articles/`** contains the user-facing documentation articles.
+- **`docs/images/`** contains screenshots and logos referenced by articles.
+
+---
+
+## Notes
+
+- The DocFX `src` path in `docfx.json` is relative to the `docfx.json` file location (`docs/docfx/`). The path `../../src` resolves to the repository's `src/` directory.
+- If you add a new project to the solution that should appear in API documentation, add its `.csproj` path to the `metadata[0].src.files` array in `docfx.json`.
+- The custom template in `templates/Werkr` overrides default DocFX styles. See the [DarkFX](https://github.com/steffen-wilke/darkfx) repository for the base theme.
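+
+For example, registering an additional project in the `metadata` section might look like this (the project names below are illustrative, not the repository's actual list):
+
+```json
+{
+  "metadata": [
+    {
+      "src": [
+        {
+          "src": "../../src",
+          "files": [
+            "Werkr.Server/Werkr.Server.csproj",
+            "Werkr.Agent/Werkr.Agent.csproj",
+            "Werkr.NewProject/Werkr.NewProject.csproj"
+          ]
+        }
+      ],
+      "dest": "api"
+    }
+  ]
+}
+```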
diff --git a/docs/articles/HowTo/WindowsAgentInstall.md b/docs/articles/HowTo/WindowsAgentInstall.md
index c3909f1..05c3aeb 100644
--- a/docs/articles/HowTo/WindowsAgentInstall.md
+++ b/docs/articles/HowTo/WindowsAgentInstall.md
@@ -1,30 +1,33 @@
-This document is intended to show you the Werkr Agent Windows MSI installer process and highlight key details.
+This document walks you through the Werkr Agent Windows MSI installer process and highlights key details.
-To get started download the appropriate MSI file from the [github releases](https://github.com/DarkgreyDevelopment/Werkr.Agent/releases/tag/latest) page. Note that if you're not sure which msi file to download then you probably want the x64 version.
+To get started, download the appropriate MSI file from the [GitHub releases](https://github.com/DarkgreyDevelopment/Werkr.App/releases/latest) page. If you're not sure which MSI file to download, you probably want the x64 version.
# Msi Installation
-1. Double Click the Msi installer and select Next.
+1. Double-click the MSI installer and select Next.

-2. Specfy agent settings.
-\" with \"True\" or \"False\" options. The last option is the \"Allowed Hosts\" textbox, this has a pre-populated value of \"*\". Available progress buttons are \"Back\", \"Next\", and \"Cancel\".")
+2. Specify agent settings.
+
+
+* The `AGENTNAME` setting assigns a human-readable name to this Agent instance. This name is sent to the Server during registration and appears in the management UI.
+* The `AGENTGRPCPORT` setting specifies the gRPC port the Agent listens on for incoming Server connections (default: `5001`).
* The `Agent Working Directory` setting determines the default path for scripts and commands to start from.
-* The `Enable Agent to use PowerShell` setting determines whether the agent will enable the built in PowerShell host & its associated grpc communications channel.
-* The `Enable Agent to use System Shell (cmd)` setting determines whether the agent will enable the built in command process host & its associated grpc communications channel.
-* Note that if you turn both PowerShell and the System Shell services off then the agent will only be able to perform System Defined actions.
-* The `Allowed Hosts` settings determines which hosts are allowed to communicate with the agent.
- * Leave this as `*` to enable all outside servers to communicate with this agent.
+* The `Enable Agent to use PowerShell` setting determines whether the agent enables the built-in PowerShell host and its associated gRPC communications channel.
+* The `Enable Agent to use System Shell (cmd)` setting determines whether the agent enables the built-in command process host and its associated gRPC communications channel.
+* Note that if you turn both PowerShell and the System Shell services off, the agent will only be able to perform built-in Action tasks.
+* The `Allowed Hosts` setting determines which hosts are allowed to communicate with the agent.
+ * Leave this as `*` to allow all servers to connect.
* This list is semi-colon delimited. Ex: `example.com;localhost;192.168.1.16`
-3. Select your "Kestrel Certificate Config" type from the dropdown.
+3. Select your "Kestrel Certificate Config" type from the dropdown.
\". Available progress buttons are \"Back\", \"Next\", and \"Cancel\". The \"Next\" button is disabled and cannot be clicked.")
* Regardless of which certificate type you choose, you will need to enter the `Certificate Url` that you want the agent to listen on.
* Once populated, You may need to click out of the Certificate Url field for the "Next" button to become enabled.
@@ -35,9 +38,9 @@ To get started download the appropriate MSI file from the [github releases](http
CertStore (expand)
- 
- If you know your certificates store information then you can feel free to paste it into the fields.
- Otherwise select the browse button on the bottom left and you can select the appropriate certificate from the ones availabe in the store.
+ 
+ If you know your certificate's store information, feel free to paste it into the fields.
+ Otherwise, select the browse button on the bottom left and choose the appropriate certificate from those available in the store.

@@ -50,8 +53,8 @@ To get started download the appropriate MSI file from the [github releases](http
CertFile (expand)
- 
-  file.")
+ 
+  file.")
@@ -62,9 +65,9 @@ To get started download the appropriate MSI file from the [github releases](http
CertAndKeyFile (expand)
- 
+ 
-  file. This image is also used as an example for the \"Select a certificate key file\" menu that appears when you select the KeyFile Path \"Browse\" button. The only difference between those two menus is the title bar and the pre-populate search extension (.key instead of .pfx).")
+  file. This image is also used as an example for the \"Select a certificate key file\" menu that appears when you select the KeyFile Path \"Browse\" button. The only difference between those two menus is the title bar and the pre-populate search extension (.key instead of .pfx).")
@@ -72,19 +75,19 @@ To get started download the appropriate MSI file from the [github releases](http
-4. Specify logging levels. It is suggested that you leave these at their default values unless you have a specific reason to change them.
-
-The Werkr project utilizes the Microsoft.Extensions.Logging package and uses "[LogLevel](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.logging.loglevel)" to determine what to output to the log. See the microsoft article [Logging in C# and .Net](https://learn.microsoft.com/en-us/dotnet/core/extensions/logging) for more details.
+4. Specify logging levels. Leave these at their default values unless you have a specific reason to change them.
+
+The Werkr project uses the Microsoft.Extensions.Logging package and relies on "[LogLevel](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.logging.loglevel)" to determine what to output to the log. See the Microsoft article [Logging in C# and .NET](https://learn.microsoft.com/en-us/dotnet/core/extensions/logging) for more details.
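+
+The selected levels end up in the application's `Logging` configuration, following the standard .NET shape (the category names below are illustrative):
+
+```json
+{
+  "Logging": {
+    "LogLevel": {
+      "Default": "Information",
+      "Microsoft.AspNetCore": "Warning"
+    }
+  }
+}
+```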
-5. Select Install Path - You can choose any location you want the application to be installed at.
+5. Select Install Path - You can choose any location where you want the application installed.

-6. Select Install
-
+6. Select Install
+
The installer will now:
* Extract the portable application files
@@ -94,7 +97,7 @@ The installer will now:
-7. Installation Complete, Select Finish!
+7. Installation Complete, Select Finish!

@@ -110,8 +113,8 @@ The application has also been registered as a windows service.
Service Info (expand)
- 
- Interact with the service (start/stop/disable) via the Windows Services mmc snapin.
+ 
+ Interact with the service (start/stop/disable) via the Windows Services MMC snap-in.
@@ -139,7 +142,7 @@ The application can be removed by selecting the `Uninstall` button from either t
Installed Apps (expand)
- 
+ 
The `uninstall` button in this menu is hidden until you select the elipses menu on the right side of the screen.
@@ -148,10 +151,10 @@ The application can be removed by selecting the `Uninstall` button from either t
-Please note that after uninstalling the application you may still have a `Werkr Agent` directory in the install location.
-
-This directory should only contain leftover log files that were generated by the application during its operation.
-You can feel free to delete this directory and its contents after the uninstall wizard has completed successfully.
+Please note that after uninstalling the application you may still have a `Werkr Agent` directory in the install location.
+
+This directory should only contain leftover log files generated by the application during its operation.
+Feel free to delete this directory and its contents after the uninstall wizard completes successfully.
diff --git a/docs/articles/HowTo/WindowsServerInstall.md b/docs/articles/HowTo/WindowsServerInstall.md
index 474693b..353ce35 100644
--- a/docs/articles/HowTo/WindowsServerInstall.md
+++ b/docs/articles/HowTo/WindowsServerInstall.md
@@ -1,26 +1,29 @@
-This document is intended to show you the Werkr Server Windows MSI installer process and highlight key details.
+This document walks you through the Werkr Server Windows MSI installer process and highlights key details.
-To get started download the appropriate MSI file from the [github releases](https://github.com/DarkgreyDevelopment/Werkr.Server/releases/tag/latest) page. Note that if you're not sure which msi file to download then you probably want the x64 version.
+To get started, download the appropriate MSI file from the [GitHub releases](https://github.com/DarkgreyDevelopment/Werkr.App/releases/latest) page. If you're not sure which MSI file to download, you probably want the x64 version.
# Msi Installation
-1. Double Click the Msi installer and select Next.
+1. Double-click the MSI installer and select Next.

-2. Specfy Server settings.
-
+2. Specify Server settings.
+
+
-* The `Allowed Hosts` settings determines which hosts are allowed to communicate with the Server.
- * Leave this as `*` to enable all outside clients and agents to communicate with this Server.
+* The `SERVERNAME` setting assigns a human-readable name to this Server instance. This name appears in the management UI and in agent registration.
+* The `ALLOWREGISTRATION` setting controls whether the Server will accept new agent registration bundles. Set to `True` during initial setup, then consider restricting it once your agents are registered.
+* The `Allowed Hosts` setting determines which hosts are allowed to communicate with the Server.
+ * Leave this as `*` to allow all clients and agents to connect.
* This list is semi-colon delimited. Ex: `example.com;localhost;192.168.1.16`
-3. Select your "Kestrel Certificate Config" type from the dropdown.
+3. Select your "Kestrel Certificate Config" type from the dropdown.
\". Available progress buttons are \"Back\", \"Next\", and \"Cancel\". The \"Next\" button is disabled and cannot be clicked.")
* Regardless of which certificate type you choose, you will need to enter the `Certificate Url` that you want the Server to listen on.
* Once populated, You may need to click out of the Certificate Url field for the "Next" button to become enabled.
@@ -31,9 +34,9 @@ To get started download the appropriate MSI file from the [github releases](http
CertStore (expand)
- 
- If you know your certificates store information then you can feel free to paste it into the fields.
- Otherwise select the browse button on the bottom left and you can select the appropriate certificate from the ones availabe in the store.
+ 
+ If you know your certificate's store information, feel free to paste it into the fields.
+ Otherwise, select the browse button on the bottom left and choose the appropriate certificate from those available in the store.

@@ -46,8 +49,8 @@ To get started download the appropriate MSI file from the [github releases](http
CertFile (expand)
- 
-  file.")
+ 
+  file.")
@@ -58,9 +61,9 @@ To get started download the appropriate MSI file from the [github releases](http
CertAndKeyFile (expand)
- 
+ 
-  file. This image is also used as an example for the \"Select a certificate key file\" menu that appears when you select the KeyFile Path \"Browse\" button. The only difference between those two menus is the title bar and the pre-populate search extension (.key instead of .pfx).")
+  file. This image is also used as an example for the \"Select a certificate key file\" menu that appears when you select the KeyFile Path \"Browse\" button. The only difference between those two menus is the title bar and the pre-populate search extension (.key instead of .pfx).")
@@ -68,19 +71,19 @@ To get started download the appropriate MSI file from the [github releases](http
-4. Specify logging levels. It is suggested that you leave these at their default values unless you have a specific reason to change them.
-
-The Werkr project utilizes the Microsoft.Extensions.Logging package and uses "[LogLevel](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.logging.loglevel)" to determine what to output to the log. See the microsoft article [Logging in C# and .Net](https://learn.microsoft.com/en-us/dotnet/core/extensions/logging) for more details.
+4. Specify logging levels. Leave these at their default values unless you have a specific reason to change them.
+
+The Werkr project uses the Microsoft.Extensions.Logging package and relies on "[LogLevel](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.logging.loglevel)" to determine what to output to the log. See the Microsoft article [Logging in C# and .NET](https://learn.microsoft.com/en-us/dotnet/core/extensions/logging) for more details.
-5. Select Install Path - You can choose any location you want the application to be installed at.
+5. Select Install Path - You can choose any location where you want the application installed.

-6. Select Install
-
+6. Select Install
+
The installer will now:
* Extract the portable application files
@@ -90,7 +93,7 @@ The installer will now:
-7. Installation Complete, Select Finish!
+7. Installation Complete, Select Finish!

@@ -106,8 +109,8 @@ The application has also been registered as a windows service.
Service Info (expand)
- 
- Interact with the service (start/stop/disable) via the Windows Services mmc snapin.
+ 
+ Interact with the service (start/stop/disable) via the Windows Services MMC snap-in.
@@ -135,7 +138,7 @@ The application can be removed by selecting the `Uninstall` button from either t
Installed Apps (expand)
- 
+ 
The `uninstall` button in this menu is hidden until you select the elipses menu on the right side of the screen.
@@ -144,10 +147,10 @@ The application can be removed by selecting the `Uninstall` button from either t
-Please note that after uninstalling the application you may still have a `Werkr Server` directory in the install location.
-
-This directory should only contain leftover log files that were generated by the application during its operation.
-You can feel free to delete this directory and its contents after the uninstall wizard has completed successfully.
+Please note that after uninstalling the application you may still have a `Werkr Server` directory in the install location.
+
+This directory should only contain leftover log files generated by the application during its operation.
+Feel free to delete this directory and its contents after the uninstall wizard completes successfully.
diff --git a/docs/articles/HowTo/index.md b/docs/articles/HowTo/index.md
index 06d395a..18bc028 100644
--- a/docs/articles/HowTo/index.md
+++ b/docs/articles/HowTo/index.md
@@ -1,2 +1,2 @@
-Welcome and thank you for your interest in the Werkr Project.
-Please review the How-To articles before opening any [issues](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new/choose).
\ No newline at end of file
+Welcome and thank you for your interest in the Werkr Project.
+Please review the How-To articles before opening any [issues](https://github.com/DarkgreyDevelopment/Werkr.App/issues/new/choose).
diff --git a/docs/articles/Pre-Edit-High-Level-Design-Flow.md b/docs/articles/Pre-Edit-High-Level-Design-Flow.md
deleted file mode 100644
index ac5ddfc..0000000
--- a/docs/articles/Pre-Edit-High-Level-Design-Flow.md
+++ /dev/null
@@ -1,93 +0,0 @@
-# High-Level Development Phases:
-1. Foundations
-2. Core Features
-3. Advanced Features
-4. Security and Compliance
-5. Community and Documentation
-6. Extensibility and Optimization
-7. Nice-to-Haves
-
-## 1. Foundations
-1. Set up and manage GitHub repository for the project
-2. Implement server and agent applications for Windows 10+ and Debian Linux (systemd)
-3. Create MSI and .deb installers for Windows and Debian Linux releases
-4. Develop portable editions for Windows and Debian Linux
-5. Implement x64 and arm64 CPU architectures support
-6. Plan MacOS support for post-.NET 8 release
-7. Implement C# Kestrel webserver with external or built-in SQLite database
-8. Design database schema and data access layer for tasks and workflows
-9. Develop agent component for remote task execution
-10. Ensure MIT license adherence for all code
-
-## 2. Core Features
-1. Design task management UI: Task creation, scheduling, and execution
-2. Develop solo scheduled task configuration: Start dates, running time, and end times
-3. Implement workflow management UI: Task linking and DAG visualization
-4. Create DAG visualization UI for workflow and system views
-5. Implement ad-hoc task creation and execution interface
-6. Develop workflow editing mode: Task linking and input/output controls
-7. Design UI for task scheduling and triggers
-8. Implement scheduler component for various scheduling options
-9. Develop file watch task trigger: Poll and filesystem event
-10. Implement DateTime task trigger component
-11. Create interval/cyclical task trigger component
-12. Develop task completion state trigger mechanism
-13. Implement workflow completion state trigger mechanism
-14. Design library of system-defined tasks
-15. Create UI for building user-defined tasks
-16. Implement PowerShell script execution task support
-17. Develop PowerShell command execution task support
-18. Implement system shell command execution task support
-19. Design task output handling: PowerShell and system shell
-
-## 3. Advanced Features
-1. Implement branching, iteration, and exception handling options for workflows
-2. Develop performance monitoring and optimization features for tasks and workflows
-3. Implement task prioritization and resource allocation mechanisms
-4. Create UI for task progress tracking and monitoring
-5. Implement version control and change management for workflows and tasks
-6. Design and develop task retry and failure recovery mechanisms
-7. Implement workflow import/export and sharing functionality
-8. Develop task dependencies and precondition support
-9. Implement task tags and labels for improved organization and searchability
-10. Design and implement task templates for common use cases
-11. Create workflow templates for common use cases
-
-## 4. Security and Compliance
-1. Implement role-based access control and authentication
-2. Develop TOTP-based two-factor authentication system
-3. Implement TLS certificate generation and configuration
-4. Design data protection measures: Encryption, storage, and retention policies
-5. Ensure compliance with data protection regulations
-
-## 5. Community and Documentation
-1. Develop issue templates and processes for bug reports and feature requests
-2. Create documentation, tutorials, and troubleshooting resources
-3. Encourage and manage community contributions and collaboration
-4. Set up multiple repositories for focused contributions
-5. Develop user onboarding and interactive walkthroughs for new users
-6. Implement audit logs and activity tracking for tasks and workflows
-7. Create localization and internationalization support for the user interface
-
-## 6. Extensibility and Optimization
-1. Implement plugin system for extensibility
-2. Develop comprehensive library of built-in tasks
-3. Design and implement API for third-party integrations and extensions
-4. Implement support for distributed execution and load balancing across multiple agents
-5. Design and develop real-time notifications and alerts for critical task events
-6. Create UI for managing and monitoring agent health and status
-7. Develop RESTful API documentation for external integrations
-8. Implement task time estimation and time tracking features
-9. Develop system health and performance dashboards for administrators
-
-## 7. Nice-to-Haves
-1. Design user-friendly and accessible UI for various technical expertise levels
-2. Create UI for task execution history and analytics
-3. Implement support for task cloning and bulk editing
-4. Create UI for managing user profiles and access control settings
-5. Implement single sign-on (SSO) support for enterprise environments
-6. Develop integration with popular cloud storage providers for task data storage
-7. Design and develop mobile app for task monitoring and management on the go
-8. Implement user feedback mechanism and feature voting system
-9. Create UI for customizing look and feel of the application
-10. Implement support for accessibility features and assistive technologies
diff --git a/docs/articles/SecurityOverview.md b/docs/articles/SecurityOverview.md
new file mode 100644
index 0000000..c3dac8f
--- /dev/null
+++ b/docs/articles/SecurityOverview.md
@@ -0,0 +1,551 @@
+# Security Architecture
+
+This document describes the security architecture of the Werkr platform — the cryptographic primitives, authentication and authorization schemes, secret storage strategy, data protection controls, and agent-side security boundaries. For vulnerability reporting procedures, see [SECURITY.md](../SECURITY.md). For the overall system topology, see [Architecture](../Architecture.md).
+
+---
+
+## Cryptographic Primitives
+
+All cryptographic operations are implemented in `EncryptionProvider` using the .NET `System.Security.Cryptography` APIs. No third-party cryptography libraries are used.
+
+### Asymmetric Encryption (RSA)
+
+| Parameter | Value |
+|-----------|-------|
+| Key size | 4096-bit (minimum enforced: 2048) |
+| Encryption padding | OAEP with SHA-512 |
+| Signature padding | PKCS#1 v1.5 with SHA-512 |
+
+RSA keys are generated per component during registration. The public key is serialized as a JSON object containing `Modulus` and `Exponent` fields via `System.Text.Json`.
+
+### Symmetric Encryption (AES-256-GCM)
+
+| Parameter | Value |
+|-----------|-------|
+| Key length | 32 bytes (256-bit) |
+| Nonce length | 12 bytes |
+| Authentication tag | 16 bytes |
+
+AES-256-GCM provides authenticated encryption in a single pass: the cipher keeps the payload confidential, and the authentication tag guarantees its integrity.
+
+### Hybrid Encryption
+
+Hybrid encryption combines both schemes for encrypting arbitrary-length data to a public key:
+
+1. Generate a random 32-byte AES key.
+2. Encrypt the plaintext with AES-256-GCM.
+3. Encrypt the AES key with the recipient's RSA public key (OAEP-SHA512).
+
+**Wire format:**
+
+```
+┌────────────────────┬──────────┬──────────┬────────────┐
+│ RSA-encrypted key │ Nonce │ Tag │ Ciphertext │
+│ (512 bytes) │ (12 B) │ (16 B) │ (var) │
+└────────────────────┴──────────┴──────────┴────────────┘
+```
+
+Decryption reverses the process: RSA-decrypt the first 512 bytes to recover the AES key, then AES-GCM-decrypt the remainder.
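
As an illustrative sketch (Python here, not the shipped C# `EncryptionProvider`), the fixed offsets of the wire format above can be expressed as simple byte slicing. The RSA and AES-GCM operations themselves are omitted; only the field layout is shown.

```python
# Field sizes from the wire format above.
RSA_KEY_BLOCK = 512   # 4096-bit RSA output
NONCE_LEN = 12
TAG_LEN = 16

def split_hybrid_blob(blob: bytes) -> dict:
    """Slice a hybrid-encrypted blob into its four fields."""
    if len(blob) < RSA_KEY_BLOCK + NONCE_LEN + TAG_LEN:
        raise ValueError("blob too short for hybrid wire format")
    encrypted_key = blob[:RSA_KEY_BLOCK]
    nonce = blob[RSA_KEY_BLOCK:RSA_KEY_BLOCK + NONCE_LEN]
    tag = blob[RSA_KEY_BLOCK + NONCE_LEN:RSA_KEY_BLOCK + NONCE_LEN + TAG_LEN]
    ciphertext = blob[RSA_KEY_BLOCK + NONCE_LEN + TAG_LEN:]
    return {"encrypted_key": encrypted_key, "nonce": nonce,
            "tag": tag, "ciphertext": ciphertext}
```

A decryptor would RSA-decrypt `encrypted_key` to recover the AES key, then AES-GCM-decrypt `ciphertext` with `nonce` and `tag`.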
+
+### Password-Based Encryption
+
+Used during registration bundle exchange, where no public key has been exchanged yet:
+
+1. Derive a 32-byte key from the password by taking the first 32 bytes of its SHA-512 hash.
+2. Encrypt with AES-256-GCM.
+
+**Wire format:**
+
+```
+┌──────────┬──────────┬────────────┐
+│ Nonce │ Tag │ Ciphertext │
+│ (12 B) │ (16 B) │ (var) │
+└──────────┴──────────┴────────────┘
+```
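
The key-derivation step can be sketched as follows (Python for illustration; the real implementation lives in the C# `EncryptionProvider`). Note that this scheme is a plain hash truncation as described above, not a memory-hard KDF.

```python
import hashlib

def derive_bundle_key(password: str) -> bytes:
    """Derive the AES-256 key for a registration bundle:
    the first 32 bytes of SHA-512(password), per the scheme above."""
    return hashlib.sha512(password.encode("utf-8")).digest()[:32]
```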
+
+### Hashing
+
+| Purpose | Algorithm |
+|---------|-----------|
+| Data integrity / general hashing | SHA-512 |
+| Key fingerprints | SHA-512 |
+| API key storage | SHA-512 |
+| Token comparison | Constant-time via `CryptographicOperations.FixedTimeEquals` |
+
+---
+
+## Agent Registration
+
+Registration establishes a trust relationship between the API and a new Agent. The flow uses a password-protected bundle exchanged out-of-band so that no unencrypted secrets traverse the network.
+
+### Bundle Creation (API Side)
+
+1. The admin triggers registration in the Server UI, which calls the API.
+2. The API generates:
+ - A 16-byte random correlation token (`BundleId`).
+ - The API's RSA public key bytes.
+ - A `RegistrationBundlePayload` containing the `BundleId`, `ConnectionName`, `ServerUrl`, and `ServerPublicKeyBytes`.
+3. The payload is encrypted with the admin-supplied password (password-based AES-256-GCM) and Base64-encoded.
+4. The resulting bundle string is displayed to the admin for transfer.
+5. The bundle record is stored with a `Pending` status and a configurable expiration window (up to 24 hours).
+
+### Bundle Processing (Agent Side)
+
+1. The admin pastes the bundle string and password into the Agent's localhost-only `/register` endpoint.
+2. The Agent decrypts the bundle using the password, recovering the `BundleId`, `ServerUrl`, and the Server's RSA public key.
+3. The Agent generates its own RSA 4096-bit keypair.
+4. The Agent hybrid-encrypts its own public key with the Server's public key (from the bundle).
+5. The Agent calls the API's `RegisterAgent` gRPC endpoint. All registration fields (agent URL, name, bundle ID, public key) are protected in a single encrypted envelope. A non-secret hash-based lookup prevents leaking registration data during the correlation step.
+6. The API looks up the bundle by `BundleId`, verifies it is `Pending` and not expired, then hybrid-decrypts the Agent's public key using the Server's private key (stored with the bundle record).
+7. The API generates:
+ - Two 64-byte random API keys (Agent-to-API and API-to-Agent), encoded as 128-character hex strings.
+ - A 32-byte `SharedKey` for envelope encryption.
+8. The API stores the Agent-to-API key as a SHA-512 hash (never plaintext) and the API-to-Agent key in plaintext for outbound use.
+9. The API returns a `RegistrationResponsePayload` (containing both API keys, the `SharedKey`, and a `ConnectionId`), hybrid-encrypted to the Agent's public key.
+10. The Agent hybrid-decrypts the response with its own private key and persists the connection record to its local database.
+
+### Bundle Expiration
+
+A background service (`BundleExpirationService`) runs on a 1-hour interval, transitioning any `Pending` bundles past their `ExpiresAt` timestamp to `Expired` status.
+
+### Platform Validation
+
+On startup, the Agent performs a fail-fast RSA OAEP-SHA512 round-trip test to verify that the platform's cryptographic provider supports the required algorithms.
+
+### Agent Management
+
+- Each agent receives a unique, system-generated tag (`agent:{agent-id}`) at registration time. This tag is non-editable and non-deletable, enabling precise single-agent targeting.
+- Agent heartbeat interval: 30 seconds (configurable). An agent is considered offline after 3 consecutive missed heartbeats (90 seconds, configurable).
+- Agents report their capabilities (supported task types, installed action handlers, OS platform, architecture, agent version) during registration and via periodic heartbeat. The API validates capabilities before dispatching work.
+- Agents below the minimum compatible version are rejected at registration with a descriptive error.
+- Deregistration revokes the agent's keys and cleans up references. All registration and deregistration events are audit-logged.
+
+---
+
+## Encrypted Envelope (gRPC Payload Encryption)
+
+After registration, every gRPC payload between the API and Agent is wrapped in an `EncryptedEnvelope` protobuf message. This provides application-layer encryption on top of TLS. The envelope supports arbitrary inner payload types, enabling new gRPC services to use the same encryption without modifying the envelope contract.
+
+### Envelope Structure
+
+```protobuf
+message EncryptedEnvelope {
+ bytes ciphertext = 1; // AES-256-GCM encrypted payload
+ bytes iv = 2; // 12-byte nonce
+ bytes auth_tag = 3; // 16-byte GCM authentication tag
+ string key_id = 4; // Identifies which shared key was used
+}
+```
+
+### Encrypt / Decrypt Flow
+
+**Encrypt:** Serialize the inner protobuf message to bytes, encrypt with AES-256-GCM using the shared key, and populate the envelope fields.
+
+**Decrypt:** Read the `key_id` to select the correct shared key, AES-GCM-decrypt the `ciphertext` using the `iv` and `auth_tag`, then deserialize the inner protobuf.
+
+### Key Rotation
+
+The API can rotate the shared key via the `RotateSharedKey` RPC. During the configurable grace period (default: 5 minutes), both the current and previous keys are valid to avoid disrupting in-flight messages:
+
+1. The decryptor checks the envelope's `key_id` against the **current** key first.
+2. If the `key_id` does not match, it falls back to the **previous** key.
+3. If neither matches, it attempts decryption with the current key (handles envelopes sent before key IDs were introduced).
+
+After the grace period, the previous key is invalidated. Key rotation events are audit-logged.
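
The key-selection fallback described above can be sketched like this (illustrative Python, with hypothetical names; `current` and `previous` are `(key_id, key)` pairs and `previous` is `None` outside the grace period):

```python
def candidate_keys(envelope_key_id, current, previous):
    """Return the decryption keys to try, in the documented order."""
    if envelope_key_id == current[0]:
        return [current[1]]
    if previous is not None and envelope_key_id == previous[0]:
        return [previous[1]]
    # No key ID match: fall back to the current key, which handles
    # envelopes sent before key IDs were introduced.
    return [current[1]]
```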
+
+### Key Rotation Failure Modes
+
+- **Unreachable agent** — if an agent is unreachable during key rotation, the API retains the current key and retries rotation on the next successful heartbeat.
+- **Expired key** — if an agent presents an expired key after the grace period, the API rejects the request and the agent must re-register.
+- **Envelope version mismatch** — if an agent uses an older envelope format, the request is rejected with a descriptive error; the agent logs the failure and attempts reconnection with the current envelope version.
+- All key rotation failures are audit-logged.
+
+---
+
+## Secret Storage
+
+Each platform uses its native credential storage to protect sensitive material (database passphrases, keys).
+
+| Platform | Backend | Storage Location |
+|----------|---------|-----------------|
+| Windows | DPAPI (CurrentUser scope) | `%LOCALAPPDATA%\Werkr\secrets\{key}.bin` |
+| Linux | Protected file (owner-only read, mode 0600) | `/etc/werkr/keys/` |
+| macOS | Keychain (`security` CLI) | Service name: `Werkr` |
+
+`SecretStoreFactory` selects the correct implementation at runtime based on `RuntimeInformation.IsOSPlatform`.
+
+### Agent Database Passphrase
+
+The Agent's local database defaults to an encrypted SQLite database (PostgreSQL is also supported). The 32-byte hex passphrase is generated on first run and stored in the platform secret store under the key `werkr-agent-db`.
+
+---
+
+## Database Encryption at Rest
+
+Transparent column-level AES-256-GCM encryption protects sensitive data stored in the application database:
+
+- **Encrypted fields** — credentials, workflow variable values, connection strings, and API key hashes.
+- **Key management** — platform-appropriate key storage (DPAPI on Windows, Keychain on macOS, protected file on Linux), consistent with the secret storage model described above.
+- **Key rotation** — zero-downtime re-encryption: a new key is introduced, data is re-encrypted in background batches, and the old key is retired after all records are migrated.
+- **Migration tool** — a separate tool is provided for encrypting existing unencrypted data on upgrade from pre-encryption versions.
+
+---
+
+## Authentication
+
+Werkr uses multiple authentication schemes depending on the caller and context.
+
+### JWT Bearer Tokens
+
+| Parameter | Value |
+|-----------|-------|
+| Algorithm | HMAC-SHA256 |
+| Minimum signing key length | 32 characters |
+| Default token lifetime | 15 minutes (configurable) |
+| Clock skew tolerance | 1 minute |
+| Issuer | `werkr-api` |
+| Audience | `werkr` |
+
+Token claims include: `NameIdentifier`, `Role`, `Jti` (unique token ID), `ApiKeyId`, `ApiKeyName`, and one claim per granted permission.
+
+JWT validation is configured centrally in `JwtValidationConfigurator` and shared by all components that need to validate tokens. JWTs are used for browser-session-originated requests forwarded by the Server. Werkr.Server manages token renewal for browser sessions via sliding expiration.
+
+### Cookie Authentication
+
+Used for interactive browser sessions on the Server.
+
+| Parameter | Value |
+|-----------|-------|
+| Sliding expiration | 30 minutes |
+| Cookie flags | `HttpOnly`, `SameSite=Strict`, `SecurePolicy=SameAsRequest` |
+
+The cookie authentication handler enforces additional checks:
+
+- Rejects requests from disabled user accounts.
+- Redirects users who must change their password.
+- Redirects users who have not completed 2FA enrollment when required.
+
+### Passkey Support (WebAuthn/FIDO2)
+
+WebAuthn/FIDO2 passkeys are supported as both a primary authentication method (passwordless) and as an optional second-factor method:
+
+- Users can register one or more passkeys alongside or instead of TOTP.
+- Passkey authentication satisfies the platform's 2FA requirement when used as the primary method (the passkey itself provides multi-factor assurance via the authenticator's user verification).
+- Passkey registration, authentication, and removal events are audit-logged.
+
+### Login Rate Limiting
+
+Per-IP rate limits are enforced on authentication endpoints to mitigate credential stuffing and brute-force attacks. This operates independently of per-account lockout (see Password Policy below) — both mechanisms are evaluated, and the most restrictive applies.
+
+---
+
+## API Keys
+
+API keys provide non-interactive, programmatic access for CI/CD pipelines, external integrations, and automation.
+
+### Key Format and Storage
+
+| Parameter | Detail |
+|-----------|--------|
+| Format | `wk_` prefix + 32 random bytes (base64url-encoded) |
+| Storage | SHA-512 hash of the key value (plaintext never stored) |
+| Display | Keys are displayed once at creation and cannot be retrieved afterward |
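
A sketch of the documented format (Python for illustration; whether base64url padding is stripped is an assumption, not confirmed by the source):

```python
import base64
import secrets

def generate_api_key() -> str:
    """'wk_' prefix + 32 cryptographically random bytes, base64url-encoded.
    Padding is stripped here as an assumption about the exact encoding."""
    raw = secrets.token_bytes(32)
    return "wk_" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
```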
+
+### Key Lifecycle
+
+- **Create** — via the UI and REST API. At creation, the user selects which of their permissions the key carries; all permissions are selected by default.
+- **Revoke** — immediate invalidation of a key.
+- **Rotate** — create a new key and revoke the old one in a single operation.
+- **Expiration** — configurable expiration dates. `LastUsedUtc` timestamp tracking.
+
+### Permission Scoping
+
+- Keys cannot exceed the creator's current permissions at creation time.
+- If the creator's permissions are subsequently **reduced** (role demotion or permission removal), all active API keys for that user are fully revoked. The user must create new API keys after their permissions change.
+- Permission **additions** to the creator's role do not retroactively expand existing keys or require key recreation.
+
+### Rate Limiting
+
+- Per-key rate limits are configurable.
+- There are no concurrency limits on simultaneous use of the same API key from multiple clients.
+- API trigger rate limits apply independently of API key rate limits; both limits are evaluated and the most restrictive applies.
+
+### Audit Logging
+
+All key creation, revocation, rotation, and usage events are recorded in the audit log.
+
+---
+
+## Auth Forwarding & Service Identity
+
+### User-Scoped API Forwarding
+
+API calls originating from the UI carry the authenticated user's identity, role, and permissions. UI actions are authorized at the user's permission level, not an elevated service account. This ensures that a user cannot perform actions through the UI that exceed their granted permissions. Background server-initiated operations (health monitoring) use a separate administrative channel.
+
+### System Service Identity
+
+Trigger-initiated workflow execution (schedule, file monitor, workflow completion, API triggers) uses a system service identity. Trigger configuration requires elevated permissions, which gates what workflows can be auto-triggered. The system service identity is distinct from any user account and is used solely for automated operations.
+
+---
+
+## Authorization (RBAC)
+
+Authorization is permission-based, enforced via ASP.NET policy authorization. Every API endpoint and UI page is protected by permission-based policies rather than fixed role checks.
+
+### Permission Model
+
+Permissions use a hierarchical `resource:action` naming convention (e.g., `workflows:execute`, `agents:manage`, `settings:write`, `views:create`, `views:share`). Permissions are organized under their owning domain namespace. All registered permissions appear in the role management UI.
+
+Permissions are registered at application startup. The permission model supports additive evolution — new permissions can be introduced without modifying existing permission definitions.
+
+### Custom Roles
+
+Administrators create custom roles and assign fine-grained permissions. The role management UI provides a matrix interface for permission assignment and user-to-role mapping.
+
+### Built-in Roles
+
+Three non-deletable default roles ship with predefined permission sets:
+
+| Role | Permissions |
+|------|-------------|
+| **Admin** | All permissions |
+| **Operator** | Create, read, update, execute operations |
+| **Viewer** | Read-only access |
+
+### Per-Workflow Execution Permissions
+
+Roles may be granted or denied execution permission on specific workflows, providing granular control over who can trigger which automations.
+
+### Policies
+
+Policy-based authorization is enforced on both REST API endpoints and Blazor pages via `[Authorize]` attributes. Policies reference the hierarchical permission model rather than fixed role names.
+
+### Default Admin Account
+
+On first startup, the identity seeder creates a default admin account:
+
+| Property | Value |
+|----------|-------|
+| Email | `admin@werkr.local` |
+| Password | Random 24 characters (guaranteed uppercase, lowercase, digit, symbol; Fisher-Yates shuffle) |
+| `ChangePassword` | `true` (forced change on first login) |
+| `Requires2FA` | `true` (TOTP enrollment required) |
+
+The generated password is logged once at startup and never persisted.
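
The generation scheme can be sketched as follows (illustrative Python; the symbol set shown is an assumption, not the product's actual character set):

```python
import secrets
import string

def generate_admin_password(length: int = 24) -> str:
    """One character from each required class, remainder random,
    then a Fisher-Yates shuffle so class positions are unpredictable."""
    classes = [string.ascii_uppercase, string.ascii_lowercase,
               string.digits, "!@#$%^&*"]  # symbol set is an assumption
    chars = [secrets.choice(c) for c in classes]  # guarantee each class
    alphabet = "".join(classes)
    chars += [secrets.choice(alphabet) for _ in range(length - len(chars))]
    for i in range(len(chars) - 1, 0, -1):  # Fisher-Yates shuffle
        j = secrets.randbelow(i + 1)
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)
```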
+
+---
+
+## User Management
+
+### User Lifecycle
+
+- **Invitation** — administrators create user accounts with initial role assignments.
+- **Deactivation** — suspend a user without deleting their account or audit history. Deactivated users cannot authenticate.
+- **Password reset** — self-service forgot-password flow via email.
+
+### Session Management
+
+- Administrators can view and revoke active user sessions.
+- Revoked sessions are invalidated immediately; the affected user is required to re-authenticate.
+- Default maximum session count per user: 5. When exceeded, the oldest session is automatically revoked.
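
The cap behavior can be sketched as (illustrative Python; names are hypothetical):

```python
def enforce_session_cap(sessions, cap=5):
    """sessions is ordered oldest-first. Revokes (removes and returns)
    the oldest sessions until the user is back under the cap."""
    revoked = []
    while len(sessions) > cap:
        revoked.append(sessions.pop(0))
    return revoked
```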
+
+### Audit Logging
+
+User lifecycle events (creation, deactivation, role changes) and session events (login, logout, revocation) are audit-logged.
+
+---
+
+## Password Policy
+
+Aligned with NIST SP 800-63B §5.1.1.2:
+
+- **Minimum length** — 12 characters.
+- **No character-class complexity requirements** — no mandatory uppercase, lowercase, digit, or symbol rules.
+- **Password history** — enforcement of 5 previous passwords (configurable). Users cannot reuse recent passwords.
+- **Account lockout** — 15 minutes after 5 failed attempts.
+
+---
+
+## Two-Factor Authentication
+
+### TOTP (Time-Based One-Time Passwords)
+
+Werkr supports TOTP for user accounts. MFA enrollment is enforced for the default admin account and can be required for any user or role by an administrator.
+
+- **Enrollment** — users scan a QR code or enter the shared secret manually in an authenticator app.
+- **Verification** — a 6-digit TOTP code is required at login when 2FA is enabled.
+- **Recovery codes** — 10 single-use recovery codes are generated during enrollment for account recovery. Users may regenerate codes at any time, which invalidates all previous codes.
+- **Admin enforcement** — administrators can require 2FA enrollment for all users or specific roles. The cookie handler redirects unenrolled users to the setup page.
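
For reference, the standard 6-digit TOTP computation (RFC 6238 with SHA-1 and a 30-second step — the generic algorithm, not Werkr-specific code) looks like this:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, for_time=None, step=30, digits=6):
    """RFC 6238 TOTP: HMAC-SHA1 over the time-step counter,
    dynamic truncation, then the last `digits` decimal digits."""
    t = time.time() if for_time is None else for_time
    counter = int(t // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % (10 ** digits)).zfill(digits)
```

With the RFC 6238 test secret `12345678901234567890`, the code at `T=59` is `287082`.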
+
+### Passkeys
+
+WebAuthn/FIDO2 passkeys serve as an alternative or complement to TOTP. See the Authentication section above for details.
+
+---
+
+## gRPC Agent Authentication
+
+Agent-to-API and API-to-Agent gRPC calls are authenticated using the API keys established during registration.
+
+### Verification Flow
+
+1. The caller attaches the raw API key as a bearer token and its `ConnectionId` as the `x-werkr-connection-id` metadata header.
+2. The interceptor (`AgentBearerTokenInterceptor` on the API, `BearerTokenInterceptor` on the Agent) looks up the `RegisteredConnection` by `ConnectionId`.
+3. The interceptor computes the SHA-512 hash of the presented token and compares it to the stored hash using `CryptographicOperations.FixedTimeEquals` (constant-time comparison to prevent timing attacks).
+4. On success, the `RegisteredConnection` is stored in `context.UserState` for downstream service methods.
+5. `LastSeen` is updated with a 60-second debounce threshold to avoid database writes on every call.
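
The hash-and-compare step (step 3) can be sketched in Python, where `hmac.compare_digest` plays the role of `CryptographicOperations.FixedTimeEquals`:

```python
import hashlib
import hmac

def verify_api_key(presented: str, stored_hash: bytes) -> bool:
    """SHA-512 the presented token, then compare to the stored hash
    in constant time to prevent timing attacks."""
    digest = hashlib.sha512(presented.encode("utf-8")).digest()
    return hmac.compare_digest(digest, stored_hash)
```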
+
+### Distinction Between Sides
+
+An `IsServer` flag on the `RegisteredConnection` record distinguishes the API-held record (where `IsServer = true` and the Agent-to-API key hash is stored) from the Agent-held record (where `IsServer = false` and the API-to-Agent key hash is stored).
+
+### Agent Offline Mid-Job
+
+When an agent becomes unreachable mid-job, the API considers the agent's in-flight jobs as still running. Jobs transition to failed when the first of the following thresholds is exceeded:
+
+1. Task maximum run duration.
+2. Agent heartbeat timeout (3 consecutive missed heartbeats).
+3. Workflow-level timeout.
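
The "earliest threshold wins" rule can be sketched as (illustrative Python; thresholds given in seconds from job start, `None` meaning unconfigured):

```python
def failure_deadline(job_start, task_max_runtime, heartbeat_timeout,
                     workflow_timeout):
    """Time at which an in-flight job on an unreachable agent
    transitions to failed: the earliest configured threshold."""
    limits = [t for t in (task_max_runtime, heartbeat_timeout,
                          workflow_timeout) if t is not None]
    return job_start + min(limits)
```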
+
+---
+
+## Path Allowlisting (Agent)
+
+The Agent validates all file and directory paths against a configurable allowlist before executing any file-system operation.
+
+### Configuration
+
+Path allowlists are configured per-agent through the agent settings UI. Each agent's allowlist defines which filesystem paths the agent is permitted to access during task execution. The default posture is **deny-all** — agents with an empty allowlist cannot access any filesystem paths.
+
+The allowlist supports standard glob patterns: `*` matches any sequence of characters, and `?` matches a single character.
+
+Allowlist changes are audit-logged and distributed to the agent via the encrypted gRPC configuration synchronization channel.
+
+### Validation Rules
+
+1. **Path normalization:** `Path.GetFullPath` resolves relative segments, followed by platform-specific steps:
+ - **Windows:** 8.3 short-path expansion via `GetLongPathNameW` (P/Invoke), then symlink resolution and separator normalization.
+ - **All platforms:** Reject paths containing `..` traversal after normalization.
+2. **Dangerous path rejection (Windows):** Paths starting with `\\?\`, `\\.\`, or UNC `\\` prefixes are rejected, as are paths containing NTFS Alternate Data Streams (`:` after root).
+3. **Prefix matching:** The normalized path must start with at least one entry in the configured paths list. Comparison is ordinal-ignore-case on Windows, ordinal on Linux/macOS.
+4. **Glob resolution:** When file patterns (wildcards) are used, each resolved file is validated individually. Symlinks that resolve outside the allowed paths are rejected (prevents symlink-through-glob attacks). Source and destination paths are checked to be distinct.
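
Rules 1 and 3 can be sketched as follows (illustrative Python; the Windows-specific short-path, symlink, and dangerous-prefix handling is omitted):

```python
import os

def is_path_allowed(candidate, allowlist, case_insensitive=False):
    """Normalize the path, reject residual traversal, then require a
    prefix match against at least one allowlist entry."""
    full = os.path.normpath(os.path.abspath(candidate))
    if ".." in full.split(os.sep):  # traversal survived normalization
        return False
    probe = full.lower() if case_insensitive else full
    for allowed in allowlist:
        base = os.path.normpath(os.path.abspath(allowed))
        if case_insensitive:
            base = base.lower()
        if probe == base or probe.startswith(base.rstrip(os.sep) + os.sep):
            return True
    return False
```

With an empty allowlist every path is rejected, matching the deny-all default posture described above.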
+
+---
+
+## Outbound Request Controls
+
+The HTTP Request, Send Webhook, and File Download/Upload action handlers validate target URLs against configurable security controls.
+
+### URL Allowlisting
+
+Requests to URLs not on the configured allowlist are rejected. The allowlist is configured at the platform level.
+
+### Private Network Protection
+
+Requests to private and internal IP ranges (RFC 1918, link-local, loopback) are blocked by default. An explicit override is required to permit internal network targets. This prevents server-side request forgery (SSRF) attacks against internal infrastructure.
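
The blocked-by-default check can be sketched with the standard library (illustrative Python; the real control also honors the explicit override described above):

```python
import ipaddress

def is_private_target(ip: str) -> bool:
    """True when the resolved address falls in a blocked-by-default
    range: private (RFC 1918 and similar), link-local, or loopback."""
    addr = ipaddress.ip_address(ip)
    return addr.is_private or addr.is_link_local or addr.is_loopback
```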
+
+### DNS Rebinding Protection
+
+Resolved IP addresses are validated against the allowlist **after** DNS resolution. This prevents DNS rebinding attacks where a domain initially resolves to an allowed IP and then re-resolves to a private address during the request.
+
+---
+
+## File Monitoring Security
+
+File monitor triggers (persistent triggers that watch directories for file events) enforce the following security controls:
+
+- **Path validation** — monitored paths must fall within the agent's configured path allowlist.
+- **Canonical path resolution** — prevents symbolic link and directory traversal attacks.
+- **Debounce** — configurable debounce window (default: 500 ms) prevents trigger flooding from rapid file system events.
+- **Circuit breaker** — excessive trigger rates trip a circuit breaker to prevent resource exhaustion.
+- **Watch limit** — configurable maximum watch count per agent (default: 50) prevents resource exhaustion from excessive file monitors.
+- **Elevated permissions** — trigger configuration requires elevated permissions. All trigger configuration changes are audit-logged.
+
+---
+
+## API Trigger Security
+
+API triggers (REST endpoints that initiate workflow runs) enforce the following security controls:
+
+- **Authentication** — API triggers require authentication via API key or bearer token. The target workflow ID is supplied in the request body or as a URL parameter.
+- **Rate limiting** — configurable per-workflow rate limits apply independently of API key rate limits. Both limits are evaluated and the most restrictive applies. Rate-limited callers receive an HTTP 429 response with a `Retry-After` header indicating when the next request will be accepted.
+- **Request validation** — optional JSON schema validation for trigger payloads.
+- **Payload injection** — validated trigger payloads are injected as workflow input variables.
+- **Cycle detection** — the trigger registry detects circular workflow-completion chains at configuration time and surfaces a prominent warning in the workflow list and workflow editor UI. Circular chains are not blocked (users may intentionally create cyclical workflows). Workflow-completion trigger chains have a configurable maximum chain depth (default: 5). Each trigger-initiated run carries a chain depth counter; when max depth is reached, the trigger is suppressed with an audit log entry. Manual triggers reset the counter to 0.
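
The chain-depth accounting can be sketched as (illustrative Python; names are hypothetical):

```python
def next_chain_depth(parent_depth, manual):
    """Depth carried by a new run: manual runs reset the counter to 0;
    trigger-initiated runs inherit the parent's depth plus one."""
    return 0 if manual or parent_depth is None else parent_depth + 1

def completion_trigger_fires(run_depth, max_depth=5):
    """Workflow-completion triggers are suppressed (and audit-logged in
    the real system) once max chain depth is reached."""
    return run_depth < max_depth
```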
+
+---
+
+## Transport Security
+
+| Concern | Configuration |
+|---------|--------------|
+| HTTP | Kestrel configured for `Http1AndHttp2`; HTTPS redirect and HSTS enabled in production |
+| gRPC | HTTP/2 over TLS; ALPN negotiation selects the protocol automatically |
+| Agent keepalive | gRPC ping interval: 30-second delay, 10-second timeout |
+| TLS errors | Mapped to `CommandDispatchFailure.TlsError` for structured error handling |
+
+### TLS Enforcement
+
+All connections (browser to Server, Server to API, API to Agent) require HTTPS/TLS. URL scheme validation is enforced at registration, notification channel creation, and gRPC channel construction. HTTP URLs are explicitly rejected.
+
+### Data Protection
+
+ASP.NET Data Protection keys are scoped to the application name `Werkr` and persisted to a `keys/` directory on disk via `PersistKeysToFileSystem`.
+
+---
+
+## Content Security Policy
+
+The Blazor Server UI enforces Content Security Policy (CSP) headers with directives appropriate for:
+
+- Blazor Server rendering (inline scripts and styles required by the framework).
+- SignalR WebSocket connections for real-time updates.
+- JavaScript interop for the DAG editor graph library.
+
+CSP directives are configured to balance security (preventing XSS and data injection) with the functional requirements of the Blazor Server architecture.
+
+---
+
+## Sensitive Data Redaction
+
+Configurable mechanisms prevent sensitive data from appearing in execution logs, output previews, and real-time log streaming.
+
+### Variable-Level Redaction
+
+Workflow variables can be flagged as "redact from logs." Variables with this flag have their resolved values automatically replaced with `[REDACTED]` in all execution output. Variable-level redaction is applied first, before regex-based pattern matching.
+
+### Regex-Based Redaction
+
+Configurable regex patterns automatically mask sensitive data (passwords, tokens, connection strings, API keys) in execution log output:
+
+- Default redaction patterns ship with the platform. Administrators can add custom patterns.
+- Custom regex patterns are validated at save time. Patterns that fail compilation or exceed a complexity threshold (default: 1 second compilation time) are rejected.
+- Redacted values are replaced with a consistent `[REDACTED]` marker.
+
+### Redaction Order
+
+1. Variable-level redaction flags are applied first.
+2. Regex-based patterns are applied afterward to catch any remaining sensitive values not covered by explicit flags.
+
+Redaction applies to stored output, output previews, and real-time log streaming.
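
The two-pass ordering above can be sketched as a small function. The names here (`redactOutput`, `RedactionConfig`) are illustrative, not the real API; the actual implementation lives in the Server/Agent codebase:

```typescript
// Hypothetical sketch of the two-pass redaction order described above.
interface RedactionConfig {
  flaggedValues: string[]; // resolved values of variables flagged "redact from logs"
  patterns: RegExp[];      // administrator-configured regex patterns
}

const MARKER = "[REDACTED]";

function redactOutput(output: string, config: RedactionConfig): string {
  // Pass 1: variable-level flags are applied first.
  let result = output;
  for (const value of config.flaggedValues) {
    result = result.split(value).join(MARKER);
  }
  // Pass 2: regex patterns catch remaining values not covered by flags.
  for (const pattern of config.patterns) {
    result = result.replace(pattern, MARKER);
  }
  return result;
}
```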
+
+---
+
+## Variable Escaping
+
+Workflow variables are escaped or encoded appropriately for the receiving execution context before interpolation to prevent injection attacks:
+
+- **Shell commands** — variables are escaped according to the target shell's quoting rules (e.g., `cmd.exe` on Windows, `/bin/sh` on Linux/macOS).
+- **Action handler parameters** — values are encoded appropriately for the target context (e.g., file paths, HTTP headers).
+- **Discrete argument passing** — where the execution model supports it (PowerShell parameters, process arguments), variables are passed as discrete arguments rather than interpolated into command strings, eliminating injection risk entirely.
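
The difference between interpolation and discrete argument passing can be shown with a short sketch (Node.js APIs used purely for illustration; Werkr's actual escaping logic is platform-specific):

```typescript
// Hypothetical illustration of why discrete arguments eliminate injection:
// the value is never parsed by a shell, so metacharacters are inert.
import { spawnSync } from "node:child_process";

const userValue = "hello; rm -rf /tmp/x"; // hostile-looking variable value

// Unsafe: interpolated into a command string, the shell treats ';' as a
// command separator and would execute two commands:
//   spawnSync("sh", ["-c", `echo ${userValue}`]);

// Safe: passed as a discrete argument, the value arrives verbatim in argv.
const result = spawnSync("echo", [userValue], { encoding: "utf8" });
console.log(result.stdout.trim());
```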
+
+---
+
+## Compliance Alignment
+
+The security architecture aligns with:
+
+- **OWASP Top 10** — mitigations for injection, broken authentication, sensitive data exposure, XML external entities, broken access control, security misconfiguration, XSS, insecure deserialization, insufficient logging and monitoring, and SSRF.
+- **NIST SP 800-63B** — authentication guidelines including password policy (§5.1.1.2), multi-factor authentication, and session management.
+
+Specific compliance mapping is maintained in the security documentation.
diff --git a/docs/articles/Testing.md b/docs/articles/Testing.md
index 4e453c3..89b8331 100644
--- a/docs/articles/Testing.md
+++ b/docs/articles/Testing.md
@@ -1,11 +1,181 @@
-# Testing
-Testing will be performed via github actions using [github-hosted-runners](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners).
-
-Testing should be done automatically during the pull request process. All tests must pass prior to pull request approval.
-
-
-## Operating Systems:
-
-Windows is tested on Windows Server 2022.
-
-Linux is tested on Ubuntu latest.
\ No newline at end of file
+# Testing
+
+Werkr uses [MSTest](https://learn.microsoft.com/en-us/dotnet/core/testing/unit-testing-mstest-intro) with the `Microsoft.Testing.Platform` runner (configured in `global.json`). The TypeScript graph-ui uses [Vitest](https://vitest.dev/). Tests run automatically in CI via GitHub Actions and must all pass before a pull request can be merged.
+
+---
+
+## Prerequisites
+
+| Requirement | Details |
+|-------------|---------|
+| **.NET 10 SDK** | Required for all .NET test projects. Pinned in `global.json`. |
+| **Docker** | Required for `Werkr.Tests` integration tests — Testcontainers spins up PostgreSQL 17 Alpine. Docker Desktop or Docker Engine must be running. |
+| **Node.js 22+** | Required for graph-ui TypeScript tests. See the `engines` field in `src/Werkr.Server/graph-ui/package.json`. |
+| **npm** | Comes with Node.js. Used for `npm ci` (dependency install) and `npm test` (Vitest runner). |
+| **PowerShell 7+** | Required for `Werkr.Tests.Agent` tests that exercise the embedded PowerShell host. |
+
+---
+
+## Test Projects
+
+| Project | Scope | Key Patterns |
+|---------|-------|--------------|
+| `src/Test/Werkr.Tests/` | API integration tests. Covers schedules, workflows, actions, holiday calendars, and action dispatch end-to-end. | `AppHostFixture`, `WebApplicationFactory`, Testcontainers (PostgreSQL 17 Alpine) |
+| `src/Test/Werkr.Tests.Data/` | Unit tests for data layer — entity validation, EF Core query behavior, collection utilities, range types, cryptography, scheduling logic, workflow services, registration. | In-memory EF Core provider, no external dependencies |
+| `src/Test/Werkr.Tests.Server/` | Blazor component tests and Server integration tests — identity flows (seeding, JWT, cookies, API keys, permissions, user management), authorization, page rendering, action parameter editors. | bunit (`BunitContext` base class), in-memory identity stores |
+| `src/Test/Werkr.Tests.Agent/` | Agent tests — action handlers (27+ handlers), operator execution (PowerShell, shell), output streaming, scheduling, security (path allowlist, URL validation). | Test doubles (`SuccessHandler`, `FailHandler`, `SlowHandler`), mock gRPC contexts |
+| `src/Werkr.Server/graph-ui/` | TypeScript frontend tests — DAG changeset logic, cycle detection, draft storage, clipboard handling, timeline styles, timeline item construction. | Vitest, direct module imports (no browser DOM) |
+
+---
+
+## AppHostFixture Pattern
+
+The `Werkr.Tests` project uses a shared `AppHostFixture` (assembly-level setup/teardown via `[AssemblyInitialize]` / `[AssemblyCleanup]`) that:
+
+1. **Starts a disposable PostgreSQL container** via [Testcontainers](https://dotnet.testcontainers.org/) (`postgres:17-alpine`).
+2. **Creates an in-process API server** via `WebApplicationFactory`, replacing the database registrations with the Testcontainer's connection string via `ConfigureServices`.
+3. **Runs EF Core migrations** for both the application database (`WerkrDbContext`) and the identity database (`WerkrIdentityDbContext`).
+4. **Seeds identity roles and permissions** — replicates the minimal role/permission seed so permission-based auth resolves correctly in tests.
+5. **Generates an authenticated HTTP client** with a JWT admin token for making authorized API calls.
+
+All integration test classes in `Werkr.Tests` use `AppHostFixture.ApiClient` and `AppHostFixture.JsonOptions` to interact with the API.
+
+Test parallelization is disabled (`[assembly: DoNotParallelize]` in `AssemblyAttributes.cs`) because all tests share a single Testcontainer database instance.
+
+See `src/Test/Werkr.Tests/AppHostFixture.cs` for the implementation.
+
+---
+
+## Blazor Component Testing (bunit)
+
+`Werkr.Tests.Server` uses [bunit](https://bunit.dev/) for Blazor component testing. Test classes extend `BunitContext` (from bunit for MSTest), rendering components in isolation with mock services registered in the test context.
+
+Current bunit test classes:
+- `ActionParameterEditorTests`
+- `ConditionBuilderTests`
+- `IntArrayEditorTests`
+- `KeyValueMapEditorTests`
+- `ObjectArrayEditorTests`
+- `StringArrayEditorTests`
+- `TaskSetupModalTests`
+
+The same project also contains non-bunit tests for identity services, authorization, and page-level logic that use standard MSTest patterns without bunit rendering.
+
+---
+
+## Graph-UI TypeScript Tests (Vitest)
+
+The graph-ui TypeScript frontend has its own test suite using Vitest 3.x.
+
+- **Location:** `src/Werkr.Server/graph-ui/`
+- **Test files:** `test/` directory, pattern `test/**/*.test.ts`
+- **Configuration:** `vitest.config.ts` at the graph-ui root
+- **Coverage:** V8 provider with 90% line threshold on `changeset.ts`, `cycle-detection.ts`, `draft-storage.ts`
+- **Dependencies:** `@antv/x6` (DAG rendering), `dagre` (graph layout), `vis-data` + `vis-timeline` (timeline/Gantt rendering)
+
+Current test suites:
+- `test/dag/changeset.test.ts`
+- `test/dag/clipboard-handler.test.ts`
+- `test/dag/cycle-detection.test.ts`
+- `test/dag/draft-storage.test.ts`
+- `test/smoke.test.ts`
+- `test/timeline/timeline-items.test.ts`
+- `test/timeline/timeline-styles.test.ts`
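
The cycle-detection logic exercised by `test/dag/cycle-detection.test.ts` amounts to a depth-first search over the workflow's edges. A minimal sketch of the idea (the function and type names here are illustrative, not the actual module API):

```typescript
// DFS-based cycle check over an adjacency list. A node seen twice on the
// current DFS path means a back edge exists, i.e. the graph is not a DAG.
type Graph = Map<string, string[]>;

function hasCycle(graph: Graph): boolean {
  const visiting = new Set<string>(); // nodes on the current DFS path
  const done = new Set<string>();     // nodes fully explored

  function visit(node: string): boolean {
    if (visiting.has(node)) return true; // back edge: cycle found
    if (done.has(node)) return false;    // already proven cycle-free
    visiting.add(node);
    for (const next of graph.get(node) ?? []) {
      if (visit(next)) return true;
    }
    visiting.delete(node);
    done.add(node);
    return false;
  }

  return [...graph.keys()].some((node) => visit(node));
}
```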
+
+### Running graph-ui tests locally
+
+```shell
+cd src/Werkr.Server/graph-ui
+npm ci # Install dependencies (first time or after package-lock changes)
+npm test # Run tests once (CI mode)
+npx vitest # Run in watch mode (development)
+```
+
+Bundle size checks run in CI via `scripts/check-bundle-size.mjs` after the production build.
+
+---
+
+## Running Tests Locally
+
+### All .NET tests
+
+```shell
+dotnet test Werkr.slnx
+```
+
+> **Prerequisite:** Docker must be running for `Werkr.Tests` integration tests (Testcontainers).
+
+### Specific .NET test project
+
+```shell
+dotnet test --project src/Test/Werkr.Tests/Werkr.Tests.csproj
+```
+
+### Graph-UI tests
+
+```shell
+npm test --prefix src/Werkr.Server/graph-ui
+```
+
+> **Prerequisite:** Run `npm ci --prefix src/Werkr.Server/graph-ui` first to install dependencies.
+
+### Graph-UI watch mode
+
+```shell
+cd src/Werkr.Server/graph-ui && npx vitest
+```
+
+### VS Code Tasks
+
+VS Code tasks are available in `.vscode/tasks.json`:
+
+| Task Label | Test Project |
+|------------|-------------|
+| `verify:test-unit` | `Werkr.Tests.Data` (data layer unit tests) |
+| `verify:test-integration` | `Werkr.Tests.Server` (Server integration tests) |
+| `verify:test-server` | `Werkr.Tests.Server` (Server tests) |
+| `verify:test-api` | `Werkr.Tests` (API integration tests, requires Docker) |
+| `verify:test-e2e` | `Werkr.Tests.Agent` (Agent e2e tests) |
+| `verify:test-e2e-verbose` | `Werkr.Tests.Agent` (verbose output) |
+| `verify:test-e2e-failures` | `Werkr.Tests.Agent` (failures only) |
+| `verify:test-graphui` | graph-ui TypeScript tests (Vitest) |
+
+---
+
+## CI Pipeline
+
+The GitHub Actions pipeline (`.github/workflows/ci.yml`) runs on every push and pull request to `main` and `develop`. Jobs run on `ubuntu-latest` with per-ref concurrency (`cancel-in-progress: true`).
+
+### Steps
+
+1. **Checkout** — Full history clone (`fetch-depth: 0`) for GitVersion.
+2. **Setup .NET 10** — Installs .NET 10 SDK.
+3. **Restore tools** — `dotnet tool restore` (GitVersion, etc.).
+4. **Determine version** — Runs GitVersion to derive `SemVer`, `AssemblySemVer`, `AssemblySemFileVer`, `InformationalVersion`.
+5. **Setup Node.js 22** — Installs Node.js 22 with npm cache keyed on `graph-ui/package-lock.json`.
+6. **Install graph-ui dependencies** — `npm ci --prefix src/Werkr.Server/graph-ui`.
+7. **Run JS tests** — `npm test --prefix src/Werkr.Server/graph-ui` (Vitest).
+8. **Build JS bundles** — `npm run build:prod --prefix src/Werkr.Server/graph-ui` (production esbuild).
+9. **Check bundle sizes** — `node src/Werkr.Server/graph-ui/scripts/check-bundle-size.mjs`.
+10. **Restore .NET dependencies** — `dotnet restore Werkr.slnx --force-evaluate` with lock file validation via `Test-LockFileChanges.ps1` (skips Windows-only Installer projects).
+11. **Build** — `dotnet build Werkr.slnx -c Release` with GitVersion-derived version properties.
+12. **Test** — `dotnet test --solution Werkr.slnx -c Release --no-build` with TRX logger.
+13. **Upload test results** — `.trx` files uploaded as build artifacts (runs even on failure via `if: always()`).
+
+---
+
+## Test Infrastructure Details
+
+| Technology | Used By | Purpose |
+|------------|---------|---------|
+| **MSTest 4.x** | All .NET test projects | Test framework, configured with `Microsoft.Testing.Platform` runner in `global.json` |
+| **Testcontainers** | `Werkr.Tests` | Disposable PostgreSQL 17 Alpine instances for integration tests; container lifecycle managed by `AppHostFixture` |
+| **bunit** | `Werkr.Tests.Server` | Blazor component rendering tests; test classes extend `BunitContext` |
+| **Vitest 3.x** | `graph-ui` | TypeScript unit tests with V8 coverage provider and line thresholds |
+| **In-memory EF Core** | `Werkr.Tests.Data` | Fast unit tests with no database dependency |
+
+### Test Parallelization
+
+- `Werkr.Tests` disables parallelization (`[assembly: DoNotParallelize]`) because all tests share a single Testcontainer database instance.
+- Other .NET test projects run in parallel by default.
+- graph-ui Vitest tests run in parallel by default.
diff --git a/docs/articles/toc.yml b/docs/articles/toc.yml
index a5b66e2..7a2adcd 100644
--- a/docs/articles/toc.yml
+++ b/docs/articles/toc.yml
@@ -1,8 +1,8 @@
- name: How-To Articles
href: HowTo/index.md
-- name: Project Features
- href: FeatureList.md
-- name: High Level Design Flow
- href: Pre-Edit-High-Level-Design-Flow.md
-- name: Testing
- href: Testing.md
\ No newline at end of file
+- name: Design Specification
+ href: ../1.0-Target-Featureset.md
+- name: Testing
+ href: Testing.md
+- name: Security Overview
+ href: SecurityOverview.md
diff --git a/docs/docfx/docfx.json b/docs/docfx/docfx.json
index 306eecf..501c02e 100644
--- a/docs/docfx/docfx.json
+++ b/docs/docfx/docfx.json
@@ -5,11 +5,15 @@
{
"src": "../src",
"files": [
- "Werkr.Agent/src/Werkr.Agent.csproj",
- "Werkr.Common/src/Werkr.Common.csproj",
- "Werkr.Common.Configuration/src/Werkr.Common.Configuration.csproj",
- "Werkr.Installers/src/Wix/CustomActions.csproj",
- "Werkr.Server/src/Werkr.Server.csproj"
+ "Werkr.Agent/Werkr.Agent.csproj",
+ "Werkr.Api/Werkr.Api.csproj",
+ "Werkr.Common/Werkr.Common.csproj",
+ "Werkr.Common.Configuration/Werkr.Common.Configuration.csproj",
+ "Werkr.Core/Werkr.Core.csproj",
+ "Werkr.Data/Werkr.Data.csproj",
+ "Werkr.Data.Identity/Werkr.Data.Identity.csproj",
+ "Werkr.Server/Werkr.Server.csproj",
+ "Installer/Msi/CustomActions/Werkr.Installer.Msi.CustomActions.csproj"
],
"exclude": [
"**/bin/**",
diff --git a/docs/docfx/templates/Werkr/fonts/CascadiaCodePL.ttf b/docs/docfx/templates/Werkr/fonts/CascadiaCodePL.ttf
deleted file mode 100644
index 8a7c949..0000000
Binary files a/docs/docfx/templates/Werkr/fonts/CascadiaCodePL.ttf and /dev/null differ
diff --git a/docs/docfx/templates/Werkr/fonts/glyphicons-halflings-regular.svg b/docs/docfx/templates/Werkr/fonts/glyphicons-halflings-regular.svg
index 94fb549..8376c0f 100644
--- a/docs/docfx/templates/Werkr/fonts/glyphicons-halflings-regular.svg
+++ b/docs/docfx/templates/Werkr/fonts/glyphicons-halflings-regular.svg
@@ -1,288 +1,288 @@
-
-
-