From 9aecbc52173935c74666c5abac8b5dded28e3841 Mon Sep 17 00:00:00 2001
From: Marcelo Henrique Neppel
Date: Mon, 25 May 2026 16:58:32 -0300
Subject: [PATCH 1/5] docs: add AGENTS.md with comprehensive agent guidelines

Signed-off-by: Marcelo Henrique Neppel
---
 AGENTS.md | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 172 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000000..3e273dcc4af
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,172 @@
+# PostgreSQL Operator - Agent Guidelines
+
+Charmed PostgreSQL VM Operator — a Juju charm (Python/ops framework) deploying and managing
+PostgreSQL 16 on virtual machines via Patroni for high availability.
+
+## Architectural Rules
+
+### Module Responsibilities
+
+- **`charm.py`** — Main operator class. Event handler registration, orchestration, and business
+  logic. Delegates domain-specific work to specialized modules.
+- **`cluster.py`** — Patroni lifecycle management: start/stop, switchover, health checks, Raft
+  consensus, Patroni configuration rendering (`render_patroni_yml_file`).
+- **`backups.py`** — pgBackRest integration: backup, restore, PITR, S3 credential management,
+  pgBackRest configuration rendering.
+- **`config.py`** — Pydantic configuration model (`CharmConfig`) only. Pure schema definition
+  with validated fields. Does not render config files.
+- **`relations/`** — One handler class per relation interface (5 files):
+  `postgresql_provider.py`, `async_replication.py`, `logical_replication.py`, `tls.py`,
+  `watcher.py`.
+- **`cluster_topology_observer.py`** — Watches for cluster topology changes via a spawned
+  background process and emits custom charm events.
+- **`ldap.py`** — LDAP integration via the `LdapRequirer` interface.
+- **`locales.py`** — Literal type definition of all locales available in the snap.
+- **`rotate_logs.py`** — Background process management for log rotation (pgBackRest logs).
+- **`constants.py`** — Global shared constants (paths, ports, password keys). Domain-specific
+  constants (error messages, local state values) live in their respective modules.
+- **`templates/`** — Jinja2 templates: `patroni.yml.j2`, `pgbackrest.conf.j2`,
+  `pgbackrest.logrotate.j2`.
+
+### External Package: `single_kernel_postgresql`
+
+The `single_kernel_postgresql` package is the shared library used across PostgreSQL charms on
+all substrates (VM and K8s). It provides:
+
+- **`PostgreSQL` class** — SQL-level operations (user/role management, database creation,
+  extension management, parameter building). Never manages PostgreSQL lifecycle.
+- **Config literals** — `SYSTEM_USERS`, `REPLICATION_USER`, `REWIND_USER`, `MONITORING_USER`,
+  `USER`, `BACKUP_USER`, `PEER`, `Substrates`, `POSTGRESQL_STORAGE_PERMISSIONS`.
+- **Exception classes** — `PostgreSQLCreateUserError`, `PostgreSQLBaseError`, etc.
+- **`TLSTransfer`** — TLS certificate transfer event handling.
+- **Utility functions** — File rendering, password generation, HTTP helpers.
+
+### Key Rules
+
+1. **Never manage PostgreSQL directly** — all lifecycle operations (start, stop, restart,
+   reload) go through Patroni via the `Patroni` class in `cluster.py`. The `PostgreSQL` class
+   from `single_kernel_postgresql` is for SQL operations only.
+
+2. **Relation handler pattern** — handlers inherit from `ops.Object`, receive a charm reference
+   in `__init__`, and observe their own relation events internally. Do not observe relation
+   events in `charm.py`, except for the peer relation (`database-peers`) whose
+   `relation_changed` and `relation_departed` events are observed directly in `charm.py`.
+
+3. **Leader-only writes** — app-scoped relation data writes require a
+   `self.unit.is_leader()` guard. Unit-scoped data can be written by any unit.
+
+4. **Peer data access** — use `self.charm.app_peer_data` and `self.charm.unit_peer_data` dict
+   properties for reading/writing peer relation data.
+
+5. **Configuration flow** — `CharmConfig` (Pydantic model in `config.py`) validates charm
+   config. Config file rendering happens in `cluster.py` (Patroni YAML) and `backups.py`
+   (pgBackRest conf) using Jinja2 templates from `templates/`.
+
+6. **Constants placement** — global constants shared across modules go in `constants.py`.
+   Domain-specific constants (error messages, local state values) stay in the module that
+   uses them.
+
+7. **TYPE_CHECKING guard** — use `if TYPE_CHECKING:` for imports needed only by type checkers
+   (especially the charm class in relation handlers to avoid circular imports).
+
+8. **Snap-based workload** — PostgreSQL runs as a snap (`charmed-postgresql`). All paths are
+   under `/var/snap/charmed-postgresql/`. Service management uses the `charmlibs.snap` library.
+
+9. **Event deferral** — check preconditions before proceeding in event handlers: peer relation
+   exists (`self._peers is not None`), cluster initialized (`"cluster_initialised" in
+   self.app_peer_data`), Patroni started (`self._patroni.member_started`), can connect to
+   PostgreSQL. Defer the event if preconditions are not met.
+
+10. **Status setting** — use `self.set_unit_status()` instead of `self.unit.status =` directly.
+    This method respects the refresh lifecycle and will not override refresh status.
+
+11. **Rolling restarts** — use `RollingOpsManager` (bound to the `restart` peer relation) for
+    coordinated PostgreSQL restarts. Never restart Patroni/PostgreSQL directly without going
+    through the rolling ops mechanism.
+
+12. **Retry patterns** — transient operations (database connections, Patroni API calls) use
+    `tenacity` for retry logic. Use `Retrying` context manager or `@retry` decorator with
+    appropriate stop/wait strategies. Catch `RetryError` when all retries are exhausted.
+
+## Code Quality Rules
+
+### Copyright Header
+
+Every file must start with:
+
+```python
+# Copyright YYYY Canonical Ltd.
+# See LICENSE file for licensing details.
+```
+
+### Style
+
+- **Line length**: 99 characters
+- **Python target**: 3.12
+- **Imports**: sorted via ruff I001 — stdlib, then third-party, then local. Absolute imports
+  preferred.
+- **Docstrings**: Google style, required for public functions and classes.
+- **Naming**: `snake_case` for functions/variables, `PascalCase` for classes, `UPPER_CASE`
+  for constants.
+- **McCabe complexity**: max 10.
+- **Security rules** (ruff S-series): enabled for `src/`, disabled for `tests/`.
+- **Password-like string labels**: annotate with `# noqa: S105` when the string is a label or
+  key name, not an actual secret.
+
+### Type Checking
+
+- `ty` type checker via `ty check`.
+- Type hints required for all function signatures in `src/`.
+- Use `TYPE_CHECKING` guard for type-only imports (avoids circular imports at runtime).
+
+## Testing Rules
+
+### Unit Tests
+
+- **Framework**: pytest + pytest-asyncio (auto mode).
+- **Location**: `tests/unit/`.
+- **Run all**: `tox run -e unit`
+- **Run single test**: `tox run -e unit -- tests/unit/test_charm.py::test_function_name`
+- **Coverage**: branch coverage enabled, excludes `logger.debug` lines.
+- **Auto-mocked in `conftest.py`**: `charm_refresh.Machines` and `ops.JujuVersion.has_secrets`
+  (set to `True`). Do not mock these again.
+- **Charm instantiation**: uses `ops.testing.Harness`.
+- **Test structure**: primarily flat functions. Some files (e.g., `test_watcher_relation.py`)
+  use test classes — both `::test_function` and `::TestClass::test_method` work with pytest.
+- **Exit behavior**: `--exitfirst` is the default (stops on first failure).
+
+### Integration Tests
+
+- **Framework**: pytest-operator + jubilant.
+- **Location**: `tests/integration/`.
+- **Run**: `tox run -e integration -- tests/integration/test_file.py`
+- **Requirements**: running Juju controller + cloud credentials (AWS, GCP, or similar).
+- **Duration**: minutes to hours — do not run the full suite casually.
+
+### Testing Expectations
+
+- Changing `src/X.py` means running `tests/unit/test_X.py`.
+- New public methods need corresponding unit tests.
+- Do not re-mock what `conftest.py` already handles.
+
+## Build
+
+- **Build charm**: `charmcraftcache pack`
+- **Format code**: `tox run -e format`
+- **Lint**: `tox run -e lint`
+- **Unit tests**: `tox run -e unit`
+- **Single unit test**: `tox run -e unit -- tests/unit/test_charm.py::test_function_name`
+- **Integration tests**: `tox run -e integration -- tests/integration/test_file.py`
+
+## Workflow Checklist
+
+Before submitting any change:
+
+1. Run `tox run -e format` — auto-fix formatting issues.
+2. Run `tox run -e lint` — fix all errors (codespell, ruff, shellcheck, ty).
+3. Run `tox run -e unit` — ensure all unit tests pass.
+4. If Prometheus alert rules were modified, validate with `promtool check rules` and run
+   `promtool test rules` against test files in `tests/alerts/`.
+5. Verify corresponding tests exist for any new or changed behavior.
+6. Confirm global constants are in `constants.py`, domain-specific ones in the relevant module.
+7. Confirm leader checks are present for any app-scoped relation data writes.
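
The precondition checks of Key Rule 9 and the leader guard of Key Rule 3 (also checklist item 7) can be illustrated with a minimal sketch. It deliberately avoids importing `ops`: `FakeEvent`, `FakeCharm`, `on_relation_changed`, and the `example-key` entry are hypothetical stand-ins invented for this illustration, not names from the charm. Only `cluster_initialised`, `member_started`, and `app_peer_data` come from the guidelines.

```python
"""Sketch of Key Rules 3 and 9 using stand-in objects (no `ops` import)."""

from dataclasses import dataclass, field


@dataclass
class FakeEvent:
    """Hypothetical stand-in for an ops event; records whether it was deferred."""

    deferred: bool = False

    def defer(self) -> None:
        self.deferred = True


@dataclass
class FakeCharm:
    """Hypothetical stand-in exposing only what the guards below need."""

    is_leader: bool
    has_peers: bool = True
    member_started: bool = True
    app_peer_data: dict[str, str] = field(default_factory=dict)


def on_relation_changed(charm: FakeCharm, event: FakeEvent) -> str:
    """Check preconditions first, then write app-scoped data only on the leader."""
    # Rule 9: defer until the peer relation exists, the cluster is
    # initialised, and the local Patroni member has started.
    if (
        not charm.has_peers
        or "cluster_initialised" not in charm.app_peer_data
        or not charm.member_started
    ):
        event.defer()
        return "deferred"

    # Rule 3: app-scoped relation data is written by the leader only.
    if not charm.is_leader:
        return "skipped"

    charm.app_peer_data["example-key"] = "example-value"  # hypothetical key
    return "written"
```

In the real charm the same ordering would presumably sit inside a handler method, with `self.unit.is_leader()` in place of the `is_leader` field; checking preconditions before the leader guard means non-leader units also defer rather than silently dropping the event.
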
From 701c6523b4b391a9c72e75601352758425079437 Mon Sep 17 00:00:00 2001
From: Marcelo Henrique Neppel
Date: Tue, 26 May 2026 12:35:40 -0300
Subject: [PATCH 2/5] docs: add CLAUDE.md symlink to AGENTS.md for Claude Code auto-loading

Signed-off-by: Marcelo Henrique Neppel
---
 CLAUDE.md | 1 +
 1 file changed, 1 insertion(+)
 create mode 120000 CLAUDE.md

diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 120000
index 00000000000..47dc3e3d863
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
From d5535e1e314de9cf9fe9d04c447c8ddc173c8464 Mon Sep 17 00:00:00 2001
From: Marcelo Henrique Neppel
Date: Tue, 26 May 2026 15:15:06 -0300
Subject: [PATCH 3/5] docs(AGENTS.md): add observability directories and clarify templates location

Signed-off-by: Marcelo Henrique Neppel
---
 AGENTS.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/AGENTS.md b/AGENTS.md
index 3e273dcc4af..7c79fb04da7 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -25,6 +25,12 @@ PostgreSQL 16 on virtual machines via Patroni for high availability.
 - **`rotate_logs.py`** — Background process management for log rotation (pgBackRest logs).
 - **`constants.py`** — Global shared constants (paths, ports, password keys). Domain-specific
   constants (error messages, local state values) live in their respective modules.
+- **`grafana_dashboards/`** — Grafana dashboard JSON definitions for COS integration.
+- **`prometheus_alert_rules/`** — Prometheus alerting rule definitions (YAML).
+- **`loki_alert_rules/`** — Loki alerting rule definitions.
+
+The following directories are at the **repository root** (not under `src/`):
+
 - **`templates/`** — Jinja2 templates: `patroni.yml.j2`, `pgbackrest.conf.j2`,
   `pgbackrest.logrotate.j2`.
From 6d871930b25bb8b6bec581cea0b0dc1640fc543d Mon Sep 17 00:00:00 2001
From: Marcelo Henrique Neppel
Date: Tue, 26 May 2026 15:32:11 -0300
Subject: [PATCH 4/5] docs(AGENTS.md): add Juju secrets and vendored libraries rules

Signed-off-by: Marcelo Henrique Neppel
---
 AGENTS.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/AGENTS.md b/AGENTS.md
index 7c79fb04da7..e0be12b6ff7 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -28,6 +28,8 @@ PostgreSQL 16 on virtual machines via Patroni for high availability.
 - **`grafana_dashboards/`** — Grafana dashboard JSON definitions for COS integration.
 - **`prometheus_alert_rules/`** — Prometheus alerting rule definitions (YAML).
 - **`loki_alert_rules/`** — Loki alerting rule definitions.
+- **`lib/`** — Vendored charm libraries managed by `charmcraft fetch-lib`. Never modify
+  these files directly — changes will be overwritten on the next fetch.

 The following directories are at the **repository root** (not under `src/`):

@@ -94,6 +96,13 @@ all substrates (VM and K8s). It provides:
     `tenacity` for retry logic. Use `Retrying` context manager or `@retry` decorator with
     appropriate stop/wait strategies. Catch `RetryError` when all retries are exhausted.

+13. **Juju secrets** — sensitive data (passwords, TLS keys) must be stored using Juju secrets
+    via `self.set_secret` / `self.get_secret`. Never store passwords or credentials in plain
+    relation data.
+
+14. **Vendored libraries** — the `lib/` directory contains charm libraries managed by
+    `charmcraft fetch-lib`. Never modify these files — submit fixes upstream instead.
+
 ## Code Quality Rules

 ### Copyright Header
From 20e29ba1ac597ff60f304dc616a3a4b3360bec1c Mon Sep 17 00:00:00 2001
From: Marcelo Henrique Neppel
Date: Tue, 26 May 2026 17:28:25 -0300
Subject: [PATCH 5/5] docs(AGENTS.md): consolidate constants in constants.py and clarify testing guidelines

Signed-off-by: Marcelo Henrique Neppel
---
 AGENTS.md | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index e0be12b6ff7..21936f9bf08 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -23,8 +23,8 @@ PostgreSQL 16 on virtual machines via Patroni for high availability.
 - **`ldap.py`** — LDAP integration via the `LdapRequirer` interface.
 - **`locales.py`** — Literal type definition of all locales available in the snap.
 - **`rotate_logs.py`** — Background process management for log rotation (pgBackRest logs).
-- **`constants.py`** — Global shared constants (paths, ports, password keys). Domain-specific
-  constants (error messages, local state values) live in their respective modules.
+- **`constants.py`** — All shared constants (paths, ports, password keys, error messages,
+  local state values).
 - **`grafana_dashboards/`** — Grafana dashboard JSON definitions for COS integration.
 - **`prometheus_alert_rules/`** — Prometheus alerting rule definitions (YAML).
 - **`loki_alert_rules/`** — Loki alerting rule definitions.
@@ -70,9 +70,7 @@ all substrates (VM and K8s). It provides:
    config. Config file rendering happens in `cluster.py` (Patroni YAML) and `backups.py`
    (pgBackRest conf) using Jinja2 templates from `templates/`.

-6. **Constants placement** — global constants shared across modules go in `constants.py`.
-   Domain-specific constants (error messages, local state values) stay in the module that
-   uses them.
+6. **Constants placement** — all constants go in `constants.py`.

 7. **TYPE_CHECKING guard** — use `if TYPE_CHECKING:` for imports needed only by type checkers
    (especially the charm class in relation handlers to avoid circular imports).
@@ -152,10 +150,11 @@ Every file must start with:

 ### Integration Tests

-- **Framework**: pytest-operator + jubilant.
+- **Framework**: jubilant (preferred for new tests) and pytest-operator (legacy).
 - **Location**: `tests/integration/`.
 - **Run**: `tox run -e integration -- tests/integration/test_file.py`
-- **Requirements**: running Juju controller + cloud credentials (AWS, GCP, or similar).
+- **Requirements**: running Juju controller. Cloud credentials (AWS, GCP, or similar) are
+  only needed for backup/restore tests.
 - **Duration**: minutes to hours — do not run the full suite casually.

 ### Testing Expectations
@@ -183,5 +182,5 @@ Before submitting any change:
 4. If Prometheus alert rules were modified, validate with `promtool check rules` and run
    `promtool test rules` against test files in `tests/alerts/`.
 5. Verify corresponding tests exist for any new or changed behavior.
-6. Confirm global constants are in `constants.py`, domain-specific ones in the relevant module.
+6. Confirm all constants are in `constants.py`.
 7. Confirm leader checks are present for any app-scoped relation data writes.
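
The `TYPE_CHECKING` guard from Key Rule 7 might look like the following in a relation handler. This is only a sketch under assumptions: the module name `charm`, the class name `PostgresqlOperatorCharm`, and `ExampleRelationHandler` are hypothetical, and a real handler would inherit from `ops.Object` as Key Rule 2 requires. The guarded import is never executed at runtime, which is what avoids the circular import between the charm module and its handlers.

```python
"""Sketch of a type-only import in a relation handler (Key Rule 7)."""

from __future__ import annotations  # annotations stay unevaluated at runtime

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers. `charm` and `PostgresqlOperatorCharm`
    # are assumed names used for illustration.
    from charm import PostgresqlOperatorCharm


class ExampleRelationHandler:
    """Hypothetical handler; a real one would inherit from ops.Object."""

    def __init__(self, charm: PostgresqlOperatorCharm) -> None:
        # The annotation above is never resolved at runtime, so the
        # guarded import is not needed for this module to load.
        self.charm = charm

    def peer_value(self, key: str) -> str | None:
        """Read a value through the charm's `app_peer_data` dict property."""
        return self.charm.app_peer_data.get(key)
```

Because the annotation is unevaluated, any object exposing `app_peer_data` can be passed in a quick check, for example `ExampleRelationHandler(SimpleNamespace(app_peer_data={}))` with `SimpleNamespace` from `types`.
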