150 changes: 150 additions & 0 deletions monorepo/README.md
@@ -0,0 +1,150 @@
# Monorepo Layout (Local Workspace)

This repo is already packaged as a Python distribution named `nightops` (see `pyproject.toml` at the **repository root**).

To let you use it inside a monorepo without moving your existing code, we created this local workspace folder:

- `monorepo/packages/nightops` is a symlink to the repository root (parent of `monorepo/`).

## Setup

From the repository root:

```bash
cd monorepo
python3 -m venv .venv
source .venv/bin/activate

# Install TheNightOps as a package
pip install -e "packages/nightops[dev]"
```

## Using the CLI from the monorepo

The CLI loads config from a path you pass via `--config` (or from `config/nightops.yaml` relative to your current working directory).
So, when you run from `monorepo/`, prefer passing an explicit config path:

```bash
nightops verify --config packages/nightops/config/nightops.yaml
nightops agent run --simple --incident "pod OOMKilled" --config packages/nightops/config/nightops.yaml
```

Alternative: `cd packages/nightops` before running `nightops ...` so `config/nightops.yaml` resolves naturally.
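
The working-directory dependence can be illustrated with a short Python sketch. It does not reproduce the CLI's actual lookup code; `find_default_config` is a hypothetical helper that only mirrors the documented rule (look for `config/nightops.yaml` relative to the current working directory), and the temporary directory stands in for a real checkout.

```python
import tempfile
from pathlib import Path
from typing import Optional

DEFAULT_RELATIVE = Path("config") / "nightops.yaml"


def find_default_config(cwd: Path) -> Optional[Path]:
    """Return the default config path under `cwd`, or None if it is absent."""
    candidate = cwd / DEFAULT_RELATIVE
    return candidate if candidate.is_file() else None


# Build a throwaway layout that mimics monorepo/packages/nightops/config/.
with tempfile.TemporaryDirectory() as tmp:
    monorepo = Path(tmp) / "monorepo"
    package_root = monorepo / "packages" / "nightops"
    (package_root / "config").mkdir(parents=True)
    (package_root / "config" / "nightops.yaml").write_text("# placeholder\n")

    found_from_monorepo = find_default_config(monorepo)
    found_from_package = find_default_config(package_root)

print(found_from_monorepo)             # None: nothing at monorepo/config/nightops.yaml
print(found_from_package is not None)  # True
```

This is why an explicit `--config` path is the safer choice when the working directory is `monorepo/`.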

## Adding more packages later

For additional Python packages, create new folders under `monorepo/packages/<your-package>/` and give each package its own `pyproject.toml`.

---

## Publish `nightops` (so other repos can install it)

Run the following steps from the **repository root** (the directory that contains `pyproject.toml`). You can `cd` there explicitly, or:

```bash
REPO_ROOT="$(git rev-parse --show-toplevel)"
cd "$REPO_ROOT"
```

### Step 1: Build artifacts
```bash
# from the TheNightOps repository root
python3 -m pip install -U pip build twine
python3 -m build
```

This generates files in `dist/` (a `.whl` and a `.tar.gz`).

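If `python3 -m build` fails with an error like `Unknown license exception: 'Commons-Clause-1.0'`, the build backend is most likely rejecting the license string because `Commons-Clause-1.0` is not a recognized SPDX exception. Make sure `pyproject.toml` uses: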
`license = { file = "LICENSE" }`
(this repo has been updated accordingly).

### Step 2: Upload to PyPI (or TestPyPI)

For PyPI:
```bash
twine upload dist/*
```

For TestPyPI:
```bash
twine upload --repository testpypi dist/*
```

Authenticate with an API token, either configured in `~/.pypirc` or supplied via `TWINE_USERNAME=__token__` and `TWINE_PASSWORD=<your token>`.

### Step 3 (before re-publishing): bump version

Update `version = "..."` in `pyproject.toml` and then repeat the build/upload steps.

---

## Install `nightops` in another project

### Step 1: Create/activate a virtualenv
```bash
python3 -m venv .venv
source .venv/bin/activate
```

### Step 2: Install the published package
```bash
pip install nightops
```

Or pin a version:
```bash
pip install "nightops==0.1.0b1"
```

### Step 2b (if using TestPyPI)
```bash
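# Dependencies may not exist on TestPyPI, so let pip fall back to PyPI for them.
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ "nightops==0.1.0b1"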
```

---

## Using `nightops` in the other project

The CLI expects a config file path via `--config` (or it will look for `config/nightops.yaml` relative to your current working directory).

### Step 1: Copy/create config in the other repo
Create something like:
```text
other-repo/
  config/
    nightops.yaml
```

### Step 2: Run
From the other project’s root (adjust paths if your layout differs):

```bash
nightops verify --config ./config/nightops.yaml

nightops agent run --simple \
  --incident "pod OOMKilled" \
  --config ./config/nightops.yaml
```

Or pass an explicit path (works from any working directory):

```bash
OTHER_REPO="/path/to/your/other/project" # set to your checkout
nightops verify --config "$OTHER_REPO/config/nightops.yaml"
```

### Step 3 (optional): Run watch mode
```bash
nightops agent watch --simple \
  --config ./config/nightops.yaml
```

---

## Important note about config files

`nightops` ships default YAML files inside the installed package, so the CLI can start even if the consuming repo doesn’t provide `config/nightops.yaml`.

In practice, you will still want to provide your own `config/nightops.yaml` in the consuming repo (so you can set your real GCP/Grafana/Slack values) and pass `--config` to override defaults.

1 change: 1 addition & 0 deletions monorepo/packages/nightops
11 changes: 10 additions & 1 deletion pyproject.toml
@@ -7,7 +7,7 @@ name = "nightops"
version = "0.1.0b1"
description = "Autonomous SRE Agent built on Google ADK and Remote MCP — with proactive detection, incident memory, and graduated remediation"
readme = "README.md"
-license = "Apache-2.0 WITH Commons-Clause-1.0"
+license = { file = "LICENSE" }
requires-python = ">=3.11"
authors = [
{ name = "TheNightOps Contributors" },
@@ -59,6 +59,15 @@ nightops = "nightops.cli:app"

[tool.hatch.build.targets.wheel.force-include]
"src" = "nightops"
"config" = "nightops/config"

Copilot AI Mar 26, 2026

The runtime code now depends on packaged resources (nightops/config/nightops.yaml). This change force-includes config/ for wheels, but it may still be missing from sdists depending on hatch configuration, which can break installs that build from sdist. Consider also including config/ in the sdist target (or otherwise ensuring all package builds include these YAML resources).

Suggested change
[tool.hatch.build.targets.sdist]
include = [
"src",
"config",
]
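
One way to check the concern raised in this comment is to inspect the built archive for a `config/` directory. The sketch below is an assumption-laden illustration rather than part of the PR: `sdist_has_prefix` and `make_fake_sdist` are hypothetical helpers, and the in-memory archives stand in for a real `dist/*.tar.gz`.

```python
import io
import tarfile


def sdist_has_prefix(sdist_bytes: bytes, prefix: str) -> bool:
    """Return True if any archive member lives under `<top-level>/<prefix>/`."""
    wanted = prefix.rstrip("/") + "/"
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:gz") as tar:
        for name in tar.getnames():
            # sdist members look like "<name>-<version>/config/nightops.yaml".
            _, _, inner = name.partition("/")
            if inner.startswith(wanted):
                return True
    return False


def make_fake_sdist(paths: list[str]) -> bytes:
    """Build an in-memory .tar.gz of empty files, standing in for a real sdist."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for path in paths:
            tar.addfile(tarfile.TarInfo(name=path), io.BytesIO(b""))
    return buffer.getvalue()


with_config = make_fake_sdist([
    "nightops-0.1.0b1/pyproject.toml",
    "nightops-0.1.0b1/src/core/config.py",
    "nightops-0.1.0b1/config/nightops.yaml",
])
without_config = make_fake_sdist([
    "nightops-0.1.0b1/pyproject.toml",
    "nightops-0.1.0b1/src/core/config.py",
])

print(sdist_has_prefix(with_config, "config"))     # True
print(sdist_has_prefix(without_config, "config"))  # False
```

Against a real build, the bytes of the `.tar.gz` in `dist/` would be passed in instead of the fake archives.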

# Ensure sdist contains `src/` and `config/` so `pip install` from sdist can build a wheel
# with the packaged YAML defaults (nightops/config/*.yaml).
[tool.hatch.build.targets.sdist]
include = [
"src",
"config",
]

[tool.ruff]
target-version = "py311"
93 changes: 75 additions & 18 deletions src/core/config.py
@@ -2,15 +2,55 @@

from __future__ import annotations

import logging
import os
import re
from pathlib import Path
-from typing import Literal, Optional
+from typing import Any, Literal, Optional

from importlib.resources import files as pkg_files

import yaml
-from pydantic import Field, field_validator
+from pydantic import Field, ValidationError, field_validator
from pydantic_settings import BaseSettings

logger = logging.getLogger(__name__)


def _coerce_yaml_root(data: object) -> dict[str, Any]:
    """Normalize yaml.safe_load output: None/empty document -> {}, require a mapping."""
    if data is None:
        return {}
    if not isinstance(data, dict):
        raise ValueError(f"YAML root must be a mapping, got {type(data).__name__}")
    return data


def _merge_mcp_servers_block(data: dict[str, Any]) -> None:
    """Inline `mcp_servers:` into flat fields; map cloud_logging -> cloud_logging_custom."""
    if "mcp_servers" not in data:
        return
    mcp = data.pop("mcp_servers")
    if not isinstance(mcp, dict):
        raise ValueError("mcp_servers must be a mapping")
    if "cloud_logging" in mcp:
        mcp["cloud_logging_custom"] = mcp.pop("cloud_logging")
    data.update(mcp)


def _parse_yaml_config_text(raw: str) -> dict[str, Any]:
    """Substitute `${VAR}` from the environment, parse YAML, normalize `mcp_servers`.

    Shared by `from_yaml` (after optional `config/.env` load) and `from_yaml_text`
    (packaged defaults with no adjacent `.env`).
    """
    for key, value in os.environ.items():
        raw = raw.replace(f"${{{key}}}", value)
    raw = re.sub(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}", "", raw)
Comment on lines +47 to +49
Copilot AI Mar 26, 2026

The two-pass env substitution can corrupt legitimate values that contain ${...} after replacement (e.g., if an env var value itself contains ${HOME}, the subsequent regex will delete it). A safer approach is a single re.sub pass with a callback that replaces only ${NAME} tokens by looking up NAME in os.environ (defaulting to empty string when missing), and avoids scanning/replacing for every env var.

Suggested change
-    for key, value in os.environ.items():
-        raw = raw.replace(f"${{{key}}}", value)
-    raw = re.sub(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}", "", raw)
+    pattern = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
+    raw = pattern.sub(lambda m: os.environ.get(m.group(1), ""), raw)
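
The difference between the two approaches can be reproduced outside the project. In this sketch the function names and the `env` mapping are illustrative assumptions (a plain dict replaces `os.environ` so the result is deterministic); the substitution logic itself follows the diff and the suggestion above.

```python
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_two_pass(raw: str, env: dict[str, str]) -> str:
    """The approach in the diff: replace every known variable, then blank leftovers."""
    for key, value in env.items():
        raw = raw.replace(f"${{{key}}}", value)
    return _VAR.sub("", raw)


def substitute_single_pass(raw: str, env: dict[str, str]) -> str:
    """The suggested approach: resolve each ${NAME} token exactly once."""
    return _VAR.sub(lambda m: env.get(m.group(1), ""), raw)


# Hypothetical environment: TEMPLATE's value itself contains a ${...} token
# that is not a defined variable.
env = {"TEMPLATE": "prefix-${UNDEFINED}-suffix"}
raw = "greeting: ${TEMPLATE}"

print(substitute_two_pass(raw, env))     # greeting: prefix--suffix
print(substitute_single_pass(raw, env))  # greeting: prefix-${UNDEFINED}-suffix
```

The single pass never rescans substituted text, so a value that happens to contain `${...}` survives intact.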

    data = _coerce_yaml_root(yaml.safe_load(raw))
    _merge_mcp_servers_block(data)
    return data
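
To make the `mcp_servers` flattening concrete, here is a standalone copy of the helper's logic applied to a small dict. The input keys other than `mcp_servers` and `cloud_logging` are made up for illustration and are not real NightOps settings.

```python
from typing import Any


def merge_mcp_servers_block(data: dict[str, Any]) -> None:
    """Standalone copy of the helper's logic; mutates `data` in place."""
    if "mcp_servers" not in data:
        return
    mcp = data.pop("mcp_servers")
    if not isinstance(mcp, dict):
        raise ValueError("mcp_servers must be a mapping")
    if "cloud_logging" in mcp:
        mcp["cloud_logging_custom"] = mcp.pop("cloud_logging")
    data.update(mcp)


# Hypothetical parsed YAML document.
parsed = {
    "example_top_level": "value",
    "mcp_servers": {
        "cloud_logging": {"enabled": True},
        "example_server": {"enabled": False},
    },
}
merge_mcp_servers_block(parsed)
print(sorted(parsed))
# ['cloud_logging_custom', 'example_server', 'example_top_level']
```

Because the nested keys are merged with `data.update`, a nested entry silently overwrites a top-level key of the same name.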


# ── Official Google Cloud MCP Server Configs ────────────────────────

@@ -348,24 +388,16 @@ def from_yaml(cls, path: str | Path) -> NightOpsConfig:
         with open(path) as f:
             raw = f.read()

-        # Substitute environment variables (${VAR_NAME} syntax)
-        for key, value in os.environ.items():
-            raw = raw.replace(f"${{{key}}}", value)
-
-        # Replace any remaining unresolved ${VAR} references with empty string
-        raw = re.sub(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}", "", raw)
-
-        data = yaml.safe_load(raw)
+        return cls(**_parse_yaml_config_text(raw))

-        # Handle nested mcp_servers key if present in YAML
-        if "mcp_servers" in data:
-            mcp = data.pop("mcp_servers")
-            # Map YAML key cloud_logging → cloud_logging_custom to avoid conflict
-            if "cloud_logging" in mcp:
-                mcp["cloud_logging_custom"] = mcp.pop("cloud_logging")
-            data.update(mcp)
+    @classmethod
+    def from_yaml_text(cls, raw: str) -> NightOpsConfig:
+        """Load configuration from YAML text with environment variable substitution.

-        return cls(**data)
+        This is used for packaged-in defaults where there is no filesystem path
+        alongside the YAML file (so we can't load an adjacent `config/.env`).
+        """
+        return cls(**_parse_yaml_config_text(raw))

     @classmethod
     def load(cls, config_path: Optional[str | Path] = None) -> NightOpsConfig:
@@ -378,6 +410,31 @@ def load(cls, config_path: Optional[str | Path] = None) -> NightOpsConfig:
        if Path(default_path).exists():
            return cls.from_yaml(default_path)

        try:
            packaged = pkg_files("nightops") / "config" / "nightops.yaml"
        except ModuleNotFoundError as exc:
            logger.warning(
                "Could not resolve packaged config nightops/config/nightops.yaml: %s",
                exc,
            )
        else:
            if packaged.is_file():
                try:
                    return cls.from_yaml_text(packaged.read_text())
                except (
                    FileNotFoundError,
                    IsADirectoryError,
                    OSError,
                    UnicodeDecodeError,
                    yaml.YAMLError,
                    ValueError,
                    ValidationError,
                ) as exc:
                    logger.warning(
                        "Failed to load packaged default config nightops.yaml: %s",
                        exc,
                    )

        # Fall back to environment variables and defaults
        project_id = os.getenv("GCP_PROJECT_ID", "")
        return cls(
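
The packaged-default lookup in `load` can be exercised in isolation. The following is a minimal sketch, assuming a hypothetical helper `read_packaged_text` that is not part of this PR; the stdlib `json` package is used only as a stand-in for `nightops`, so the example runs without the project installed.

```python
from importlib.resources import files as pkg_files
from typing import Optional


def read_packaged_text(package: str, *parts: str) -> Optional[str]:
    """Return the text of a resource inside `package`, or None if unavailable."""
    try:
        resource = pkg_files(package)
    except ModuleNotFoundError:
        # The package is not importable in this environment.
        return None
    for part in parts:
        resource = resource / part
    if not resource.is_file():
        return None
    try:
        return resource.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return None


# `json` ships no config/nightops.yaml, so the lookup falls through.
print(read_packaged_text("json", "config", "nightops.yaml"))  # None
# A package name assumed not to be installed.
print(read_packaged_text("surely_not_installed_pkg_xyz", "config", "nightops.yaml"))  # None
```

The sketch catches only `OSError` for I/O problems because `FileNotFoundError` and `IsADirectoryError` are subclasses of `OSError`; listing all three, as the diff does, is harmless but redundant.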
43 changes: 43 additions & 0 deletions src/remediation/policy_engine.py
@@ -18,6 +18,8 @@
from pathlib import Path
from typing import Any, Optional

from importlib.resources import files as pkg_files

import yaml

from nightops.core.models import RemediationAction, RemediationPolicy, Severity
@@ -140,6 +142,47 @@ def _load_policies(self, path: str) -> None:
"""Load policies from a YAML file, merging with defaults."""
policy_file = Path(path)
if not policy_file.exists():
Copilot AI Mar 26, 2026

The packaged-default lookup reuses the user-supplied path string verbatim. If callers pass an absolute filesystem path (or a path with drive letters / path separators), the join against package resources becomes ambiguous and can mask a simple 'file not found' misconfiguration. Consider only attempting packaged lookup for expected relative default locations (e.g., when not policy_file.is_absolute()), or normalize to a known resource-relative path before joining.

Suggested change
-        if not policy_file.exists():
+        if not policy_file.exists():
+            # Only attempt packaged lookup for non-absolute paths to avoid
+            # ambiguities when users pass absolute filesystem locations.
+            if policy_file.is_absolute():
+                logger.info("No policy file at %s, using defaults", path)
+                return
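
The guard proposed here can be sketched as a small predicate. `should_try_packaged_lookup` is a hypothetical name, the example paths are invented, and the extra rejection of `..` components goes beyond what the comment asks for; treat it as one possible reading rather than the intended fix.

```python
from pathlib import PurePosixPath, PureWindowsPath


def should_try_packaged_lookup(path: str) -> bool:
    """Allow the packaged fallback only for relative paths without `..` parts."""
    # Check both flavours so the answer does not depend on the host OS.
    for flavour in (PurePosixPath, PureWindowsPath):
        candidate = flavour(path)
        if candidate.is_absolute() or candidate.drive or ".." in candidate.parts:
            return False
    return True


print(should_try_packaged_lookup("config/remediation_policies.yaml"))  # True
print(should_try_packaged_lookup("/etc/nightops/policies.yaml"))       # False
print(should_try_packaged_lookup("C:\\nightops\\policies.yaml"))       # False
print(should_try_packaged_lookup("../outside.yaml"))                   # False
```

Using the pure path classes keeps the check free of filesystem access, so it can run before any resource lookup.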

            try:
                packaged = pkg_files("nightops") / path
            except ModuleNotFoundError as exc:
                logger.warning(
                    "Could not resolve packaged remediation policies path %s: %s",
                    path,
                    exc,
                )
                logger.info("No policy file at %s, using defaults", path)
                return

            if packaged.is_file():
                try:
                    data = yaml.safe_load(packaged.read_text()) or {}
                    policies = data.get("policies", {})
                    for action_type, policy_data in policies.items():
                        self._policies[action_type] = policy_data
                    logger.info(
                        "Loaded %d remediation policies from packaged default %s",
                        len(policies),
                        path,
                    )
                    return
                except (
                    ModuleNotFoundError,
                    FileNotFoundError,
                    IsADirectoryError,
                    OSError,
                    UnicodeDecodeError,
                    yaml.YAMLError,
                    ValueError,
                    TypeError,
                    AttributeError,
                ) as exc:
                    logger.warning(
                        "Failed to load packaged default remediation policies from %s: %s",
                        path,
                        exc,
                    )
                    return

            logger.info("No policy file at %s, using defaults", path)
            return
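
The merge semantics of the loop above (each loaded entry replaces the default for that action type) can be shown with plain dicts. The default policies and field names below are invented for illustration, and `merge_policies` is a hypothetical helper that validates shapes explicitly instead of catching `TypeError`/`AttributeError` as the diff does.

```python
from typing import Any

# Hypothetical built-in defaults; the real ones live in the policy engine.
DEFAULT_POLICIES: dict[str, dict[str, Any]] = {
    "restart_pod": {"requires_approval": False},
    "scale_deployment": {"requires_approval": True},
}


def merge_policies(
    defaults: dict[str, dict[str, Any]], loaded: Any
) -> dict[str, dict[str, Any]]:
    """Overlay `policies` entries from a parsed YAML document onto the defaults."""
    merged = dict(defaults)
    document = loaded or {}  # an empty YAML file parses to None
    if not isinstance(document, dict):
        raise ValueError("policy document must be a mapping")
    policies = document.get("policies") or {}
    if not isinstance(policies, dict):
        raise ValueError("`policies` must be a mapping")
    for action_type, policy_data in policies.items():
        merged[action_type] = policy_data  # whole-entry replacement, not a deep merge
    return merged


loaded = {"policies": {"restart_pod": {"requires_approval": True}}}
merged = merge_policies(DEFAULT_POLICIES, loaded)
print(merged["restart_pod"])       # {'requires_approval': True}
print(merged["scale_deployment"])  # {'requires_approval': True}
print(merge_policies(DEFAULT_POLICIES, None) == DEFAULT_POLICIES)  # True
```

Since entries are replaced wholesale, a policy file that overrides one field of an action must restate every other field it wants to keep.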
