
Initial working version#5

Open
wlix13 wants to merge 9 commits into main from feature/initial-version

Conversation

Owner

@wlix13 wlix13 commented Apr 3, 2026

Pull request type

  • 🐛 Bug fix (fixes an issue with deployment or configuration)
  • New feature (adds new deployment capability or server configuration)
  • 🔄 Enhancement (improves existing deployment or infrastructure setup)
  • 🏗️ Core (changes to the framework or app structure)
  • 🚀 CI/CD (changes to workflows or automation)
  • 📝 Documentation (updates to guides or configuration docs)
  • 🔒 Security (security-related changes)
  • 📦 Dependencies (updates to dependencies)

Description

Why is this change needed?

Related Issues

Testing

  • Tested deployment on development server
  • Verified configuration changes work as expected
  • No testing required (documentation changes only)

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Linter and type checker pass (uv run ruff check . and uv run ty check)
  • Documentation updated (if applicable)

Security Considerations (if applicable)

  • Security improvement
  • No security impact
  • Potential security risk (explain below)

Version Bumping

  • Version bumped in __init__.py

Summary by CodeRabbit

  • New Features

    • Full server deployment suite: DNS (DoH/DoT), WARP (WireGuard/Masque), Zero Trust tunnels, MTProto, Tor, Xray, HAProxy, containers runtime, firewall/sysctl hardening, swap management, package/locales/timezone setup, user provisioning and shell profiles, plus example inventory presets
    • Multiple service templates, systemd units, periodic checks and helper scripts included
  • Chores

    • Added runtime dependencies and CI workflows
    • Normalized repository ignore rules (updated .gitignore)

@wlix13 wlix13 self-assigned this Apr 3, 2026
@wlix13 wlix13 added needs review Pull request needs code review priority: high High priority issue or PR feature New feature or request labels Apr 3, 2026

coderabbitai bot commented Apr 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds many new Pydantic models and "molds", host-fact utilities, template assets, pyinfra deployment runes, a feature/system merging utility, an orchestrator that normalizes host features/system and conditionally includes deployment runes, CI workflows, and example inventory and packaging updates.

Changes

Cohort / File(s) Summary
Orchestrator / Foundry
nullforge/foundry/full_cast.py
New cast_full() that ensures/normalizes features and system, then includes a fixed rune sequence and conditionally includes many runes based on feature flags; invoked at import time.
Models (core & feature types)
nullforge/models/...
nullforge/models/__init__.py, nullforge/models/users.py, nullforge/models/system.py, nullforge/models/zerotrust.py, nullforge/models/mtproto.py, nullforge/models/warp.py, nullforge/models/containers.py, nullforge/models/dns.py
Added enums and Pydantic models for shells, swap, zero-trust protocols, mtproto providers, warp engines, container backends, and an extensive DNS model with providers, discriminated unions, and factory helpers.
Molds (schemas & registry)
nullforge/molds/...
nullforge/molds/__init__.py, nullforge/molds/serializable.py, nullforge/molds/base.py, nullforge/molds/features.py, nullforge/molds/system.py, nullforge/molds/cloudflare.py, nullforge/molds/containers.py, nullforge/molds/dns.py, nullforge/molds/haproxy.py, nullforge/molds/mtproto.py, nullforge/molds/netsec.py, nullforge/molds/profiles.py, nullforge/molds/tor.py, nullforge/molds/user.py, nullforge/molds/utils.py, nullforge/molds/xray.py
New BaseMold, many feature/system mold classes, BASE_FEATURES/BASE_SYSTEM, utilities for merging/coercing features/system, and package-level re-exports.
Runes (pyinfra deployments)
nullforge/runes/...
nullforge/runes/prepare.py, nullforge/runes/base.py, nullforge/runes/users.py, nullforge/runes/profiles.py, nullforge/runes/swap.py, nullforge/runes/netsec.py, nullforge/runes/haproxy.py, nullforge/runes/dns.py, nullforge/runes/tor.py, nullforge/runes/warp.py, nullforge/runes/zerotrust.py, nullforge/runes/containers.py, nullforge/runes/cloudflare.py, nullforge/runes/mtproto.py, nullforge/runes/xray.py
New pyinfra deployment modules for preparation, base system, users/profiles, swap, netsec/ufw/sysctl, HAProxy, DNS (DoH/DoT), Tor, WARP (WireGuard/Masque), ZeroTrust tunnel, containers runtime, Cloudflare helpers, MTProto providers, and Xray; many runes execute at import.
Smithy (helpers & facts)
nullforge/smithy/...
nullforge/smithy/__init__.py, nullforge/smithy/admin.py, nullforge/smithy/arch.py, nullforge/smithy/http.py, nullforge/smithy/network.py, nullforge/smithy/packages.py, nullforge/smithy/system.py, nullforge/smithy/versions.py
Host-fact and utility helpers (is_root, arch normalization, has_ipv6, curl arg constants), distribution-aware PackageManager abstraction, locale detection, and Versions URL/version registry.
Templates & Scripts
nullforge/templates/...
Many new Jinja2 templates and executable scripts: cloudflared/tunnel, DNS/resolv/resolved, systemd units/timers (cloudflared, warp, mtg, telemt, healthcheck), zramswap, mtg/telemt config, profiles (zsh/tmux/starship/direnv), warp health/policy/zt-tunnel scripts, and Tor config.
Inventories & packaging
nullforge/inventories/example.py, .gitignore, pyproject.toml
Added example inventory module, updated .gitignore (activated .vscode/, ignore inventories/* with explicit unignore for example files), and added pydantic and pyinfra to project dependencies plus a Ruff ignore.
CI / Release workflows
.github/workflows/...
.github/workflows/ci-code-quality.yaml, .github/workflows/create-tag.yaml
New GitHub Actions workflows: code-quality checks and tag-triggered build/attest/publish/release pipeline.
Package exports / versioning
nullforge/runes/__init__.py
Added __version__ = "0.1.0".

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Host
participant Foundry as full_cast
participant Local as local.include
participant Prep as prepare.py
participant Base as base.py
participant Runes as Runes...
Note left of Foundry: ensures features/system
Host->>Foundry: provide host.data (features, system)
Foundry->>Foundry: ensure_features / ensure_system
Foundry->>Local: include nullforge/runes/prepare.py
Local->>Prep: load & execute prepare()
Foundry->>Local: include nullforge/runes/base.py, .../netsec.py
Local->>Base: load & execute deploy_base_system()
Local->>Runes: conditionally include/execute runes (users, profiles, dns, warp, containers, tor, xray, mtproto, zerotrust, haproxy)
Runes->>Host: apply changes via pyinfra operations (packages, files, systemd, facts)

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇
I hopped through molds and templates bright,
Wove runes and scripts in soft moonlight.
With Pydantic paws and templates spun,
The forge now hums—deployments run!



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the core infrastructure for NullForge, including configuration models, deployment runes, and utility modules for managing system components like DNS, containers, and shell profiles. The review identified several critical issues: the static-curl installation logic is fragile, the Neovim installation fails on ARM architectures due to file type mismatches, and the cast_full entry point executes immediately upon import. Additionally, security concerns were raised regarding plain-text password handling in shell commands, and minor code quality issues were noted, such as typos and incorrect function usage in the WARP deployment module.

Comment thread nullforge/runes/base.py

server.shell(
    name="Extract nvim appimage to target directory",
    commands=[f"cd /usr/bin && {nvim_appimage_path} --appimage-extract"],

high

This command will fail on arm64 architectures. While the file is saved as /tmp/nvim.appimage (line 438), Versions.nvim_appimage() returns a .tar.gz URL for ARM (line 147 in nullforge/smithy/versions.py). A tarball does not support the --appimage-extract flag. You should handle extraction based on the actual file type or ensure an AppImage is used for all supported architectures.
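The branch-on-file-type handling the comment asks for could be sketched like this (a hypothetical helper, not code from the PR; the command strings and install paths are illustrative):

```python
# Hypothetical sketch: pick install commands from the artifact's suffix instead
# of assuming every download is an AppImage. Names and paths are illustrative.
def extract_commands(artifact_path: str) -> list[str]:
    """Return shell commands that unpack an nvim artifact by its file type."""
    if artifact_path.endswith(".appimage"):
        # AppImages support self-extraction via --appimage-extract.
        return [f"cd /usr/bin && {artifact_path} --appimage-extract"]
    if artifact_path.endswith((".tar.gz", ".tgz")):
        # The ARM release ships as a tarball, which must be unpacked with tar.
        return [f"tar -xzf {artifact_path} -C /usr/local --strip-components=1"]
    raise ValueError(f"Unsupported nvim artifact: {artifact_path}")
```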

local.include("nullforge/runes/mtg.py")


cast_full()

medium

Calling cast_full() at the module level causes the deployment logic to execute immediately upon import. This makes the module difficult to reuse or test without triggering side effects. It is recommended to wrap this call in an if __name__ == "__main__": block.

Suggested change
cast_full()
if __name__ == "__main__":
    cast_full()

commands=[
    f"tar -zxf {tmux_tar_path} -C /tmp/ && rm -f {tmux_tar_path}",
    "cd /tmp/tmux-*/ && ./configure",
    "cd /tmp/tmux-*/ && make -j$(nproc) && sudo make install",

medium

Using sudo inside the shell command string is redundant when the operation is already configured with _sudo=True (line 410). It is better to rely on pyinfra's privilege escalation arguments.

Suggested change
"cd /tmp/tmux-*/ && make -j$(nproc) && sudo make install",
"cd /tmp/tmux-*/ && make -j$(nproc) && make install",

Comment thread nullforge/runes/users.py
server.shell(
    name=f"Set password for user {opts.name}",
    commands=[
        f"printf '%s:%s\\n' '{escaped_username}' '{escaped_password}' | chpasswd",

security-medium medium

Passing a plain-text password via printf in a shell command is insecure, as the password may be visible in the process list (e.g., via ps) or in system logs. It is safer to use pyinfra's server.user with a hashed password or to provide the password via a secure input stream that doesn't expose it to the command line arguments.

Comment thread nullforge/runes/warp.py
Comment on lines +18 to +19
def succes_condition():
    return True

medium

Typo in function name: succes_condition should be success_condition.

Suggested change
def succes_condition():
    return True
def success_condition():
    return True

Comment thread nullforge/runes/warp.py
    )
    config_condition = register_op.did_succeed
else:
    config_condition = succes_condition

medium

The function succes_condition is being assigned to config_condition as a function object instead of being called. This will likely cause issues when config_condition is used later (e.g., in _if arguments). You should call the function to get the boolean result.

Suggested change
config_condition = succes_condition
config_condition = success_condition()


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

🟠 Major comments (17)
nullforge/smithy/versions.py-26-33 (1)

26-33: ⚠️ Potential issue | 🟠 Major

Pin remote install-script sources to immutable revisions and verify integrity.

These URLs point to mutable install scripts/endpoints. For infrastructure code, this is a supply-chain risk unless pinned (tag/commit) and integrity-checked before execution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` around lines 26 - 33, The STATIC_URLS
dictionary contains mutable install script URLs (e.g., keys "starship_install",
"docker_install", "xray_install_script", "atuin_install",
"nerd_fonts_installer", "doggo_install") which should be replaced with pinned
immutable references and integrity metadata: update each URL to a specific tag
or commit SHA (or release asset URL) and add accompanying checksum or signature
checks where the script is fetched; modify the code that consumes STATIC_URLS to
validate the downloaded payload against the provided checksum or GPG signature
before execution, and store the checksum/signature alongside each key (or in a
parallel dict) so verify step can reference the correct value.
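A minimal sketch of the pin-and-verify pattern this finding describes, assuming a parallel dict of SHA-256 digests (the URL and digest below are illustrative, not the project's real values):

```python
# Sketch: keep an expected SHA-256 alongside each pinned URL and verify the
# fetched payload before anything executes it. Entries here are illustrative.
import hashlib

PINNED_SCRIPTS = {
    # URL pinned to a tag/commit -> expected SHA-256 of the script body
    "https://example.com/install.sh@v1.2.3": hashlib.sha256(
        b"#!/bin/sh\necho ok\n"
    ).hexdigest(),
}

def verify_script(url: str, payload: bytes) -> bytes:
    """Return the payload only if its digest matches the pinned checksum."""
    expected = PINNED_SCRIPTS[url]
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected:
        raise RuntimeError(f"Checksum mismatch for {url}: {actual} != {expected}")
    return payload
```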
nullforge/smithy/versions.py-19-19 (1)

19-19: ⚠️ Potential issue | 🟠 Major

Avoid latest for eza to keep deployments reproducible.

Line 19 and Line 100 hard-code latest, which makes builds non-deterministic and can introduce surprise breakage between runs.

Also applies to: 97-108

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` at line 19, The entries using the literal
"latest" for the "eza" key make builds non-deterministic; replace those
hard-coded "latest" values in the versions mapping with a pinned version
constant (e.g., EZA_VERSION) and use that constant wherever the "eza" key is set
(both occurrences in the versions dict / VERSIONS mapping around the "eza" key
and the similar block near lines ~97-108). Add a single source-of-truth constant
(EZA_VERSION) at the top of the module and reference it instead of the string
"latest" so deployments remain reproducible.
nullforge/smithy/arch.py-19-25 (1)

19-25: ⚠️ Potential issue | 🟠 Major

Fail fast on unknown architecture instead of silently defaulting to x86_64/amd64.

Line 25 and Line 37 currently coerce unknown or missing architecture to x86 defaults. That can install the wrong artifacts on non-x86 hosts and fail later during provisioning.

Suggested fix
 def arch_id(host: "Host") -> str:
@@
-    a = host.get_fact(Arch)
-    result = {
+    a = host.get_fact(Arch)
+    result = {
         "x86_64": "x86_64",
         "amd64": "x86_64",
         "aarch64": "arm64",
         "arm64": "arm64",
-    }.get(a, a or "x86_64")
+    }.get(a)
+    if result is None:
+        raise ValueError(f"Unsupported architecture fact: {a!r}")
@@
 def deb_arch(host: "Host") -> str:
@@
-    return {
+    mapped = {
         "x86_64": "amd64",
         "arm64": "arm64",
-    }.get(arch_id(host), "amd64")
+    }.get(arch_id(host))
+    if mapped is None:
+        raise ValueError(f"Unsupported Debian architecture for host: {arch_id(host)!r}")
+    return mapped

Also applies to: 34-37

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/arch.py` around lines 19 - 25, Replace the silent defaulting
behavior around host.get_fact(Arch) and the result mapping with a fail-fast
check: after retrieving a = host.get_fact(Arch) validate that a is present and
maps to a known canonical arch via the mapping (the current dict mapping
"x86_64"/"amd64"->"x86_64" and "aarch64"/"arm64"->"arm64"); if a is missing or
not in the mapping raise a clear exception (e.g., RuntimeError or ValueError)
that includes the original a/host identifying info instead of returning
"x86_64". Apply the same change to the analogous code at the other occurrence
(the block using the same mapping on lines ~34-37).
nullforge/smithy/system.py-28-33 (1)

28-33: ⚠️ Potential issue | 🟠 Major

Commented locales are being treated as supported locales.

Line 28–33 currently strips # and then appends the locale. In /etc/locale.gen, commented entries are disabled, so this can select unavailable locales and break locale configuration downstream.

Suggested fix
-                if line.startswith("#"):
-                    line = line[1:].strip()
+                if line.startswith("#"):
+                    continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/system.py` around lines 28 - 33, The parser currently strips
leading '#' and then treats commented entries as valid locales; change the logic
so that lines starting with '#' are skipped entirely (do not modify or append to
the locales list). In practice, update the block around line.startswith("#") to
continue/skip early before splitting/using the line so that the code that builds
'parts' and appends to 'locales' only runs for non-commented entries (refer to
the existing 'line.startswith("#")', 'parts = line.split()', and
'locales.append(...)' symbols).
nullforge/templates/__init__.py-13-19 (1)

13-19: ⚠️ Potential issue | 🟠 Major

Add path-boundary validation to prevent template path traversal.

template_name is joined directly and only checked with exists(). Inputs like ../... can escape the templates directory and resolve arbitrary files.

Suggested fix
 def get_template_path(template_name: str) -> str:
@@
-    templates_dir = Path(__file__).parent
-    template_path = templates_dir / template_name
-
-    if not template_path.exists():
+    templates_dir = Path(__file__).parent.resolve()
+    template_path = (templates_dir / template_name).resolve()
+    try:
+        template_path.relative_to(templates_dir)
+    except ValueError as exc:
+        raise FileNotFoundError(f"Template not found: {template_name}") from exc
+
+    if not template_path.is_file():
         raise FileNotFoundError(f"Template not found: {template_name}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/templates/__init__.py` around lines 13 - 19, The code currently
joins template_name into template_path and only checks exists(), allowing path
traversal like "../" to escape templates_dir; fix by resolving and validating
the boundary: after computing template_path = templates_dir / template_name,
call templates_dir_resolved = templates_dir.resolve() and template_path_resolved
= template_path.resolve(), then ensure template_path_resolved is inside
templates_dir_resolved (use Path.is_relative_to(templates_dir_resolved) or
compare pathlib.Path(templates_dir_resolved) in
str(template_path_resolved).startswith(str(templates_dir_resolved)) /
os.path.commonpath for older Python); also reject absolute template_name inputs
up-front and raise FileNotFoundError or a ValueError if the resolved path is
outside the templates directory before returning str(template_path_resolved).
nullforge/molds/mtg.py-19-22 (1)

19-22: ⚠️ Potential issue | 🟠 Major

Prevent empty MTProto secret when feature is enabled.

Line 20 defaults secret to ""; if install=True, this can produce an insecure or non-functional deployment. Enforce non-empty secret at model validation time (or fail early before rendering/deploying config).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/mtg.py` around lines 19 - 22, The secret Field currently
defaults to an empty string which allows creating a model with install=True and
an empty MTProto secret; add model validation to prevent that. In the pydantic
model that declares secret and install, add a validator (either
`@validator`("secret") or a `@root_validator`) that checks if self.install (or
values.get("install")) is True and secret is empty/whitespace, and raise a
ValueError with a clear message; reference the secret field and the install
attribute so the model fails early before rendering/deploying.
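The fail-early check could take roughly this shape; the sketch below uses a plain dataclass stand-in (`MtgOpts`) rather than the actual pydantic mold, where the same rule would live in a validator:

```python
# Dataclass stand-in for the mold; the real fix would be a pydantic validator
# on the model that declares `install` and `secret`.
from dataclasses import dataclass

@dataclass
class MtgOpts:
    install: bool = False
    secret: str = ""

    def __post_init__(self) -> None:
        # Fail early: an enabled MTProto proxy must have a non-empty secret.
        if self.install and not self.secret.strip():
            raise ValueError("MTProto secret must be set when install=True")
```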
nullforge/templates/scripts/zt-tunnel-warp.sh-20-23 (1)

20-23: ⚠️ Potential issue | 🟠 Major

Make route teardown idempotent in down mode.

With set -e, Line 21 fails hard when a route does not exist. That makes repeated or partial teardowns unreliable.

Proposed fix
   down)
     for cidr in $CIDRS; do
-      ip route del "$cidr" dev "$IFACE"
-      log "Removed route for $cidr via $IFACE"
+      if ip route del "$cidr" dev "$IFACE" 2>/dev/null; then
+        log "Removed route for $cidr via $IFACE"
+      else
+        log "Route for $cidr via $IFACE was already absent"
+      fi
     done
     ;;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/templates/scripts/zt-tunnel-warp.sh` around lines 20 - 23, The
teardown loop that runs "for cidr in $CIDRS; do ip route del \"$cidr\" dev
\"$IFACE\" ..." fails under set -e if a route is missing; make it idempotent by
either checking for the route before deleting (e.g., use "ip route show $cidr
dev $IFACE >/dev/null && ip route del $cidr dev $IFACE") or append "|| true" to
the ip route del command so the delete is allowed to fail harmlessly, and keep
the log invocation "log \"Removed route for $cidr via $IFACE\"" conditional only
when a delete actually occurred.
nullforge/smithy/http.py-3-15 (1)

3-15: ⚠️ Potential issue | 🟠 Major

Add --fail flag to fail on HTTP errors.

Current defaults retry and time out correctly, but they do not cause curl to fail on non-2xx/3xx HTTP responses. Without --fail, HTTP errors (4xx, 5xx) are treated as successful downloads, allowing bad payloads to be saved as valid files.

Add only --fail to CURL_ARGS. The --show-error flag is unnecessary (only useful with -s silent mode, which is not used here), and --location is redundant (all curl invocations already include the -L flag explicitly).

Proposed fix
 CURL_ARGS = {
+    "--fail": " ",
     "--compressed": " ",
     "--retry": "3",
     "--retry-connrefused": " ",
     "--connect-timeout": "10",
     "--max-time": "120",
     "--proto": "=https",
     "--tlsv1.2": " ",
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/http.py` around lines 3 - 15, CURL_ARGS lacks the --fail
flag so curl currently treats HTTP 4xx/5xx responses as successful; update the
CURL_ARGS dictionary to include "--fail": " " (i.e., add the "--fail" key with a
space value consistent with how flags are represented) so CURL_ARGS_STR will
include --fail in the generated command string; keep all other entries unchanged
(do not add --show-error or --location).
nullforge/molds/tor.py-15-22 (1)

15-22: ⚠️ Potential issue | 🟠 Major

Add port-range validation for Tor ports to prevent runtime failures.

The socks_port and dns_port fields currently accept any integer value. Invalid port numbers (outside the range 1–65535) pass config validation but cause the Tor service to fail at runtime during provisioning.

Add ge=1 and le=65535 constraints to both fields:

Proposed fix
     socks_port: int = Field(
         default=9050,
+        ge=1,
+        le=65535,
         description="The port to use for the Tor proxy",
     )
     dns_port: int = Field(
         default=5353,
+        ge=1,
+        le=65535,
         description="The port to use for the Tor proxy DNS",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/tor.py` around lines 15 - 22, socks_port and dns_port accept
any integer currently, causing invalid port numbers to slip through; add range
validation by updating the Field definitions for socks_port and dns_port in the
Tor config/model (the attributes named socks_port and dns_port) to include ge=1
and le=65535 so Pydantic validates ports are within 1–65535 before runtime.
nullforge/runes/swap.py-97-104 (1)

97-104: ⚠️ Potential issue | 🟠 Major

Pass algo parameter to ZRAM template call.

The zramswap.j2 template requires both size and algo parameters (line 4 uses ALGO={{ algo }}), but the template call only passes size. This will cause a template rendering error.

Template file content
# Algorithm to use.
ALGO={{ algo }}

# Specifies the amount of RAM to use as zram swap.
# If both PERCENT and SIZE are specified, SIZE takes precedence.
{% if size.endswith('%') %}
PERCENT={{ size.rstrip('%') }}
{% else %}
SIZE={{ size }}
{% endif %}

# Priority for the zram swap devices.
PRIORITY=100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/swap.py` around lines 97 - 104, The template call to
files.template in swap.py is missing the required algo parameter which causes
rendering errors; update the files.template invocation (the block that currently
passes name="Configure zram-tools", src=get_etc_template("default/zramswap.j2"),
dest="/etc/default/zramswap", mode="0644", size=size, _sudo=True) to also pass
algo=algo (or the correct variable holding the desired algorithm) so the
template receives both size and algo.
nullforge/runes/cloudflare.py-57-67 (1)

57-67: ⚠️ Potential issue | 🟠 Major

Downloaded binary lacks integrity verification.

The cloudflared binary is downloaded without checksum or signature verification. While HTTPS provides transport security, a compromised CDN or MITM at the download endpoint could serve a malicious binary. Consider verifying a SHA256 checksum against a known-good value.

Additionally, the temp file /tmp/cloudflared is not cleaned up after installation.

Suggested improvement with cleanup
     server.shell(
         name="Install cloudflared binary",
-        commands=["mv /tmp/cloudflared /usr/bin/cloudflared"],
+        commands=[
+            "mv /tmp/cloudflared /usr/bin/cloudflared",
+        ],
         _sudo=True,
     )

For checksum verification, consider adding a step that downloads and verifies against the official checksum file published by Cloudflare.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/cloudflare.py` around lines 57 - 67, The download/install
block using curl_cmd and server.shell for "Download cloudflared binary" and
"Install cloudflared binary" must verify integrity and clean up temp artifacts:
add a step to also fetch the official SHA256 (or signature) for
Versions(host).cloudflared(), compute and compare the checksum of
/tmp/cloudflared (fail and log error if mismatch), ensure the binary is made
executable (chmod +x /tmp/cloudflared), then move it to /usr/bin/cloudflared
with sudo (the existing mv in the "Install cloudflared binary" shell), and
finally remove any temp files like /tmp/cloudflared after success or failure.
Keep these checks in the same server.shell flow surrounding the curl_cmd and
reference Versions(host).cloudflared(), curl_cmd, and /tmp/cloudflared when
implementing.
nullforge/runes/tor.py-11-17 (1)

11-17: ⚠️ Potential issue | 🟠 Major

Add install guard to deploy_tor().

TorMold has an install: bool field (defaulting to False), but deploy_tor() unconditionally calls _install_tor() without checking it. The deploy_zerotrust_tunnel() function in zerotrust.py follows the correct pattern by checking if not tunnel_opts.install: return before proceeding. Add the same guard here to respect the install flag.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/tor.py` around lines 11 - 17, deploy_tor() currently always
calls _install_tor() even though TorMold includes an install: bool flag; add the
same guard pattern used in deploy_zerotrust_tunnel(): read tor_opts from
host.data.features.tor and if not tor_opts.install then return early, otherwise
call _install_tor(tor_opts) so the install flag is respected.
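Sketched with stand-in types (`TorOpts` and the `installer` callable below are illustrative, not the real TorMold or _install_tor), the guard pattern looks like:

```python
# Stand-in types to illustrate the early-return guard on the install flag.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class TorOpts:
    install: bool = False

def deploy_tor(
    opts: TorOpts,
    installer: Callable[[TorOpts], str] = lambda o: "installed",
) -> Optional[str]:
    """Run the Tor install steps only when the feature is enabled."""
    if not opts.install:
        return None  # respect the install flag, mirroring zerotrust.py
    return installer(opts)
```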
nullforge/runes/users.py-54-65 (1)

54-65: ⚠️ Potential issue | 🟠 Major

Password logged in process list during execution.

The password is passed as an argument to printf in a shell command. While the shell-quote escaping prevents injection, the password will be visible in the process list (ps aux) during execution. Consider using a more secure approach.

🔒 Recommended alternative using stdin
 def _set_user_password(opts: UserMold) -> None:
     """Set user password."""

-    escaped_username = opts.name.replace("'", "'\\''")
-    escaped_password = opts.password.replace("'", "'\\''")  # type: ignore[union-attr]
-    server.shell(
-        name=f"Set password for user {opts.name}",
-        commands=[
-            f"printf '%s:%s\\n' '{escaped_username}' '{escaped_password}' | chpasswd",
-        ],
-        _sudo=True,
-    )
+    # Use chpasswd with stdin to avoid password in process list
+    import io
+    password_input = f"{opts.name}:{opts.password}\n"
+    files.put(
+        name=f"Set password for user {opts.name}",
+        src=io.StringIO(password_input),
+        dest="/tmp/.nullforge_passwd",
+        mode="0600",
+        _sudo=True,
+    )
+    server.shell(
+        name=f"Apply password for user {opts.name}",
+        commands=[
+            "chpasswd < /tmp/.nullforge_passwd && rm -f /tmp/.nullforge_passwd",
+        ],
+        _sudo=True,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/users.py` around lines 54 - 65, The current
_set_user_password passes the password as a shell argument (in printf) which
exposes it in the process list; change _set_user_password to avoid including the
password in command arguments by invoking chpasswd directly and feeding
"username:password\n" via the command's stdin (or server.shell's input/stdin
parameter) instead of in the commands list; keep using escaped username
(opts.name) for safety, remove password from the command string, ensure
_sudo=True is preserved and pass the combined
"{escaped_username}:{opts.password}\n" only to stdin.
nullforge/runes/netsec.py-175-203 (1)

175-203: ⚠️ Potential issue | 🟠 Major

sed substitution won't add directives if they don't exist in the file.

The sed commands use substitution (s/pattern/replacement/) which only modifies existing lines. If PasswordAuthentication, PermitRootLogin, or UseDNS are not present in the config (even commented), the sed command will silently do nothing. Consider using a more robust approach that can append missing directives.

🔧 Recommended approach using grep + append
     if add_password_auth:
-        sed_modify = r"s/^[[:space:]]*#?[[:space:]]*PasswordAuthentication[[:space:]]+.*/PasswordAuthentication no/"
         server.shell(
             name="Modify SSH password authentication",
             commands=[
-                f"sed -i -E '{sed_modify}' {ssh_config_path}",
+                f"grep -qE '^[[:space:]]*#?[[:space:]]*PasswordAuthentication' {ssh_config_path} && "
+                f"sed -i -E 's/^[[:space:]]*#?[[:space:]]*PasswordAuthentication[[:space:]]+.*/PasswordAuthentication no/' {ssh_config_path} || "
+                f"echo 'PasswordAuthentication no' >> {ssh_config_path}",
             ],
             _sudo=True,
         )

Apply the same pattern for PermitRootLogin and UseDNS.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/netsec.py` around lines 175 - 203, The sed substitutions in
the add_password_auth/add_root_login/add_usedns blocks (variables sed_modify,
functions calling server.shell) only change existing lines and will do nothing
if the directives are absent; change each server.shell command to a robust
pattern that first tries to replace the directive and if not present appends the
desired directive to ssh_config_path (e.g., use grep -q to check for the
directive and echo "Directive value" >> ssh_config_path when missing, or use awk
to perform replace-or-append), applying this pattern for PasswordAuthentication,
PermitRootLogin, and UseDNS so the configuration is ensured to exist.
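The replace-or-append variant the prompt mentions can be sketched with awk (file contents and directive values below are illustrative):

```shell
# Illustrative replace-or-append: rewrite an existing (possibly commented)
# directive in place, or append it if absent.
cfg=$(mktemp)
printf '%s\n' '#PasswordAuthentication yes' 'UseDNS yes' > "$cfg"

awk -v key="PasswordAuthentication" -v val="no" '
    $0 ~ "^[[:space:]]*#?[[:space:]]*" key "([[:space:]]|$)" {
        print key, val; seen = 1; next
    }
    { print }
    END { if (!seen) print key, val }
' "$cfg" > "$cfg.new" && mv "$cfg.new" "$cfg"
```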
nullforge/runes/warp.py-67-71 (1)

67-71: ⚠️ Potential issue | 🟠 Major

Missing _sudo=True for setting executable permissions.

The wgcf binary was moved to /usr/local/bin with sudo, but the subsequent files.file operation to set permissions lacks _sudo=True, which may cause a permission denied error.

🔧 Proposed fix
     files.file(
         name="Set wgcf binary as executable",
         path=opts.engine.binary_path,
         mode="0755",
+        _sudo=True,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/warp.py` around lines 67 - 71, The files.file call that sets
executable permissions for the wgcf binary (files.file with name "Set wgcf
binary as executable" using path opts.engine.binary_path) needs to run with
elevated privileges; add the _sudo=True argument to that files.file invocation
so the chmod operation succeeds when the binary is in /usr/local/bin.
nullforge/runes/base.py-195-198 (1)

195-198: ⚠️ Potential issue | 🟠 Major

Validate system.timezone to prevent shell injection.

The timezone value is passed directly to a shell command without validation. The timezone field in SystemMold accepts arbitrary strings, and server.shell() does not escape command content. A malicious value like UTC; rm -rf / could execute arbitrary commands with sudo privileges.

Add validation to the timezone field:

Suggested fix
 class SystemMold(BaseMold):
     """Full system configuration mold."""
 
     timezone: str = Field(
         default_factory=_default_timezone,
         description="System timezone (e.g. 'UTC' or 'Europe/Amsterdam')",
     )
+
+    @field_validator("timezone")
+    @classmethod
+    def _validate_timezone(cls, v: str) -> str:
+        """Validate timezone format."""
+        if not v or not all(c.isalnum() or c in "/_+-" for c in v):
+            raise ValueError(f"Invalid timezone format: {v}")
+        return v
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/base.py` around lines 195 - 198, The timezone value is being
interpolated directly into server.shell(...) (see server.shell,
SystemMold.system.timezone) and must be validated to prevent shell injection;
update the code to validate system.timezone against a safe list of IANA
timezones (e.g., using zoneinfo.available_timezones() or a maintained whitelist)
before passing it to the shell, or avoid shell interpolation entirely by using a
call path that accepts arguments (or safely shell-escapes the value) when
invoking timedatectl; reject or raise a validation error for any timezone not in
the approved set and only pass validated values to server.shell.
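The zoneinfo-based whitelist the prompt suggests might look like this (standalone sketch; assumes Python 3.9+ with the system tz database available):

```python
from zoneinfo import available_timezones


def validate_timezone(tz: str) -> str:
    """Accept only names present in the IANA tz database."""
    # Membership in the tz database rules out shell metacharacters entirely,
    # so the validated value is safe to interpolate into timedatectl calls.
    if tz not in available_timezones():
        raise ValueError(f"Invalid timezone: {tz!r}")
    return tz
```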
nullforge/runes/profiles.py-177-181 (1)

177-181: ⚠️ Potential issue | 🟠 Major

Remove relative path - use absolute path for zip file.

The files.file operation removes "FiraCode.zip" using a relative path, which is unreliable. Based on how the nerd-fonts-installer script works, the file is downloaded to a temporary directory. The removal should use an absolute path and include the same privilege escalation parameters as the installation:

🔧 Suggested fix
     files.file(
         name="Remove FiraCode NerdFont zip file",
-        path="FiraCode.zip",
+        path="/tmp/FiraCode.zip",
         present=False,
+        _sudo=True,
+        _sudo_user=user,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/profiles.py` around lines 177 - 181, The files.file call that
removes "FiraCode.zip" uses a relative path; change the call that invokes
files.file(name="Remove FiraCode NerdFont zip file", path="FiraCode.zip",
present=False) to use the absolute temp path where the installer downloaded the
zip (match the temp dir used by the nerd-fonts-installer logic) and include the
same privilege escalation parameters (e.g., become, become_user or sudo-related
kwargs) used during the installation step so the removal runs with the same
privileges.
🟡 Minor comments (15)
nullforge/smithy/system.py-50-53 (1)

50-53: ⚠️ Potential issue | 🟡 Minor

Guard empty preferred locale values before split()[0].

Line 50 can raise IndexError for empty/whitespace entries in preferred.

Suggested fix
     if preferred:
         for pref in preferred:
+            pref = pref.strip()
+            if not pref:
+                continue
             if pref in supported:
                 return pref
             pref_name = pref.split()[0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/system.py` around lines 50 - 53, Guard against
empty/whitespace entries in the loop over preferred before calling
pref.split()[0]: inside the loop that iterates over preferred (the variable
pref) trim/validate the string (e.g., pref = pref.strip()) and skip if it's
empty (continue) so pref.split()[0] cannot raise IndexError; then extract the
locale name (pref_name) and keep the existing logic that checks supported and
returns supp when supp.startswith(pref_name).
nullforge/models/containers.py-49-58 (1)

49-58: ⚠️ Potential issue | 🟡 Minor

Rename factory argument type and add an explicit default branch.

The parameter name type at line 49 shadows Python's builtin type, a code quality issue. Additionally, the match statement lacks an explicit default case, so an unexpected value at runtime would make the function implicitly return None. While ContainersBackendType is a closed enum with only three values, an explicit fallback improves defensive programming and provides clearer error handling.

Suggested fix
-def containers_backend_factory(type: ContainersBackendType) -> ContainersBackend:
+def containers_backend_factory(backend_type: ContainersBackendType) -> ContainersBackend:
     """Factory function for containers backends."""
 
-    match type:
+    match backend_type:
         case ContainersBackendType.DOCKER:
             return DockerContainersBackend()
         case ContainersBackendType.PODMAN:
             return PodmanContainersBackend()
         case ContainersBackendType.CRIO:
             return CrioContainersBackend()
+        case _:
+            raise ValueError(f"Unsupported containers backend type: {backend_type}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/models/containers.py` around lines 49 - 58, Rename the parameter in
containers_backend_factory from the shadowing name "type" to something like
"backend_type" (or similar) and update all references; then add an explicit
fallback branch in the match on ContainersBackendType (after the existing cases
for ContainersBackendType.DOCKER, .PODMAN, .CRIO) that raises a clear error
(e.g., ValueError or TypeError) explaining the invalid backend_type to avoid
implicitly returning None; ensure constructors DockerContainersBackend(),
PodmanContainersBackend(), and CrioContainersBackend() are returned in the
listed cases and that the raised error includes the provided backend_type value
for debugging.
nullforge/models/warp.py-61-66 (1)

61-66: ⚠️ Potential issue | 🟡 Minor

Rename factory argument type to avoid shadowing Python's builtin.

Line 61 shadows Python's builtin type. While WarpEngineType is an exhaustive enum (all cases are covered), adding a default case provides defensive error handling for unexpected inputs.

Suggested fix
-def warp_engine_factory(type: WarpEngineType) -> WarpEngine:
-    match type:
+def warp_engine_factory(engine_type: WarpEngineType) -> WarpEngine:
+    match engine_type:
         case WarpEngineType.MASQUE:
             return MasqueWarpEngine()
         case WarpEngineType.WIREGUARD:
             return WireguardWarpEngine()
+        case _:
+            raise ValueError(f"Unsupported warp engine type: {engine_type}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/models/warp.py` around lines 61 - 66, The factory function
warp_engine_factory currently shadows the builtin name "type" and lacks a
defensive default; rename the parameter (e.g., engine_type or warp_type) to
avoid builtin shadowing and update all references inside the function, and add a
default branch to the match (or a final else) that raises a clear error (e.g.,
ValueError) when the input isn't a handled WarpEngineType; keep the existing
cases returning MasqueWarpEngine() and WireguardWarpEngine() unchanged.
nullforge/templates/systemd/cloudflare-dns.service.j2-3-4 (1)

3-4: ⚠️ Potential issue | 🟡 Minor

Add After=network-online.target to prevent startup race.

The unit Wants=network-online.target but doesn't include After=network-online.target. Without After=, systemd may start the service before the network is fully online, causing cloudflared to fail on startup.

🔧 Proposed fix
 [Unit]
 Description=Cloudflare DNS-over-HTTPS proxy
 Wants=network-online.target nss-lookup.target
 Before=nss-lookup.target
+After=network-online.target
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/templates/systemd/cloudflare-dns.service.j2` around lines 3 - 4,
The systemd unit currently declares Wants=network-online.target but not
ordering, risking startup before networking is ready; update the
cloudflare-dns.service unit (the template that defines
Wants=network-online.target and Before=nss-lookup.target) to also include
After=network-online.target so the service is ordered to start only after the
network-online target is reached, ensuring cloudflared starts after networking
is up.
nullforge/smithy/admin.py-12-22 (1)

12-22: ⚠️ Potential issue | 🟡 Minor

Add .strip() to the Command fact comparison.

The whoami command outputs a username followed by a newline. Without .strip(), the comparison == "root" will always fail because the returned value is "root\n". All other uses of Command fact comparisons in the codebase explicitly call .strip() before equality checks (e.g., base.py:192, netsec.py:103). Change line 19 to:

result = host.get_fact(Command, command="whoami").strip() == "root"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/admin.py` around lines 12 - 22, The is_root(host: "Host")
function currently compares the Command fact result to "root" but doesn't strip
the trailing newline; update the comparison in the assignment to result so it
strips whitespace: call host.get_fact(Command, command="whoami").strip() ==
"root" (keep caching via _nullforge_is_root on host.data and retain the rest of
is_root unchanged).
nullforge/templates/profiles/zshrc.j2-1-1 (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Inconsistent home directory reference.

Line 1 uses the Jinja2 template variable {{ home }}, while line 16 uses the shell variable $HOME. If the deployment user's home differs from $HOME at runtime (e.g., deploying to a different user), this could cause Atuin initialization to fail.

Consider using {{ home }} consistently:

-. "$HOME/.atuin/bin/env"
+. "{{ home }}/.atuin/bin/env"

Also applies to: 16-16

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/templates/profiles/zshrc.j2` at line 1, The template mixes Jinja2
{{ home }} and the shell $HOME which is inconsistent; change all occurrences of
the shell variable $HOME in this template to the Jinja2 variable {{ home }}
(e.g., the existing export ZSH="{{ home }}/.oh-my-zsh" and the Atuin
initialization line that currently uses $HOME) so the generated zshrc
consistently references the deployment user's home provided to the template;
update any other $HOME uses in this file to {{ home }} as well.
nullforge/runes/mtg.py-133-140 (1)

133-140: ⚠️ Potential issue | 🟡 Minor

Inconsistent restart behavior with daemon_reload.

daemon_reload correctly uses _if=service_template.did_change, but systemd.service with restarted=True will restart the service unconditionally on every run. This is inconsistent—if the service unit didn't change, there may be no need to restart.

Suggested fix for conditional restart
     systemd.service(
         name="Enable and start mtg service",
         service="mtg",
         running=True,
-        restarted=True,
+        restarted=service_template.did_change,
         enabled=True,
         _sudo=True,
     )

Note: You may also want to restart when the config template changes. In that case, capture the config template result and use any_changed() similar to zerotrust.py.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtg.py` around lines 133 - 140, The systemd.service call
named "Enable and start mtg service" currently uses restarted=True which forces
a restart every run; change it to restart conditionally by removing
restarted=True and instead set _if=service_template.did_change (or combine with
a config template result using any_changed() like zerotrust.py) so the service
is only restarted when the unit or related config actually changed (refer to
daemon_reload, service_template, any_changed and the "mtg" service name to
locate and wire the conditional).
nullforge/runes/zerotrust.py-77-85 (1)

77-85: ⚠️ Potential issue | 🟡 Minor

Conditional guard prevents ensuring service state when nothing changed.

The _if=any_changed(...) guards the entire systemd.service operation. If no templates changed, pyinfra won't ensure the service is running or enabled. This means a manually stopped service won't be restarted on subsequent deployments.

Consider separating the restart condition from the ensure-running logic:

Suggested fix
     systemd.service(
         name="Enable and start Zero Trust Tunnel service",
         service="cloudflare-tunnel",
         running=True,
-        restarted=True,
+        restarted=any_changed(config_template, service_template),
         enabled=True,
         _sudo=True,
-        _if=any_changed(config_template, service_template),
     )

This ensures the service is always running and enabled, but only restarts when config changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/zerotrust.py` around lines 77 - 85, The systemd.service call
is currently guarded by _if=any_changed(config_template, service_template) which
prevents ensuring the service is running/enabled when nothing changed; split
this into two operations: keep one systemd.service invocation for "Ensure Zero
Trust Tunnel service is running and enabled" with service="cloudflare-tunnel",
running=True, enabled=True, _sudo=True and no _if guard, and add a separate
systemd.service (or a targeted restart operation) for "Restart Zero Trust Tunnel
service on config change" with restarted=True, service="cloudflare-tunnel",
_sudo=True and _if=any_changed(config_template, service_template) so the service
is always ensured but only restarted when templates change.
nullforge/runes/tor.py-40-47 (1)

40-47: ⚠️ Potential issue | 🟡 Minor

Unconditional restart on every deployment run.

Setting restarted=True without conditioning on template changes will restart the Tor service on every deployment, potentially disrupting active connections. Other runes in this PR (e.g., mtg.py) use _if=service_template.did_change to restart only when configuration changes.

Suggested fix to restart only on config change
-    files.template(
+    config_template = files.template(
         name="Deploy Tor proxy configuration",
         src=get_tor_template("torrc.j2"),
         dest="/etc/tor/torrc",
         mode="0644",
         SOCKS_PORT=opts.socks_port,
         DNS_PORT=opts.dns_port,
         _sudo=True,
     )

     systemd.service(
         name="Enable and start Tor",
         service="tor",
         running=True,
-        restarted=True,
+        restarted=config_template.did_change,
         enabled=True,
         _sudo=True,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/tor.py` around lines 40 - 47, The systemd.service call in
tor.py currently uses restarted=True unconditionally which causes Tor to restart
on every deployment; change the call to conditionally restart by adding an _if
that checks the related template's did_change flag (e.g.,
_if=service_template.did_change) so that restarted=True only runs when the Tor
configuration template actually changed; update the systemd.service invocation
(systemd.service(..., restarted=True, ...)) to include the _if condition
referenced from the service template object used for Tor.
nullforge/runes/netsec.py-229-230 (1)

229-230: ⚠️ Potential issue | 🟡 Minor

Sysctl persistence skipped if file exists, regardless of content.

If the persist file exists but contains different/outdated sysctls, the function skips applying new values. This could lead to configuration drift. Consider comparing content or always applying sysctls (pyinfra should be idempotent).

🔧 Suggested approach
     for desc, sysctls, persist_file in persist_map:
         if not sysctls:
             continue

-        if host.get_fact(File, path=persist_file):
-            continue
-
         for key, value in sysctls.items():
             server.sysctl(
                 name=f"Set sysctl {key} ({desc})",
                 key=key,
                 value=str(value),
                 persist=True,
                 persist_file=persist_file,
                 _sudo=True,
             )

Let pyinfra's server.sysctl handle idempotency rather than skipping based on file existence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/netsec.py` around lines 229 - 230, The current check using
host.get_fact(File, path=persist_file) in the sysctl persistence logic causes
the routine to skip applying sysctls whenever the file at persist_file exists,
even if its contents are outdated; change this so we no longer short-circuit on
file existence—either read and compare the file contents against the desired
sysctl lines or (preferably) always invoke pyinfra's server.sysctl to let it
perform idempotent updates; update the code around host.get_fact(File,
path=persist_file) and the call sites for server.sysctl to remove the
unconditional continue and ensure server.sysctl is called with the desired
sysctl mapping so pyinfra handles idempotency.
nullforge/runes/haproxy.py-139-148 (1)

139-148: ⚠️ Potential issue | 🟡 Minor

ACL on haproxy.cfg may fail if file doesn't exist after fresh install.

setfacl -m u:{username}:rw {haproxy_dir}/haproxy.cfg will fail if the file doesn't exist. On a fresh HAProxy install, the config file should exist, but consider adding a check or using -R for recursive ACLs if the file might be absent.

🛡️ Proposed defensive check
     if f"user:{username}:rwx" not in acl_haproxy:
+        haproxy_cfg = f"{haproxy_dir}/haproxy.cfg"
+        acl_cmds = [
+            f"setfacl -m u:{username}:rwx {haproxy_dir}",
+            f"setfacl -d -m u:{username}:rwx {haproxy_dir}",
+        ]
+        # Only set ACL on config file if it exists
+        if host.get_fact(File, path=haproxy_cfg):
+            acl_cmds.append(f"setfacl -m u:{username}:rw {haproxy_cfg}")
         server.shell(
             name="Set ACLs on HAProxy config directory",
-            commands=[
-                f"setfacl -m u:{username}:rwx {haproxy_dir}",
-                f"setfacl -d -m u:{username}:rwx {haproxy_dir}",
-                f"setfacl -m u:{username}:rw {haproxy_dir}/haproxy.cfg",
-            ],
+            commands=acl_cmds,
             _sudo=True,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/haproxy.py` around lines 139 - 148, The ACL command may fail
if haproxy.cfg doesn't exist; in the block that checks f"user:{username}:rwx" in
acl_haproxy and calls server.shell with name "Set ACLs on HAProxy config
directory", ensure the file exists before running setfacl on
{haproxy_dir}/haproxy.cfg (or avoid touching the file directly) — either add a
pre-check to create an empty {haproxy_dir}/haproxy.cfg if missing, or replace
the individual file ACL with a recursive/default ACL (e.g., apply -R and -d to
the directory) so setfacl won't error on fresh installs.
nullforge/runes/containers.py-88-88 (1)

88-88: ⚠️ Potential issue | 🟡 Minor

Static analysis flags /tmp usage as potentially insecure.

The Docker install script is downloaded to /tmp/get-docker.sh. While this is a common pattern, a predictable path in /tmp can be subject to symlink attacks if an attacker has local access. For deployment scripts running as root, consider using mktemp for a unique filename.

🛡️ Proposed safer approach
 def _install_docker() -> None:
     """Install Docker using official installation script."""

-    get_docker_path = "/tmp/get-docker.sh"
-    curl_cmd = f"curl -L {CURL_ARGS_STR} {STATIC_URLS['docker_install']} -o {get_docker_path}"
     server.shell(
-        name="Download Docker installation script",
-        commands=[curl_cmd],
-    )
-
-    files.file(
-        name="Set Docker installation script as executable",
-        path=get_docker_path,
-        mode="0755",
-    )
-
-    server.shell(
-        name="Install Docker",
-        commands=[f"bash {get_docker_path}"],
+        name="Install Docker via official script",
+        commands=[
+            f"curl -fsSL {STATIC_URLS['docker_install']} | bash",
+        ],
         _sudo=True,
     )

Note: Piping to bash has its own security considerations but avoids the temp file issue.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/containers.py` at line 88, The hardcoded temp path
get_docker_path = "/tmp/get-docker.sh" is insecure; replace it with a securely
created temporary file instead (use mktemp or Python's
tempfile.NamedTemporaryFile/TemporaryDirectory to generate a unique,
non-predictable filename) and update any code that writes to or executes the
file (references to get_docker_path) to use the returned secure path and to
clean up the file after use; ensure the temp file is created with restrictive
permissions and not vulnerable to symlink races before invoking shell execution.
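If the temp-file route is kept rather than piping, Python's tempfile module gives a unique, owner-only path (sketch only; the download and execution steps are elided):

```python
import os
import stat
import tempfile

# mkstemp creates a unique file atomically with mode 0600, so a symlink
# pre-planted at a predictable path like /tmp/get-docker.sh cannot be hit.
fd, script_path = tempfile.mkstemp(prefix="get-docker-", suffix=".sh")
os.close(fd)
mode = stat.S_IMODE(os.stat(script_path).st_mode)
# The rune would then curl the install script into script_path and run it.
```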
nullforge/molds/system.py-96-100 (1)

96-100: ⚠️ Potential issue | 🟡 Minor

Redundant Annotated wrapper with conlist in Pydantic v2.

In Pydantic v2, conlist already returns an Annotated type. Wrapping it in another Annotated is redundant. Use either conlist(str, min_length=1) directly as the type annotation, or use list[str] with Field(min_length=1).

🔧 Proposed fix using Field constraints
-    packages_base: Annotated[list[str], conlist(str, min_length=1)] = Field(
+    packages_base: list[str] = Field(
         default_factory=_default_packages_base,
         description="System-wide base packages to install",
+        min_length=1,
     )
-    locales: Annotated[list[str], conlist(str, min_length=1)] = Field(
+    locales: list[str] = Field(
         default_factory=_default_locales,
         description="Locales to generate",
+        min_length=1,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/system.py` around lines 96 - 100, The type annotations for
packages_base and locales use a redundant outer Annotated wrapper around
conlist; replace the Annotated[list[str], conlist(str, min_length=1)] for
packages_base and locales with conlist(str, min_length=1) (or alternatively use
list[str] plus Field(..., min_length=1)) so the field types are the
conlist-returned Annotated type directly and remove the extra Annotated wrapper;
update the annotations on the packages_base and locales declarations (and keep
the existing Field(...) args like default_factory/description).
nullforge/runes/warp.py-18-19 (1)

18-19: ⚠️ Potential issue | 🟡 Minor

Typo in function name: succes_condition should be success_condition.

✏️ Suggested fix
-def succes_condition():
+def success_condition():
     return True

Note: Also update the reference on line 178 to use the corrected name.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/warp.py` around lines 18 - 19, Rename the misspelled function
succes_condition to success_condition and update all references accordingly:
change the function definition named succes_condition() to success_condition()
and update the call/site that references it (the reference noted around line
178) to use success_condition(); ensure tests/imports/any internal callers use
the new name so no NameError occurs.
nullforge/runes/profiles.py-50-59 (1)

50-59: ⚠️ Potential issue | 🟡 Minor

_install_starship is system-wide but called inside per-user loop.

_install_starship(reinstall) installs to /usr/bin/starship or /usr/local/bin/starship, which are system-wide locations. Calling it inside the per-user _install_user_profiles loop means it's invoked multiple times unnecessarily (once per user).

♻️ Suggested fix: Move starship installation to system-level
 def deploy_shell_profiles() -> None:
     """Deploy shell profiles and tools configuration."""

     features: FeaturesMold = host.data.features
     profiles_opts: ProfilesMold = features.profiles
     reinstall = profiles_opts.reinstall

     _install_eza(reinstall)

     _install_tmux(reinstall)

     _install_nvim(reinstall)

+    _install_starship(reinstall)
+
     for user, home_dir in _get_profile_targets(features):
         _install_user_profiles(user, home_dir, reinstall)


 def _install_user_profiles(user: str, home_dir: str, reinstall: bool) -> None:
     """Configure user profile."""

     _configure_user_oh_my_zsh(user, home_dir, reinstall)
     _configure_user_shell_profiles(user, home_dir)
     _install_firacode_font(user, home_dir, reinstall)
     _install_user_tmux(user, home_dir, reinstall)
     _install_user_nvim(user, home_dir, reinstall)
     _install_atuin(user, home_dir, reinstall)
-    _install_starship(reinstall)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/profiles.py` around lines 50 - 59, _install_starship is a
system-wide installer but is being called inside the per-user helper
_install_user_profiles, causing repeated installs; remove the
_install_starship(reinstall) call from _install_user_profiles and invoke
_install_starship once at the system-level (e.g., in the top-level setup or the
function that iterates users) so it runs a single time; ensure the single call
still passes the reinstall flag and that _install_starship itself performs
idempotent checks (existing /usr/bin or /usr/local/bin) before writing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ce1223dd-7519-4c4b-b591-3b4c5d9b13a7

📥 Commits

Reviewing files that changed from the base of the PR and between 278cd4c and ff2388b.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (75)
  • .gitignore
  • nullforge/foundry/README.md
  • nullforge/foundry/full_cast.py
  • nullforge/inventories/README.md
  • nullforge/inventories/example.py
  • nullforge/models/__init__.py
  • nullforge/models/containers.py
  • nullforge/models/dns.py
  • nullforge/models/system.py
  • nullforge/models/users.py
  • nullforge/models/warp.py
  • nullforge/models/zerotrust.py
  • nullforge/molds/__init__.py
  • nullforge/molds/base.py
  • nullforge/molds/cloudflare.py
  • nullforge/molds/containers.py
  • nullforge/molds/dns.py
  • nullforge/molds/features.py
  • nullforge/molds/haproxy.py
  • nullforge/molds/mtg.py
  • nullforge/molds/netsec.py
  • nullforge/molds/profiles.py
  • nullforge/molds/serializable.py
  • nullforge/molds/system.py
  • nullforge/molds/tor.py
  • nullforge/molds/user.py
  • nullforge/molds/utils.py
  • nullforge/molds/xray.py
  • nullforge/runes/__init__.py
  • nullforge/runes/base.py
  • nullforge/runes/cloudflare.py
  • nullforge/runes/containers.py
  • nullforge/runes/dns.py
  • nullforge/runes/haproxy.py
  • nullforge/runes/mtg.py
  • nullforge/runes/netsec.py
  • nullforge/runes/prepare.py
  • nullforge/runes/profiles.py
  • nullforge/runes/swap.py
  • nullforge/runes/tor.py
  • nullforge/runes/users.py
  • nullforge/runes/warp.py
  • nullforge/runes/xray.py
  • nullforge/runes/zerotrust.py
  • nullforge/smithy/__init__.py
  • nullforge/smithy/admin.py
  • nullforge/smithy/arch.py
  • nullforge/smithy/http.py
  • nullforge/smithy/network.py
  • nullforge/smithy/packages.py
  • nullforge/smithy/system.py
  • nullforge/smithy/versions.py
  • nullforge/templates/__init__.py
  • nullforge/templates/cloudflared/tunnel.yml.j2
  • nullforge/templates/dns/dns.yaml.j2
  • nullforge/templates/dns/resolv.conf.j2
  • nullforge/templates/dns/resolved.conf.j2
  • nullforge/templates/etc/default/zramswap.j2
  • nullforge/templates/mtg/config.toml.j2
  • nullforge/templates/nvim/nvim_patch.lua.j2
  • nullforge/templates/profiles/direnv.toml
  • nullforge/templates/profiles/starship.toml
  • nullforge/templates/profiles/tmux.conf
  • nullforge/templates/profiles/zshrc.j2
  • nullforge/templates/scripts/warp-check.sh
  • nullforge/templates/scripts/warp-v6-policy.sh
  • nullforge/templates/scripts/zt-tunnel-warp.sh
  • nullforge/templates/systemd/cloudflare-dns.service.j2
  • nullforge/templates/systemd/cloudflare-tunnel.service.j2
  • nullforge/templates/systemd/cloudflare-warp-check.service.j2
  • nullforge/templates/systemd/cloudflare-warp-check.timer.j2
  • nullforge/templates/systemd/cloudflare-warp.service.j2
  • nullforge/templates/systemd/mtg.service.j2
  • nullforge/templates/tor/torrc.j2
  • pyproject.toml

Comment thread nullforge/runes/base.py
Comment on lines +40 to +51
    with suppress(Exception):
        ip_fact = host.get_fact(Command, "hostname -I | awk '{print $1}'")
        ip = ip_fact.strip() if ip_fact and ip_fact.strip() else "127.0.1.1"

    if host.get_fact(Hostname) != hostname:
        server.hostname(
            name=f"Set system hostname to {hostname}",
            hostname=hostname,
            _sudo=True,
        )

    host_entry = f"{ip} {hostname} {short_hostname}"

⚠️ Potential issue | 🔴 Critical

Potential NameError: ip variable may be undefined.

If host.get_fact(Command, ...) raises an exception, the suppress(Exception) block catches it but ip is never assigned. This causes a NameError on line 51 when constructing host_entry.

🐛 Proposed fix
-    with suppress(Exception):
-        ip_fact = host.get_fact(Command, "hostname -I | awk '{print $1}'")
-        ip = ip_fact.strip() if ip_fact and ip_fact.strip() else "127.0.1.1"
+    ip = "127.0.1.1"  # Default fallback
+    with suppress(Exception):
+        ip_fact = host.get_fact(Command, "hostname -I | awk '{print $1}'")
+        if ip_fact and ip_fact.strip():
+            ip = ip_fact.strip()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/base.py` around lines 40 - 51, The code uses a suppressed
exception block around host.get_fact(...) so if an exception occurs ip is never
assigned causing a NameError when building host_entry; fix by ensuring ip is
initialized to the default value before the with suppress(Exception) block (or
assign the default in an except/fallback) so that ip always has a value; update
the logic around host.get_fact(Command, ...) / ip_fact / ip to set ip =
"127.0.1.1" by default, then overwrite it with ip_fact.strip() when available,
leaving server.hostname(...) and host_entry construction unchanged.
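The failure mode is reproducible in plain Python. In this sketch, `risky_lookup` is a hypothetical stand-in for `host.get_fact`; the fixed version assigns the fallback before entering the `suppress` block:

```python
from contextlib import suppress


def risky_lookup() -> str:
    # Stand-in for host.get_fact(Command, ...); may raise on a bad host.
    raise RuntimeError("fact gathering failed")


def get_ip_buggy() -> str:
    # Buggy pattern: if risky_lookup() raises, `ip` is never bound.
    with suppress(Exception):
        ip_fact = risky_lookup()
        ip = ip_fact.strip() if ip_fact.strip() else "127.0.1.1"
    return ip  # NameError (UnboundLocalError) when the lookup raised


def get_ip_fixed() -> str:
    # Fixed pattern: the default exists before the suppressed block runs.
    ip = "127.0.1.1"
    with suppress(Exception):
        ip_fact = risky_lookup()
        if ip_fact.strip():
            ip = ip_fact.strip()
    return ip
```

Running `get_ip_buggy()` raises at the `return`, while `get_ip_fixed()` falls back cleanly.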

Comment thread nullforge/runes/containers.py
Comment on lines +75 to +82
    else:
        raise ValueError("Unsupported distribution for runsc installation")

    pm.install(
        name="Install gVisor",
        packages=["runsc"],
        _sudo=True,
    )

⚠️ Potential issue | 🔴 Critical

gVisor package install runs unconditionally after Debian-only repository setup.

The pm.install for runsc (lines 78-82) is outside the if pm.is_debian_based block, but the gVisor repository was only configured for Debian-based systems. On non-Debian systems, this will fail because the repository wasn't added.

🐛 Proposed fix
     if pm.is_debian_based:
         gvisor_key_path = "/usr/share/keyrings/gvisor-archive-keyring.gpg"
         curl_cmd = f"curl -L {CURL_ARGS_STR} {GPG_KEYS['gvisor']} -o {gvisor_key_path}"
         server.shell(
             name="Download gVisor GPG key",
             commands=[curl_cmd],
             _sudo=True,
         )

         gvisor_sources_list = "/etc/apt/sources.list.d/gvisor.list"
         arch = deb_arch(host)
         source = (
             f"deb [arch={arch} signed-by={gvisor_key_path}] "
             "https://storage.googleapis.com/gvisor/releases release main\n"
         )
         files.put(
             name="Write gVisor repository source",
             src=io.StringIO(source),
             dest=gvisor_sources_list,
             create_remote_dir=True,
             mode="0600",
             user="root",
             group="root",
             _sudo=True,
         )

         pm.update(
             name="Update package repositories after adding gVisor repository",
             _sudo=True,
         )
+
+        pm.install(
+            name="Install gVisor",
+            packages=["runsc"],
+            _sudo=True,
+        )
     else:
         raise ValueError("Unsupported distribution for runsc installation")
-
-    pm.install(
-        name="Install gVisor",
-        packages=["runsc"],
-        _sudo=True,
-    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/containers.py` around lines 75 - 82, The pm.install call that
installs "runsc" (the gVisor package) is executed unconditionally even though
the gVisor apt repository is only configured inside the pm.is_debian_based
branch; move the pm.install block (the call to pm.install with name="Install
gVisor" and packages=["runsc"]) so it runs only when pm.is_debian_based is true
(or add an else branch that handles non-Debian systems appropriately), ensuring
you don't attempt to install runsc on platforms where the repository wasn't
added.

Comment thread nullforge/runes/users.py
Comment on lines +93 to +96
    ssh_keys = host.get_fact(FileContents, f"{current_user_home}/.ssh/authorized_keys") or []

    # Filter out empty lines and comments
    valid_keys = [key for key in ssh_keys if key.strip() and not key.strip().startswith("#")]

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify FileContents fact return type in pyinfra documentation/source
ast-grep --pattern 'class FileContents($_) {
  $$$
}'

# Check how FileContents is used elsewhere in the codebase
rg -n 'get_fact\(FileContents' --type=py -C2

Repository: wlix13/NullForge

Length of output: 1161


FileContents returns a string, not a list—iteration will yield characters.

host.get_fact(FileContents, ...) returns str | None, not a list. Line 93 assigns this directly with a fallback to [], so when FileContents succeeds, ssh_keys is a string. Iterating over it on line 96 will yield individual characters, not lines. Split the content into lines first.

🐛 Proposed fix
-    ssh_keys = host.get_fact(FileContents, f"{current_user_home}/.ssh/authorized_keys") or []
+    ssh_keys_content = host.get_fact(FileContents, f"{current_user_home}/.ssh/authorized_keys") or ""
+    ssh_keys = ssh_keys_content.splitlines()
 
     # Filter out empty lines and comments
     valid_keys = [key for key in ssh_keys if key.strip() and not key.strip().startswith("#")]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/users.py` around lines 93 - 96, The code treats ssh_keys as
an iterable list but host.get_fact(FileContents, ...) returns a string or None;
update the assignment so ssh_content = host.get_fact(FileContents,
f"{current_user_home}/.ssh/authorized_keys") or "" and then derive ssh_keys =
ssh_content.splitlines() (or ssh_content.split("\n")) before filtering; keep the
existing filtering into valid_keys but operate on ssh_keys (string lines) rather
than iterating characters.
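The string-versus-list confusion is easy to demonstrate; the `authorized_keys` content below is made up for illustration:

```python
content = "ssh-ed25519 AAAA... user@host\n# backup key\n\nssh-rsa BBBB... other@host\n"

# Buggy: iterating a string yields single characters, not lines.
chars = [c for c in content if c.strip() and not c.strip().startswith("#")]

# Fixed: split into lines first, then filter blanks and comments.
lines = content.splitlines()
valid_keys = [line for line in lines if line.strip() and not line.strip().startswith("#")]
```

With the fix, `valid_keys` holds the two actual key lines; the buggy version produces dozens of one-character "keys".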

Comment thread nullforge/runes/warp.py
Comment on lines +106 to +111
    files.link(
        name="Link WireGuard profile to /etc/wireguard/warp.conf",
        path=wgcf_profile_path,
        target="/etc/wireguard/warp.conf",
        _sudo=True,
    )

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify pyinfra files.link parameter semantics
rg -n 'files\.link' --type=py -A5 -B2

Repository: wlix13/NullForge

Length of output: 1960


🌐 Web query:

pyinfra files.link documentation parameter path target

💡 Result:

The pyinfra files.link operation is used to add, remove, or update symbolic or hard links on remote hosts. It is documented in the official pyinfra documentation under Files Operations.

Signature (from v3.x docs):

    files.link(
        path: str,
        target: str | None = None,
        present: bool = True,
        user: str | None = None,
        group: str | None = None,
        symbolic: bool = True,
        create_remote_dir: bool = True,
        force: bool = False,
        force_backup: bool = True,
        force_backup_dir: str | None = None,
        **kwargs,
    )

Key parameters:

  • path: Path of the link to create (required).
  • target: Target file or directory the link points to (required for creation).
  • present: If True, ensure the link exists; if False, remove it.
  • user / group: Owner user and group of the link.
  • symbolic: True for a symbolic link (default), False for a hard link.
  • create_remote_dir: Create the parent directory if it doesn't exist.
  • force: Overwrite if the path exists and is not a suitable file.
  • force_backup: Back up existing files when forcing.
  • force_backup_dir: Directory for backups when forcing.

Example:

    files.link(
        name="Create link /etc/issue2 that points to /etc/issue",
        path="/etc/issue2",
        target="/etc/issue",
    )

If the link exists but points to a different target, pyinfra removes and recreates it. Inherits global arguments like _sudo. The v2.x signature is similar but lacks type hints and some kwargs like force_backup_dir.



🏁 Script executed:

grep -n "wgcf_profile_path" nullforge/runes/warp.py | head -20

Repository: wlix13/NullForge

Length of output: 463


Symlink arguments are swapped — this will break WireGuard configuration.

The files.link parameters are inverted. Currently, it creates a symlink at wgcf_profile_path (the generated config) pointing to /etc/wireguard/warp.conf. Instead, it should create a symlink at /etc/wireguard/warp.conf pointing to wgcf_profile_path, since WireGuard expects the config at the standard location.

Correct syntax:

path="/etc/wireguard/warp.conf",
target=wgcf_profile_path,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/warp.py` around lines 106 - 111, The symlink creation
currently uses inverted arguments in the files.link call: it creates a link at
wgcf_profile_path pointing to "/etc/wireguard/warp.conf"; change the files.link
invocation so the symlink path is "/etc/wireguard/warp.conf" and the target is
wgcf_profile_path (i.e., swap the path and target values in the files.link call
that references wgcf_profile_path) so WireGuard reads the config from the
standard location.
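Plain os.symlink shows the same link-versus-target distinction (note its argument order is the reverse of files.link's path=/target=); paths here are illustrative:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    profile = os.path.join(tmp, "wgcf-profile.conf")  # generated config (the target)
    link = os.path.join(tmp, "warp.conf")             # where WireGuard would look (the link)

    with open(profile, "w") as f:
        f.write("[Interface]\n")

    # os.symlink(src, dst): dst is the new link, src is what it points to.
    # pyinfra's files.link is the other way round: path= is the link, target= is what it points to.
    os.symlink(profile, link)

    assert os.path.islink(link)
    assert os.readlink(link) == profile
    with open(link) as f:
        data = f.read()
```

Reading through the link returns the profile's contents, which is exactly what WireGuard needs when it opens /etc/wireguard/warp.conf.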

@wlix13 wlix13 force-pushed the feature/initial-version branch from ff2388b to ff4af3c Compare April 4, 2026 20:50
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 20

♻️ Duplicate comments (2)
nullforge/runes/users.py (2)

93-96: ⚠️ Potential issue | 🔴 Critical

FileContents returns a string, not a list—iteration yields characters.

This issue was flagged in a previous review and remains unaddressed. host.get_fact(FileContents, ...) returns str | None, so iterating over ssh_keys on line 96 yields individual characters, not lines.

🐛 Proposed fix
-    ssh_keys = host.get_fact(FileContents, f"{current_user_home}/.ssh/authorized_keys") or []
+    ssh_keys_content = host.get_fact(FileContents, f"{current_user_home}/.ssh/authorized_keys") or ""
+    ssh_keys = ssh_keys_content.splitlines()
 
     # Filter out empty lines and comments
     valid_keys = [key for key in ssh_keys if key.strip() and not key.strip().startswith("#")]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/users.py` around lines 93 - 96, host.get_fact(FileContents,
f"{current_user_home}/.ssh/authorized_keys") returns a string or None, so
iterating ssh_keys produces characters; change handling in the ssh key read path
(the ssh_keys and valid_keys logic) to treat the result as a string: if ssh_keys
is None use empty string, then split into lines (e.g., .splitlines()) and filter
those lines for non-empty and non-comment entries before building valid_keys;
adjust references to ssh_keys and valid_keys accordingly so downstream code
receives a list of key lines.

54-65: ⚠️ Potential issue | 🟠 Major

Fix pipeline failure: opts.password may be None.

The pipeline reports ty check failed (unresolved-attribute) because opts.password has type str | None. Although line 37 guards the call, the type checker can't prove password is non-None inside _set_user_password.

🐛 Proposed fix: Add assertion
 def _set_user_password(opts: UserMold) -> None:
     """Set user password."""
 
+    assert opts.password is not None, "Password must be set before calling _set_user_password"
     escaped_username = opts.name.replace("'", "'\\''")
-    escaped_password = opts.password.replace("'", "'\\''")  # type: ignore[union-attr]
+    escaped_password = opts.password.replace("'", "'\\''")
     server.shell(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/users.py` around lines 54 - 65, Inside _set_user_password,
opts.password is typed as str | None so the type checker complains when you
access .replace; add a runtime assertion to guarantee non-None before using it
(e.g., assert opts.password is not None) at the top of _set_user_password so the
subsequent escaped_password = opts.password.replace(...) is type-safe, then
proceed to call server.shell as before; reference symbols: function
_set_user_password, class/type UserMold, attribute opts.password, variable
escaped_password, and server.shell.
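A minimal sketch of the assert-based narrowing, using a hypothetical `UserOpts` dataclass in place of `UserMold`:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserOpts:  # hypothetical stand-in for UserMold
    name: str
    password: Optional[str] = None


def set_user_password(opts: UserOpts) -> str:
    # The caller guards on opts.password, but the type checker cannot see
    # that across function boundaries; assert narrows Optional[str] to str.
    assert opts.password is not None, "password must be set before calling"
    escaped = opts.password.replace("'", "'\\''")
    return f"usermod -p '{escaped}' {opts.name}"


cmd = set_user_password(UserOpts(name="deploy", password="s3cret"))
```

After the assert, every access to `opts.password` is type-safe, and a missing password fails loudly instead of raising AttributeError deep inside string handling.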
🧹 Nitpick comments (7)
.github/workflows/create-tag.yaml (1)

19-21: Pin action references to commit SHAs instead of version tags.

Lines 19/21/26/38/44/59/62/78/81 use floating version tags (e.g., @v6, @v7). While these tags currently exist, pinning to commit SHAs is a supply-chain security best practice that ensures consistent, immutable action behavior across runs.

Also applies to: 26-27, 38-39, 44-44, 59-59, 62-63, 78-78, 81-82

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/create-tag.yaml around lines 19 - 21, The workflow uses
floating action tags like actions/checkout@v6 and astral-sh/setup-uv@v7 (and
other occurrences at the cited lines); replace each floating tag reference with
the corresponding commit SHA for that action (for example replacing
actions/checkout@v6 with actions/checkout@<commit-sha>) so the workflow pins to
immutable commits; update every uses: entry (all occurrences of
actions/checkout, astral-sh/setup-uv, and any other actions referenced with `@v*`
in the file) to their verified commit SHAs and commit the changes.
nullforge/smithy/network.py (2)

31-34: Simplify redundant pipeline.

grep -c already outputs just the count; the awk '{print $1}' is unnecessary.

♻️ Simplified command
     ipv6_check = host.get_fact(
         Command,
-        "ip -6 addr show scope global 2>/dev/null | grep -c 'inet6' | awk '{print $1}' || echo '0'",
+        "ip -6 addr show scope global 2>/dev/null | grep -c 'inet6' || echo '0'",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/network.py` around lines 31 - 34, The command passed to
host.get_fact (ipv6_check) contains a redundant awk step because grep -c already
returns the count; update the Command argument string used in the ipv6_check
assignment to remove the unnecessary " | awk '{print $1}'" so it reads something
like using grep -c ... || echo '0', leaving the rest of the host.get_fact call
and symbols (ipv6_check, host.get_fact, Command) unchanged.

23-28: Consider narrowing exception suppression for better debuggability.

Using suppress(Exception) is very broad and can hide legitimate errors (e.g., permission issues, pyinfra connectivity problems). Consider catching specific exceptions or at minimum logging suppressed errors to aid debugging when IPv6 detection unexpectedly returns False.

♻️ Example: narrow exception handling
-    with suppress(Exception):
-        grub_contents = host.get_fact(FileContents, path=grub_default)
-        # GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 quiet splash"
-        if grub_contents and "ipv6.disable=1" in grub_contents:
-            setattr(host.data, cache_key, False)
-            return False
+    try:
+        grub_contents = host.get_fact(FileContents, path=grub_default)
+        if grub_contents and "ipv6.disable=1" in grub_contents:
+            setattr(host.data, cache_key, False)
+            return False
+    except (OSError, IOError):
+        pass  # GRUB file may not exist or be readable

Also applies to: 36-42

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/network.py` around lines 23 - 28, The try/suppress block
around host.get_fact(FileContents, path=grub_default) is too broad and can hide
real errors; replace suppress(Exception) with targeted exception handling (e.g.,
FileNotFoundError, PermissionError, and the specific pyinfra/transport exception
your codebase uses) or at minimum catch Exception as e and log the exception
before treating the result as a missing grub entry; make the change for the
block that inspects grub_contents (the host.get_fact(FileContents,
path=grub_default) call and the subsequent setattr(host.data, cache_key, False)
/ return False), and mirror the same narrower handling/logging for the similar
block around lines 36-42 so failures are recorded rather than silently
suppressed.
nullforge/smithy/packages.py (1)

153-158: Silent removal of apt-specific kwargs could cause unexpected behavior.

When running on RHEL-based systems, apt-specific options like no_recommends, force, extra_uninstall_args are silently dropped. Callers (e.g., dns.py with extra_uninstall_args="--purge") may expect purge semantics that won't be applied. Consider logging a warning when these kwargs are stripped.

♻️ Add warning for stripped kwargs
+import logging
+
+logger = logging.getLogger(__name__)
+
         # Remove apt-specific kwargs if not apt
         if not self.is_debian_based:
+            stripped = [k for k in ("no_recommends", "force", "cache_time", "extra_uninstall_args") if k in call_kwargs]
+            if stripped:
+                logger.debug("Ignoring apt-specific kwargs on %s: %s", self.distro_name, stripped)
             call_kwargs.pop("no_recommends", None)
             call_kwargs.pop("force", None)
             call_kwargs.pop("cache_time", None)
             call_kwargs.pop("extra_uninstall_args", None)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/packages.py` around lines 153 - 158, The code silently drops
apt-specific kwargs when not self.is_debian_based; update the method that
manipulates call_kwargs to log a warning listing any stripped keys (e.g.,
"no_recommends", "force", "cache_time", "extra_uninstall_args") before popping
them so callers are aware that their options (like
extra_uninstall_args="--purge") will be ignored; reference the same call_kwargs
handling and self.is_debian_based check and emit a single warning mentioning
which keys were removed and that they are apt-specific.
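A sketch of the warn-then-strip behavior; `strip_apt_kwargs` is a hypothetical helper name, not the module's actual API:

```python
import logging

logger = logging.getLogger("nullforge.packages")

APT_ONLY_KWARGS = ("no_recommends", "force", "cache_time", "extra_uninstall_args")


def strip_apt_kwargs(call_kwargs: dict, is_debian_based: bool, distro_name: str) -> dict:
    """Drop apt-only options on non-apt distros, but say so instead of staying silent."""
    if is_debian_based:
        return call_kwargs
    stripped = [k for k in APT_ONLY_KWARGS if k in call_kwargs]
    if stripped:
        logger.warning("Ignoring apt-specific kwargs on %s: %s", distro_name, stripped)
    for key in stripped:
        call_kwargs.pop(key)
    return call_kwargs


kwargs = {"packages": ["dnsmasq"], "extra_uninstall_args": "--purge"}
result = strip_apt_kwargs(kwargs, is_debian_based=False, distro_name="fedora")
```

A caller relying on `--purge` semantics now at least gets a log line explaining why removal behaved differently on RHEL-family hosts.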
nullforge/runes/mtg.py (2)

67-69: Binary presence check prevents version upgrades.

Once /usr/local/bin/mtg exists, the function returns early without checking versions. If DEFAULT_VERSIONS["mtg"] is bumped, existing deployments won't be updated. Consider:

  • Adding version checking (e.g., mtg --version comparison)
  • Or documenting that manual removal is required for upgrades
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtg.py` around lines 67 - 69, The early return triggered by
detecting mtg_exec_path ("/usr/local/bin/mtg") prevents upgrades; change the
logic in the mtg installation routine so that after discovering the binary via
host.get_fact(File, mtg_exec_path) you run a version check (invoke the installed
binary with "mtg --version" or use host.run/Command execution) and compare the
parsed version to DEFAULT_VERSIONS["mtg"]; only return when the installed
version matches the desired version, otherwise proceed with the upgrade/install
path (or remove the old binary before installing). Update the branch around
mtg_exec_path and host.get_fact(File, mtg_exec_path) to perform this version
comparison and conditional reinstall/cleanup.
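A version-gated install check could look like the sketch below; the "mtg version X.Y.Z" output format is an assumption, not verified against the real binary:

```python
from typing import Optional


def parse_version(output: str) -> tuple:
    # e.g. "mtg version 2.1.7" -> (2, 1, 7); the output format is assumed.
    token = output.strip().split()[-1].lstrip("v")
    return tuple(int(part) for part in token.split("."))


def needs_install(binary_exists: bool, version_output: Optional[str], wanted: str) -> bool:
    """Install when the binary is absent, unqueryable, or not at the pinned version."""
    if not binary_exists:
        return True
    if version_output is None:
        return True  # binary present but `mtg --version` failed: reinstall to be safe
    return parse_version(version_output) != parse_version(wanted)
```

This keeps the early-return optimization for up-to-date hosts while letting a bumped DEFAULT_VERSIONS["mtg"] trigger a reinstall.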

64-69: Unused parameter opts in _install_mtg.

The opts: MtgMold parameter is passed but never used in the function body.

♻️ Remove unused parameter
-def _install_mtg(opts: MtgMold) -> None:
+def _install_mtg() -> None:
     """Download and install mtg binary from GitHub release tarball."""

And update the call site on line 27:

-    _install_mtg(mtg_opts)
+    _install_mtg()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtg.py` around lines 64 - 69, The _install_mtg function
currently declares an unused parameter opts: MtgMold; remove the unused
parameter from the function signature (change def _install_mtg(opts: MtgMold) ->
None: to def _install_mtg() -> None:) and update all call sites that pass an
MtgMold argument to instead call _install_mtg() with no argument; ensure any
imports or type references to MtgMold within this module are adjusted or removed
if they become unused as a result.
nullforge/runes/netsec.py (1)

35-46: Hash the full desired UFW state, not just the rule list.

The early-return fingerprint ignores the fixed defaults and /etc/default/ufw's IPV6= setting. If those drift, _UFW_CHECKSUM_FILE can still match and the firewall is skipped.

Also applies to: 98-105

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/netsec.py` around lines 35 - 46, The checksum currently
computed by _rules_checksum only includes the list of UfwRule objects; update
the checksum calculation to incorporate the full desired UFW state (including
the fixed defaults structure and the parsed /etc/default/ufw IPV6 value) so any
change to defaults or IPV6 causes a new fingerprint; specifically, when building
the payload for hashlib.sha256 in _rules_checksum, append (or include as part of
the JSON serialization) the canonicalised defaults object (the same data used to
write _UFW_CHECKSUM_FILE) and the IPV6 setting read from /etc/default/ufw
(ensure you reference the same parsing/variable used elsewhere so checks match),
then compute the hash over that combined payload so the checksum reflects both
rules and UFW defaults.
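Hashing the combined state is straightforward; the field names below are illustrative, not the module's actual schema:

```python
import hashlib
import json


def ufw_state_checksum(rules: list, defaults: dict, ipv6_enabled: bool) -> str:
    """Fingerprint the rule list, policy defaults, and the IPV6 toggle together."""
    payload = json.dumps(
        {"rules": rules, "defaults": defaults, "ipv6": ipv6_enabled},
        sort_keys=True,  # canonical ordering so equal state hashes equally
    )
    return hashlib.sha256(payload.encode()).hexdigest()


rules = [{"port": 22, "proto": "tcp", "action": "allow"}]
defaults = {"incoming": "deny", "outgoing": "allow"}

a = ufw_state_checksum(rules, defaults, ipv6_enabled=True)
b = ufw_state_checksum(rules, defaults, ipv6_enabled=False)
```

Flipping only the IPV6 setting now changes the fingerprint, so the early-return path cannot skip a drifted /etc/default/ufw.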
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/create-tag.yaml:
- Around line 90-92: The gh release create invocation uses an unquoted shell
variable TAG_NAME which can cause word-splitting; update the command that calls
gh release create to quote the variable (use "$TAG_NAME") so it matches the
already-quoted title usage and is defensive against spaces/special characters in
TAG_NAME.

In `@nullforge/models/dns.py`:
- Around line 75-86: Replace the pyright-specific "# type:
ignore[reportAssignmentType]" comments in the ups list construction (where
DnsServerDoH instances are appended to ups in the function building DNS servers,
including the ipv6 branch) with a ty-compatible suppression such as "# ty:
ignore[invalid-argument-type]" or use the generic "# type: ignore" so the ty
checker stops reporting the assignment type error for DnsServerDoH entries.

In `@nullforge/molds/system.py`:
- Around line 12-51: _default_packages_base currently returns Debian-specific
package names and is passed directly into pm.install (via SystemMold / the
caller in runes/base), causing failures on non-Debian hosts; change it to either
accept a distro/package-manager argument or return two lists
(generic_common_packages and distro_specific_packages), move Debian-only names
(e.g., "apt-transport-https", "build-essential", "mtr-tiny") out of the generic
list, and update the caller (where SystemMold calls pm.install) to resolve the
host's package manager and install only the appropriate package list or map
distro-specific names before calling pm.install. Ensure the symbol
_default_packages_base and the installation call site in SystemMold / the
runes/base caller are updated accordingly.
- Around line 124-131: The hostname validator currently checks only whole-string
start/end chars and allowed characters, allowing invalid labels like
"foo-.example.com" or empty labels "a..b.example.com"; update the validator (the
block using variable v that raises ValueError for length, dot presence, and
character checks) to split v by "." and validate each label: ensure no empty
labels (reject consecutive dots), each label length is 1–63, labels do not start
or end with "-" and contain only alphanumeric or "-" (no dots inside labels),
and keep the total length <=253; replace the broad all(c.isalnum() or c in ".-"
...) check with these per-label checks and raise the same ValueError messages as
appropriate.
- Around line 96-103: The Annotated[type, conlist(...)] pattern used for the
packages_base and locales fields does not enforce the min_length validation;
update the field type annotations so they use conlist(str, min_length=1)
directly (i.e., change the annotated types for packages_base and locales to
conlist(str, min_length=1)) while keeping the existing Field(...) calls and
default factories (_default_packages_base and _default_locales) and descriptions
unchanged so Pydantic will validate non-empty lists correctly.
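The per-label hostname checks described for the validator above can be sketched as a standalone function (error messages are illustrative):

```python
def validate_hostname(v: str) -> str:
    """Validate an FQDN label by label, per RFC 1035 length and character rules."""
    if len(v) > 253:
        raise ValueError("hostname too long")
    for label in v.split("."):
        if not label:
            raise ValueError("empty label (consecutive, leading, or trailing dots)")
        if len(label) > 63:
            raise ValueError("label longer than 63 characters")
        if label.startswith("-") or label.endswith("-"):
            raise ValueError("label may not start or end with '-'")
        if not all(c.isascii() and (c.isalnum() or c == "-") for c in label):
            raise ValueError("label contains invalid characters")
    return v
```

Unlike the whole-string check, this rejects "foo-.example.com" and "a..b.example.com" while still accepting ordinary FQDNs.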

In `@nullforge/runes/base.py`:
- Around line 194-206: The _configure_ntp() implementation hard-codes Debian's
systemd-timesyncd; change it to detect the distro/family and install/enable the
appropriate NTP service instead (e.g., for Debian/Ubuntu install
"systemd-timesyncd" and enable "systemd-timesyncd", for RHEL/CentOS/Fedora
install "chrony" and enable "chronyd" or the distro-appropriate unit), by
branching on the distro info used elsewhere in the module and replacing the
literal pm.install(... packages=["systemd-timesyncd"]) and systemd.service(...
service="systemd-timesyncd") calls with variables chosen per-distro so the
function _configure_ntp() works across distributions.
- Around line 143-159: The current code downloads an unauthenticated shell
script from STATIC_URLS['doggo_install'] to doggo_install_path and executes it
as root via server.shell with _sudo=True, which is unsafe; change the flow to
fetch a pinned release artifact (use a specific versioned URL or release
tarball), validate its integrity and authenticity (verify SHA256 checksum and/or
a GPG signature) before making it executable, and only escalate privileges for
the minimal install steps after verification. Concretely, update the logic
around doggo_install_path, the server.shell call that runs "sh
{doggo_install_path}", and the download step (curl command) to instead download
a pinned artifact and add separate verification steps (checksum/signature
verification) using available utilities on the host, and avoid running arbitrary
installer code as root until verification succeeds. Ensure host.get_fact(File,
doggo_install_path), files.file(... mode="0755"), and the final server.shell
install invocation are adjusted so verification occurs between download and
execution.

In `@nullforge/runes/dns.py`:
- Around line 137-151: The systemd.service call currently gates running/enabled
behind _if=any_changed(config_template, service_template) which prevents
enforcing running=True and enabled=True on subsequent runs; split into two
calls: keep the existing systemd.service that sets restarted=True and uses
_if=any_changed(...), and add a separate systemd.service invocation for
"cloudflare-dns" (or modify the current call) that always enforces running=True
and enabled=True (no _if) so the service is started/enabled every run while
leaving restarts gated by any_changed; reference the existing systemd.service
call and the service name "cloudflare-dns" and parameter keys running, enabled,
restarted, and _if when making the change.
- Around line 47-56: The DOT branch leaves the DoH proxy running; update the
DnsMode handling so _deploy_dot_resolved(opts: DnsMold) first performs DoH
teardown (stop/remove the cloudflare-dns proxy) before setting up DoT. Add a
teardown helper (e.g., _teardown_doh or stop_cloudflare_dns) and call it from
the DnsMode.DOT_RESOLVED case (and any other transition away from DOH), or
invoke that helper at the top of _deploy_dot_resolved; ensure it undoes whatever
_deploy_doh_configuration created (service/process/unit removal and config
clearance) so no cloudflare-dns remains after switching to DOT_RESOLVED.
- Around line 35-41: The property opts.upstream_dns currently filters to DOH
only, causing a mismatch with dns_opts.upstreams (which can be DOH or DOT) and
risking silent wrong-protocol servers in templates; fix by either renaming the
property to upstream_doh and updating all call sites (e.g.,
_deploy_doh_configuration(), any template rendering, and the code paths
referenced around lines 115–123) so it clearly denotes DOH-only behavior, or
make the property protocol-agnostic: store the selected protocol on the mold (or
in dns_opts), change opts.upstream_dns to accept the protocol (or read the
stored protocol) and filter upstreams dynamically based on that value, and
update _deploy_doh_configuration() and any other consumers to use the new
name/signature so upstreams always match the chosen protocol.
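The protocol-agnostic filtering suggested for dns.py can be sketched with a hypothetical `Upstream` model (names and addresses are illustrative):

```python
from dataclasses import dataclass
from enum import Enum


class DnsProtocol(Enum):
    DOH = "doh"
    DOT = "dot"


@dataclass
class Upstream:  # hypothetical stand-in for the mold's upstream entries
    address: str
    protocol: DnsProtocol


def select_upstreams(upstreams: list, protocol: DnsProtocol) -> list:
    """Filter by the mold's selected protocol instead of hard-coding DOH."""
    return [u for u in upstreams if u.protocol is protocol]


ups = [
    Upstream("https://cloudflare-dns.com/dns-query", DnsProtocol.DOH),
    Upstream("1.1.1.1@853", DnsProtocol.DOT),
]
doh = select_upstreams(ups, DnsProtocol.DOH)
dot = select_upstreams(ups, DnsProtocol.DOT)
```

Templates then render whichever list matches the deployed mode, so a DoT deployment can never silently receive DoH servers.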

In `@nullforge/runes/haproxy.py`:
- Around line 19-25: The haproxy.install flow currently ignores the provided
HAProxy config; after calling _install_haproxy() (and before/after _setup_acls
when appropriate) load haproxy_opts.config from the HaproxyMold and deploy it to
the system haproxy config path (e.g., /etc/haproxy/haproxy.cfg) via the existing
file/template delivery mechanism used elsewhere in the codebase; update the rune
to call a new helper (or extend _install_haproxy) to render/write
haproxy_opts.config and ensure file permissions and service reload/restart are
performed so the inventory-supplied config takes effect.
- Around line 137-158: The current guard uses a single substring check
(acl_haproxy / acl_pki) which can falsely skip when some ACL entries (default
ACL or the file ACL for haproxy.cfg) are missing; update the logic that inspects
acl_haproxy and acl_pki to check separately for the directory access ACL
("user:{username}:rwx"), the directory default ACL
("default:user:{username}:rwx"), and for haproxy.cfg ensure the file ACL
("user:{username}:rw") exists; if any of these specific checks fail, call the
existing server.shell block(s) (the one that sets directory ACLs, the one that
sets default ACLs, and the one that sets haproxy.cfg ACL) so each missing ACL is
applied rather than skipping the entire operation when only some entries are
absent (use the existing variables acl_haproxy, acl_pki, username, haproxy_dir,
pki_dir and the server.shell calls to locate and update the checks).

In `@nullforge/runes/mtg.py`:
- Around line 133-140: The systemd.service call currently forces a restart every
run via restarted=True; change it to restart only when the mtg configuration
template actually changed by capturing the result of the template task that
renders the mtg config (the template(...) call that produces the mtg config) and
passing its .changed flag to systemd.service (e.g.,
restarted=template_result.changed) while keeping enabled=True and running=True;
update the code so the template task returns a variable (e.g., mtg_config) and
the systemd.service invocation uses that variable's .changed property for
restarted rather than True.

In `@nullforge/runes/netsec.py`:
- Around line 171-203: The sed commands in the blocks that set
add_password_auth/add_root_login/add_usedns only modify existing lines so if a
directive is absent they do nothing; update the logic around
add_password_auth/add_root_login/add_usedns to ensure the directive is appended
when missing (e.g., detect absence and run a command to append
"PasswordAuthentication no", "PermitRootLogin no", or "UseDNS yes" to
ssh_config_path) while keeping the existing sed replace behavior for present
lines; modify the server.shell calls (the ones named "Modify SSH password
authentication", "Disable SSH root login", and "Enable DNS resolution for SSH")
to either run a two-step replace-or-append sequence or a single idempotent
awk/ed command that replaces existing entries and adds the line if none exist,
using the existing variables user_opts.manage, sed_modify, and ssh_config_path
to locate the correct file.
- Around line 225-240: The current guard "if host.get_fact(File,
path=persist_file): continue" prevents any future sysctl updates once a
persist_file exists; change the logic so we do not skip entire persist_file when
it exists — iterate each key/value in persist_map regardless and always call
server.sysctl(name=..., key=key, value=str(value), persist=True,
persist_file=persist_file, _sudo=True) for non-empty sysctls; in short, remove
or narrow the host.get_fact(File, path=persist_file) continue so keys are
reconciled on subsequent runs while still keeping the per-key loop in the
persist_map processing (refer to persist_map, host.get_fact(File,
path=persist_file), and server.sysctl).
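
The replace-or-append behaviour described for the SSH directives can be prototyped as a plain-Python helper first (ensure_sshd_directive is a hypothetical name, not an existing function in the repo — the same logic would then be expressed in the server.shell sed/awk command):

```python
def ensure_sshd_directive(config_text: str, key: str, value: str) -> str:
    """Replace an existing sshd_config directive (even a commented-out one),
    or append it when no matching line exists, so the result is idempotent."""
    desired = f"{key} {value}"
    lines = config_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.lstrip("#").strip()
        # Match "PasswordAuthentication yes", "#PasswordAuthentication no", etc.
        if stripped.split(maxsplit=1)[:1] == [key]:
            if not replaced:
                lines[i] = desired
                replaced = True
            else:
                # Comment out any duplicate occurrences.
                lines[i] = f"# {line.lstrip('# ')}"
    if not replaced:
        lines.append(desired)
    return "\n".join(lines) + "\n"
```

Running it twice over the same input yields the same output, which is the property the review is asking for.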

In `@nullforge/runes/warp.py`:
- Around line 94-120: The current logic wraps files.link and systemd.service
inside the generation conditional (host.get_fact(File, wgcf_profile_path)), so
when a profile already exists reruns won't reconcile the
/etc/wireguard/warp.conf symlink or ensure the service is enabled/started;
change the flow so only the profile generation and post-processing (the wgcf
generate and sed shells) remain conditional on host.get_fact(File,
wgcf_profile_path) while always executing files.link (referencing
wgcf_profile_path) and systemd.service (the call using
opts.engine.systemd_service_name) after that conditional, ensuring the symlink
and systemd unit are reconciled on every run.
- Around line 232-260: The current restart/enable logic only triggers when the
unit file template (service_template) changes, so manual drift or changes in
related config/policy templates won't restart or enable the systemd unit; update
the condition used for systemd.daemon_reload and systemd.service to depend on a
combined "needs_restart" boolean that ORs service_template.did_change with the
did_change flags of the related config/policy templates (or any other config
change indicators), e.g., compute needs_restart = service_template.did_change or
config_template.did_change or policy_template.did_change and use needs_restart
for the _if on systemd.daemon_reload and systemd.service (reference symbols:
service_template, systemd.daemon_reload, systemd.service,
opts.engine.systemd_service_name).
- Around line 266-310: The health-check script is only deployed when missing and
the timer only reconciles when unit templates change; remove the existence guard
and always deploy the script with files.put so updates are applied (replace the
host.get_fact(File, opts.engine.health_check_script) conditional and call
files.put unconditionally for warp-check.sh), capture that files.put result
(e.g., script_put) and include it in the any_changed(...) checks used by
systemd.daemon_reload and systemd.service (add script_put to the any_changed
call alongside service_template and timer_template) so script updates or manual
stops trigger reload/enable on reruns.

In `@nullforge/smithy/packages.py`:
- Line 40: The package mapping "ufw" -> "firewalld" causes RHEL installs
to run firewalld but the code still calls _configure_ufw_firewall(), which
executes UFW CLI commands and will fail; update the netsec rune to detect the
installed firewall and either (A) implement firewalld-specific configuration
logic (create a new function like _configure_firewalld_firewall() and call it
when the mapping results in firewalld / on RHEL) or (B) skip calling
_configure_ufw_firewall() on RHEL and raise NotImplementedError with a clear
message until firewalld support is implemented; ensure you reference the package
mapping entry for "ufw" and the function _configure_ufw_firewall() when making
the change so installation and configuration code paths are consistent.

In `@nullforge/templates/scripts/warp-v6-policy.sh`:
- Around line 50-56: The current logic uses `ip -6 route show default` to find
GW6/DEV which can pick the wrong underlay; instead run `ip -6 route get "$EP6"`
(after `EP6=$(get_ep6)`) and parse the returned fields to extract the correct
`via` (GW6) and `dev` (DEV) chosen for that specific destination, then use those
values in the `ip -6 route replace "${EP6}/128" via "$GW6" dev "$DEV" onlink`
call; ensure you handle cases where `ip -6 route get` returns no `via` (directly
reachable) by using the reported `dev`, and gracefully skip replacing the route
if `ip -6 route get "$EP6"` fails to produce usable `dev`/`via`.

---

Duplicate comments:
In `@nullforge/runes/users.py`:
- Around line 93-96: host.get_fact(FileContents,
f"{current_user_home}/.ssh/authorized_keys") returns a string or None, so
iterating ssh_keys produces characters; change handling in the ssh key read path
(the ssh_keys and valid_keys logic) to treat the result as a string: if ssh_keys
is None use empty string, then split into lines (e.g., .splitlines()) and filter
those lines for non-empty and non-comment entries before building valid_keys;
adjust references to ssh_keys and valid_keys accordingly so downstream code
receives a list of key lines.
- Around line 54-65: Inside _set_user_password, opts.password is typed as str |
None so the type checker complains when you access .replace; add a runtime
assertion to guarantee non-None before using it (e.g., assert opts.password is
not None) at the top of _set_user_password so the subsequent escaped_password =
opts.password.replace(...) is type-safe, then proceed to call server.shell as
before; reference symbols: function _set_user_password, class/type UserMold,
attribute opts.password, variable escaped_password, and server.shell.

---

Nitpick comments:
In @.github/workflows/create-tag.yaml:
- Around line 19-21: The workflow uses floating action tags like
actions/checkout@v6 and astral-sh/setup-uv@v7 (and other occurrences at the
cited lines); replace each floating tag reference with the corresponding commit
SHA for that action (for example replacing actions/checkout@v6 with
actions/checkout@<commit-sha>) so the workflow pins to immutable commits; update
every uses: entry (all occurrences of actions/checkout, astral-sh/setup-uv, and
any other actions referenced with `@v*` in the file) to their verified commit SHAs
and commit the changes.

In `@nullforge/runes/mtg.py`:
- Around line 67-69: The early return triggered by detecting mtg_exec_path
("/usr/local/bin/mtg") prevents upgrades; change the logic in the mtg
installation routine so that after discovering the binary via
host.get_fact(File, mtg_exec_path) you run a version check (invoke the installed
binary with "mtg --version" or use host.run/Command execution) and compare the
parsed version to DEFAULT_VERSIONS["mtg"]; only return when the installed
version matches the desired version, otherwise proceed with the upgrade/install
path (or remove the old binary before installing). Update the branch around
mtg_exec_path and host.get_fact(File, mtg_exec_path) to perform this version
comparison and conditional reinstall/cleanup.
- Around line 64-69: The _install_mtg function currently declares an unused
parameter opts: MtgMold; remove the unused parameter from the function signature
(change def _install_mtg(opts: MtgMold) -> None: to def _install_mtg() -> None:)
and update all call sites that pass an MtgMold argument to instead call
_install_mtg() with no argument; ensure any imports or type references to
MtgMold within this module are adjusted or removed if they become unused as a
result.

In `@nullforge/runes/netsec.py`:
- Around line 35-46: The checksum currently computed by _rules_checksum only
includes the list of UfwRule objects; update the checksum calculation to
incorporate the full desired UFW state (including the fixed defaults structure
and the parsed /etc/default/ufw IPV6 value) so any change to defaults or IPV6
causes a new fingerprint; specifically, when building the payload for
hashlib.sha256 in _rules_checksum, append (or include as part of the JSON
serialization) the canonicalised defaults object (the same data used to write
_UFW_CHECKSUM_FILE) and the IPV6 setting read from /etc/default/ufw (ensure you
reference the same parsing/variable used elsewhere so checks match), then
compute the hash over that combined payload so the checksum reflects both rules
and UFW defaults.
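
One way to sketch the combined fingerprint (function and argument names here are illustrative, not the repo's actual `_rules_checksum` signature):

```python
import hashlib
import json


def rules_checksum(rules: list[dict], defaults: dict, ipv6_enabled: bool) -> str:
    """Fingerprint the full desired UFW state: rules, default policies, and
    the IPV6 setting parsed from /etc/default/ufw, via canonical JSON."""
    payload = json.dumps(
        {"rules": rules, "defaults": defaults, "ipv6": ipv6_enabled},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode()).hexdigest()
```

Any change to the defaults object or the IPV6 flag now produces a new digest, so reruns detect drift beyond the rule list alone.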

In `@nullforge/smithy/network.py`:
- Around line 31-34: The command passed to host.get_fact (ipv6_check) contains a
redundant awk step because grep -c already returns the count; update the Command
argument string used in the ipv6_check assignment to remove the unnecessary " |
awk '{print $1}'" so it reads something like using grep -c ... || echo '0',
leaving the rest of the host.get_fact call and symbols (ipv6_check,
host.get_fact, Command) unchanged.
- Around line 23-28: The try/suppress block around host.get_fact(FileContents,
path=grub_default) is too broad and can hide real errors; replace
suppress(Exception) with targeted exception handling (e.g., FileNotFoundError,
PermissionError, and the specific pyinfra/transport exception your codebase
uses) or at minimum catch Exception as e and log the exception before treating
the result as a missing grub entry; make the change for the block that inspects
grub_contents (the host.get_fact(FileContents, path=grub_default) call and the
subsequent setattr(host.data, cache_key, False) / return False), and mirror the
same narrower handling/logging for the similar block around lines 36-42 so
failures are recorded rather than silently suppressed.

In `@nullforge/smithy/packages.py`:
- Around line 153-158: The code silently drops apt-specific kwargs when not
self.is_debian_based; update the method that manipulates call_kwargs to log a
warning listing any stripped keys (e.g., "no_recommends", "force", "cache_time",
"extra_uninstall_args") before popping them so callers are aware that their
options (like extra_uninstall_args="--purge") will be ignored; reference the
same call_kwargs handling and self.is_debian_based check and emit a single
warning mentioning which keys were removed and that they are apt-specific.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0a1b8002-614f-4f8a-945e-73d7ba2dd2f4

📥 Commits

Reviewing files that changed from the base of the PR and between ff2388b and ff4af3c.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (77)
  • .github/workflows/ci-code-quality.yaml
  • .github/workflows/create-tag.yaml
  • .gitignore
  • nullforge/foundry/README.md
  • nullforge/foundry/full_cast.py
  • nullforge/inventories/README.md
  • nullforge/inventories/example.py
  • nullforge/models/__init__.py
  • nullforge/models/containers.py
  • nullforge/models/dns.py
  • nullforge/models/system.py
  • nullforge/models/users.py
  • nullforge/models/warp.py
  • nullforge/models/zerotrust.py
  • nullforge/molds/__init__.py
  • nullforge/molds/base.py
  • nullforge/molds/cloudflare.py
  • nullforge/molds/containers.py
  • nullforge/molds/dns.py
  • nullforge/molds/features.py
  • nullforge/molds/haproxy.py
  • nullforge/molds/mtg.py
  • nullforge/molds/netsec.py
  • nullforge/molds/profiles.py
  • nullforge/molds/serializable.py
  • nullforge/molds/system.py
  • nullforge/molds/tor.py
  • nullforge/molds/user.py
  • nullforge/molds/utils.py
  • nullforge/molds/xray.py
  • nullforge/runes/__init__.py
  • nullforge/runes/base.py
  • nullforge/runes/cloudflare.py
  • nullforge/runes/containers.py
  • nullforge/runes/dns.py
  • nullforge/runes/haproxy.py
  • nullforge/runes/mtg.py
  • nullforge/runes/netsec.py
  • nullforge/runes/prepare.py
  • nullforge/runes/profiles.py
  • nullforge/runes/swap.py
  • nullforge/runes/tor.py
  • nullforge/runes/users.py
  • nullforge/runes/warp.py
  • nullforge/runes/xray.py
  • nullforge/runes/zerotrust.py
  • nullforge/smithy/__init__.py
  • nullforge/smithy/admin.py
  • nullforge/smithy/arch.py
  • nullforge/smithy/http.py
  • nullforge/smithy/network.py
  • nullforge/smithy/packages.py
  • nullforge/smithy/system.py
  • nullforge/smithy/versions.py
  • nullforge/templates/__init__.py
  • nullforge/templates/cloudflared/tunnel.yml.j2
  • nullforge/templates/dns/dns.yaml.j2
  • nullforge/templates/dns/resolv.conf.j2
  • nullforge/templates/dns/resolved.conf.j2
  • nullforge/templates/etc/default/zramswap.j2
  • nullforge/templates/mtg/config.toml.j2
  • nullforge/templates/nvim/nvim_patch.lua.j2
  • nullforge/templates/profiles/direnv.toml
  • nullforge/templates/profiles/starship.toml
  • nullforge/templates/profiles/tmux.conf
  • nullforge/templates/profiles/zshrc.j2
  • nullforge/templates/scripts/warp-check.sh
  • nullforge/templates/scripts/warp-v6-policy.sh
  • nullforge/templates/scripts/zt-tunnel-warp.sh
  • nullforge/templates/systemd/cloudflare-dns.service.j2
  • nullforge/templates/systemd/cloudflare-tunnel.service.j2
  • nullforge/templates/systemd/cloudflare-warp-check.service.j2
  • nullforge/templates/systemd/cloudflare-warp-check.timer.j2
  • nullforge/templates/systemd/cloudflare-warp.service.j2
  • nullforge/templates/systemd/mtg.service.j2
  • nullforge/templates/tor/torrc.j2
  • pyproject.toml
✅ Files skipped from review due to trivial changes (36)
  • nullforge/models/__init__.py
  • nullforge/smithy/__init__.py
  • nullforge/templates/dns/resolv.conf.j2
  • nullforge/templates/profiles/direnv.toml
  • nullforge/runes/__init__.py
  • nullforge/templates/dns/dns.yaml.j2
  • nullforge/templates/cloudflared/tunnel.yml.j2
  • nullforge/templates/nvim/nvim_patch.lua.j2
  • .github/workflows/ci-code-quality.yaml
  • nullforge/templates/tor/torrc.j2
  • nullforge/models/users.py
  • nullforge/templates/mtg/config.toml.j2
  • nullforge/molds/xray.py
  • nullforge/templates/systemd/cloudflare-warp-check.timer.j2
  • nullforge/templates/profiles/starship.toml
  • .gitignore
  • nullforge/molds/base.py
  • pyproject.toml
  • nullforge/templates/etc/default/zramswap.j2
  • nullforge/molds/haproxy.py
  • nullforge/models/zerotrust.py
  • nullforge/smithy/arch.py
  • nullforge/molds/profiles.py
  • nullforge/models/system.py
  • nullforge/runes/prepare.py
  • nullforge/templates/dns/resolved.conf.j2
  • nullforge/inventories/example.py
  • nullforge/templates/systemd/cloudflare-warp-check.service.j2
  • nullforge/templates/systemd/mtg.service.j2
  • nullforge/molds/containers.py
  • nullforge/templates/profiles/zshrc.j2
  • nullforge/templates/systemd/cloudflare-dns.service.j2
  • nullforge/templates/systemd/cloudflare-warp.service.j2
  • nullforge/molds/features.py
  • nullforge/runes/containers.py
  • nullforge/molds/tor.py
🚧 Files skipped from review as they are similar to previous changes (23)
  • nullforge/molds/serializable.py
  • nullforge/smithy/admin.py
  • nullforge/templates/systemd/cloudflare-tunnel.service.j2
  • nullforge/molds/mtg.py
  • nullforge/molds/__init__.py
  • nullforge/templates/profiles/tmux.conf
  • nullforge/templates/scripts/warp-check.sh
  • nullforge/templates/scripts/zt-tunnel-warp.sh
  • nullforge/smithy/system.py
  • nullforge/runes/swap.py
  • nullforge/runes/xray.py
  • nullforge/molds/user.py
  • nullforge/molds/dns.py
  • nullforge/runes/cloudflare.py
  • nullforge/runes/zerotrust.py
  • nullforge/templates/__init__.py
  • nullforge/molds/cloudflare.py
  • nullforge/models/containers.py
  • nullforge/molds/utils.py
  • nullforge/smithy/http.py
  • nullforge/models/warp.py
  • nullforge/runes/profiles.py
  • nullforge/runes/tor.py

Comment on lines +90 to +92
run: |
  gh release create $TAG_NAME \
    --title "🚀 Release $TAG_NAME" \

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -nP 'gh release create \$TAG_NAME\b' .github/workflows/create-tag.yaml

Repository: wlix13/NullForge

Length of output: 102


🏁 Script executed:

head -100 .github/workflows/create-tag.yaml | tail -20

Repository: wlix13/NullForge

Length of output: 594


🏁 Script executed:

grep -n "TAG_NAME" .github/workflows/create-tag.yaml | head -20

Repository: wlix13/NullForge

Length of output: 196


Quote TAG_NAME in the release command for consistency and defensive coding.

Line 91 should quote the variable to avoid word-splitting edge cases. Note that line 92 already quotes TAG_NAME in the title, making the unquoted usage on line 91 inconsistent.

Suggested patch
-          gh release create $TAG_NAME \
+          gh release create "$TAG_NAME" \
🧰 Tools
🪛 actionlint (1.7.12)

[error] 90-90: shellcheck reported issue in this script: SC2086:info:1:19: Double quote to prevent globbing and word splitting

(shellcheck)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/create-tag.yaml around lines 90 - 92, The gh release
create invocation uses an unquoted shell variable TAG_NAME which can cause
word-splitting; update the command that calls gh release create to quote the
variable (use "$TAG_NAME") so it matches the already-quoted title usage and is
defensive against spaces/special characters in TAG_NAME.

Comment thread nullforge/models/dns.py
Comment thread nullforge/molds/system.py
Comment on lines +12 to +51
def _default_packages_base() -> list[str]:
    """Get the default base packages to install."""

    return [
        "acl",
        "aha",
        "apt-transport-https",
        "bat",
        "bind9-host",
        "bison",
        "btop",
        "build-essential",
        "direnv",
        "dnsutils",
        "duf",
        "file",
        "fontconfig",
        "g++",
        "gcc",
        "git",
        "gnupg",
        "ifupdown2",
        "ipcalc",
        "iputils-ping",
        "jq",
        "libevent-dev",
        "locales",
        "mtr-tiny",
        "ncat",
        "ncurses-dev",
        "net-tools",
        "nmap",
        "pkg-config",
        "unzip",
        "wget",
        "whois",
        "xsel",
        "zoxide",
        "zsh",
    ]

⚠️ Potential issue | 🟠 Major

These defaults are distro-specific but get installed everywhere.

nullforge/runes/base.py:77-82 feeds this list straight into pm.install() for every host. Entries like apt-transport-https, build-essential, and mtr-tiny make the default SystemMold fail on non-Debian targets even though the rest of the codebase already has RHEL branches.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/system.py` around lines 12 - 51, _default_packages_base
currently returns Debian-specific package names and is passed directly into
pm.install (via SystemMold / the caller in runes/base), causing failures on
non-Debian hosts; change it to either accept a distro/package-manager argument
or return two lists (generic_common_packages and distro_specific_packages), move
Debian-only names (e.g., "apt-transport-https", "build-essential", "mtr-tiny")
out of the generic list, and update the caller (where SystemMold calls
pm.install) to resolve the host's package manager and install only the
appropriate package list or map distro-specific names before calling pm.install.
Ensure the symbol _default_packages_base and the installation call site in
SystemMold / the runes/base caller are updated accordingly.

Comment thread nullforge/molds/system.py
Comment on lines +96 to +103
packages_base: Annotated[list[str], conlist(str, min_length=1)] = Field(
    default_factory=_default_packages_base,
    description="System-wide base packages to install",
)
locales: Annotated[list[str], conlist(str, min_length=1)] = Field(
    default_factory=_default_locales,
    description="Locales to generate",
)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

python -m pip install --quiet 'pydantic>=2,<3'

python - <<'PY'
from typing import Annotated
from pydantic import BaseModel, Field, ValidationError, conlist

class Current(BaseModel):
    value: Annotated[list[str], conlist(str, min_length=1)] = Field(default_factory=list)

class Expected(BaseModel):
    value: conlist(str, min_length=1) = Field(default_factory=list)

for cls in (Current, Expected):
    try:
        cls(value=[])
        print(cls.__name__, "accepted_empty")
    except ValidationError as exc:
        print(cls.__name__, "rejected_empty", exc.errors())
PY

Repository: wlix13/NullForge

Length of output: 352


🏁 Script executed:

cd /repo && cat -n nullforge/molds/system.py | sed -n '90,110p'

Repository: wlix13/NullForge

Length of output: 115


🏁 Script executed:

git ls-files nullforge/molds/system.py

Repository: wlix13/NullForge

Length of output: 85


🏁 Script executed:

cat -n nullforge/molds/system.py | sed -n '90,110p'

Repository: wlix13/NullForge

Length of output: 948


The non-empty list constraint is silently ignored in the current annotation pattern.

Annotated[list[str], conlist(str, min_length=1)] does not enforce the min_length constraint. Testing shows that this pattern accepts empty packages_base and locales values, defeating the intended validation. The direct conlist(str, min_length=1) pattern properly rejects empty lists.

Replace with:

packages_base: conlist(str, min_length=1) = Field(
    default_factory=_default_packages_base,
    description="System-wide base packages to install",
)
locales: conlist(str, min_length=1) = Field(
    default_factory=_default_locales,
    description="Locales to generate",
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/system.py` around lines 96 - 103, The Annotated[type,
conlist(...)] pattern used for the packages_base and locales fields does not
enforce the min_length validation; update the field type annotations so they use
conlist(str, min_length=1) directly (i.e., change the annotated types for
packages_base and locales to conlist(str, min_length=1)) while keeping the
existing Field(...) calls and default factories (_default_packages_base and
_default_locales) and descriptions unchanged so Pydantic will validate non-empty
lists correctly.

Comment thread nullforge/molds/system.py
Comment on lines +124 to +131
if not v or len(v) > 253:
    raise ValueError("hostname must be between 1 and 253 characters")
if "." not in v:
    raise ValueError("hostname should be a FQDN (contain a dot)")
if not all(c.isalnum() or c in ".-" for c in v):
    raise ValueError("hostname contains invalid characters (allowed: alphanumeric, dots, hyphens)")
if v.startswith((".", "-")) or v.endswith((".", "-")):
    raise ValueError("hostname cannot start or end with dot or hyphen")

⚠️ Potential issue | 🟡 Minor

hostname validation still accepts invalid labels.

This only rejects bad leading/trailing characters on the whole string, so values like foo-.example.com or a..b.example.com still pass. nullforge/runes/base.py:37-59 then feeds that value into hostnamectl and /etc/hosts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/system.py` around lines 124 - 131, The hostname validator
currently checks only whole-string start/end chars and allowed characters,
allowing invalid labels like "foo-.example.com" or empty labels
"a..b.example.com"; update the validator (the block using variable v that raises
ValueError for length, dot presence, and character checks) to split v by "." and
validate each label: ensure no empty labels (reject consecutive dots), each
label length is 1–63, labels do not start or end with "-" and contain only
alphanumeric or "-" (no dots inside labels), and keep the total length <=253;
replace the broad all(c.isalnum() or c in ".-" ...) check with these per-label
checks and raise the same ValueError messages as appropriate.
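
A per-label validator along the lines described in the prompt might look like this sketch (standalone, outside the Pydantic model for clarity):

```python
def validate_hostname(v: str) -> str:
    """Validate an FQDN label-by-label: total length, no empty labels,
    per-label length and character rules, no leading/trailing hyphens."""
    if not v or len(v) > 253:
        raise ValueError("hostname must be between 1 and 253 characters")
    labels = v.split(".")
    if len(labels) < 2:
        raise ValueError("hostname should be a FQDN (contain a dot)")
    for label in labels:
        if not label:
            raise ValueError("hostname cannot contain empty labels (consecutive dots)")
        if len(label) > 63:
            raise ValueError("hostname labels must be at most 63 characters")
        if label.startswith("-") or label.endswith("-"):
            raise ValueError("hostname labels cannot start or end with a hyphen")
        if not all(c.isalnum() or c == "-" for c in label):
            raise ValueError("hostname contains invalid characters")
    return v
```

This rejects both `foo-.example.com` (trailing hyphen in a label) and `a..b.example.com` (empty label), which the current whole-string checks let through.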

Comment thread nullforge/runes/warp.py
Comment on lines +94 to +120
if not host.get_fact(File, wgcf_profile_path):
    server.shell(
        name="Generate WireGuard profile",
        commands=f"wgcf generate --config {wgcf_account_path} --profile {wgcf_profile_path}",
    )

    server.shell(
        name="Post-process WireGuard profile",
        commands=f"sed -i '/^DNS = /d' {wgcf_profile_path} "
        rf"&& sed -i '/^\[Interface\]/a Table = off' {wgcf_profile_path}",
    )

    files.link(
        name="Link WireGuard profile to /etc/wireguard/warp.conf",
        path=wgcf_profile_path,
        target="/etc/wireguard/warp.conf",
        _sudo=True,
    )

    systemd.service(
        name="Enable and start WireGuard WARP",
        service=opts.engine.systemd_service_name,
        running=True,
        restarted=True,
        enabled=True,
        _sudo=True,
    )

⚠️ Potential issue | 🟠 Major

Existing WireGuard profiles skip link/service reconciliation.

Because the symlink creation and systemd.service(...) call live inside if not host.get_fact(File, wgcf_profile_path), reruns never repair /etc/wireguard/warp.conf or start/enable the service when the profile already exists.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/warp.py` around lines 94 - 120, The current logic wraps
files.link and systemd.service inside the generation conditional
(host.get_fact(File, wgcf_profile_path)), so when a profile already exists
reruns won't reconcile the /etc/wireguard/warp.conf symlink or ensure the
service is enabled/started; change the flow so only the profile generation and
post-processing (the wgcf generate and sed shells) remain conditional on
host.get_fact(File, wgcf_profile_path) while always executing files.link
(referencing wgcf_profile_path) and systemd.service (the call using
opts.engine.systemd_service_name) after that conditional, ensuring the symlink
and systemd unit are reconciled on every run.

Comment thread nullforge/runes/warp.py
Comment on lines +232 to +260
service_template = files.template(
    name="Deploy Masque WARP service configuration",
    src=get_systemd_template("cloudflare-warp.service.j2"),
    dest=f"/etc/systemd/system/{opts.engine.systemd_service_name}.service",
    mode="0644",
    WORKDIR=opts.engine.config_dir,
    CONFIG_PATH=opts.engine.config_path,
    MTU=opts.mtu,
    INET_NAME=opts.iface,
    ENABLE_IPV6=ipv6_enabled,
    _sudo=True,
    _if=config_condition,
)

systemd.daemon_reload(
    name="Reload systemd daemon for Masque WARP",
    _sudo=True,
    _if=service_template.did_change,
)

systemd.service(
    name="Enable and start Masque WARP service",
    service=opts.engine.systemd_service_name,
    running=True,
    restarted=True,
    enabled=True,
    _sudo=True,
    _if=service_template.did_change,
)

⚠️ Potential issue | 🟠 Major

Masque service liveness is tied to unit-file diffs.

With an unchanged service template, reruns will not start/enable cloudflare-warp after manual drift or a previously failed deploy. This also means config/policy changes above are not applied until some separate unit diff forces a restart.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/warp.py` around lines 232 - 260, The current restart/enable
logic only triggers when the unit file template (service_template) changes, so
manual drift or changes in related config/policy templates won't restart or
enable the systemd unit; update the condition used for systemd.daemon_reload and
systemd.service to depend on a combined "needs_restart" boolean that ORs
service_template.did_change with the did_change flags of the related
config/policy templates (or any other config change indicators), e.g., compute
needs_restart = service_template.did_change or config_template.did_change or
policy_template.did_change and use needs_restart for the _if on
systemd.daemon_reload and systemd.service (reference symbols: service_template,
systemd.daemon_reload, systemd.service, opts.engine.systemd_service_name).

Comment thread nullforge/runes/warp.py
Comment on lines +266 to +310
if not host.get_fact(File, opts.engine.health_check_script):
    files.put(
        name="Deploy WARP health check script",
        src=get_script_template("warp-check.sh"),
        dest=opts.engine.health_check_script,
        user=CLOUDFLARE_USER,
        group=CLOUDFLARE_GROUP,
        mode="0755",
        _sudo=True,
    )

service_template = files.template(
    name="Deploy WARP health check service",
    src=get_systemd_template(f"{opts.engine.systemd_service_name}-check.service.j2"),
    dest=f"/etc/systemd/system/{opts.engine.systemd_service_name}-check.service",
    mode="0644",
    HEALTH_CHECK_SCRIPT=opts.engine.health_check_script,
    IFACE=opts.iface,
    SERVICE_NAME=opts.engine.systemd_service_name,
    _sudo=True,
)

timer_template = files.template(
    name="Deploy WARP health check timer",
    src=get_systemd_template(f"{opts.engine.systemd_service_name}-check.timer.j2"),
    dest=f"/etc/systemd/system/{opts.engine.systemd_service_name}-check.timer",
    mode="0644",
    SERVICE_NAME=opts.engine.systemd_service_name,
    _sudo=True,
)

systemd.daemon_reload(
    name="Reload systemd daemon for Masque WARP health check",
    _sudo=True,
    _if=any_changed(service_template, timer_template),
)

systemd.service(
    name="Enable and start Masque WARP health check timer",
    service="cloudflare-warp-check.timer",
    running=True,
    restarted=True,
    enabled=True,
    _sudo=True,
    _if=any_changed(service_template, timer_template)

⚠️ Potential issue | 🟠 Major

The health-check assets never reconcile after first install.

warp-check.sh is only copied when missing, and the timer is only enabled when the unit templates change. Fixes to the script or a manually stopped timer will never be corrected on reruns.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/warp.py` around lines 266 - 310, The health-check script is
only deployed when missing and the timer only reconciles when unit templates
change; remove the existence guard and always deploy the script with files.put
so updates are applied (replace the host.get_fact(File,
opts.engine.health_check_script) conditional and call files.put unconditionally
for warp-check.sh), capture that files.put result (e.g., script_put) and include
it in the any_changed(...) checks used by systemd.daemon_reload and
systemd.service (add script_put to the any_changed call alongside
service_template and timer_template) so script updates or manual stops trigger
reload/enable on reruns.

"pkg-config": "pkgconfig",
"python3-dev": "python3-devel",
"software-properties-common": None,
"ufw": "firewalld",

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how UFW rules are applied in netsec.py
rg -n 'ufw|firewall' nullforge/runes/netsec.py -C5

Repository: wlix13/NullForge

Length of output: 3643


🏁 Script executed:

#!/bin/bash
# Check the context around line 40 in packages.py
head -60 nullforge/smithy/packages.py | tail -30

Repository: wlix13/NullForge

Length of output: 985


🏁 Script executed:

#!/bin/bash
# Look at how get_pm() is implemented and how it uses package overrides
rg -n 'def get_pm' nullforge/smithy/packages.py -A 20

Repository: wlix13/NullForge

Length of output: 416


🏁 Script executed:

#!/bin/bash
# Look at the PackageManager class implementation
rg -n 'class PackageManager' nullforge/smithy/packages.py -A 50

Repository: wlix13/NullForge

Length of output: 1920


🏁 Script executed:

#!/bin/bash
# Look at the install method to see how package mapping is used
rg -n 'def install' nullforge/smithy/packages.py -A 30

Repository: wlix13/NullForge

Length of output: 1133


🏁 Script executed:

#!/bin/bash
# Check if there's any RHEL/distro detection in _configure_ufw_firewall that would prevent execution
rg -n '_configure_ufw_firewall|get_pm|is_rhel|is_debian' nullforge/runes/netsec.py -A 3

Repository: wlix13/NullForge

Length of output: 947


🏁 Script executed:

#!/bin/bash
# Check the full _configure_ufw_firewall function to see if there's any distro check I missed
sed -n '80,150p' nullforge/runes/netsec.py

Repository: wlix13/NullForge

Length of output: 1993


ufw → firewalld mapping causes command execution failures on RHEL systems.

UFW and firewalld are fundamentally different firewall tools with incompatible CLIs. On RHEL systems, when pm.install(packages=["ufw"]) is called, the package mapping (line 40) installs firewalld instead. However, _configure_ufw_firewall() then executes UFW-specific commands (ufw status, ufw --force reset, ufw default deny, ufw enable) which will fail because the ufw binary does not exist.

The mismatch only handles package installation; the actual firewall configuration logic runs UFW commands that are incompatible with firewalld.

Recommended fixes:

  1. Implement firewalld-specific rule handling in the netsec rune
  2. Or skip UFW configuration on RHEL systems and raise NotImplementedError until proper support is added
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/packages.py` at line 40, The package mapping "ufw" ->
"firewalld" causes RHEL installs to run firewalld but the code still calls
_configure_ufw_firewall(), which executes UFW CLI commands and will fail; update
the netsec rune to detect the installed firewall and either (A) implement
firewalld-specific configuration logic (create a new function like
_configure_firewalld_firewall() and call it when the mapping results in
firewalld / on RHEL) or (B) skip calling _configure_ufw_firewall() on RHEL and
raise NotImplementedError with a clear message until firewalld support is
implemented; ensure you reference the package mapping entry for "ufw" and the
function _configure_ufw_firewall() when making the change so installation and
configuration code paths are consistent.
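
The dispatch idea behind options (A) and (B) can be sketched in plain Python; `configure_firewall` and its injected `which` lookup are illustrative names, not NullForge's actual API, and the command lines are representative rather than a complete hardening setup:

```python
import shutil


def configure_firewall(allow_ports, which=shutil.which):
    """Dispatch to whichever firewall tool is present on the host.

    Returns the command lines that would be run, so the decision
    logic can be tested without touching a real firewall.
    """
    if which("ufw"):
        return [f"ufw allow {p}" for p in allow_ports] + ["ufw --force enable"]
    if which("firewall-cmd"):
        cmds = [f"firewall-cmd --permanent --add-port={p}/tcp" for p in allow_ports]
        return cmds + ["firewall-cmd --reload"]
    # Option (B): fail loudly until firewalld support is implemented
    raise NotImplementedError("no supported firewall tool found (ufw/firewalld)")
```

Returning the planned commands keeps the decision unit-testable; in the rune itself this would map onto pyinfra operations instead of raw command strings.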

Comment on lines +50 to +56
EP6=$(get_ep6)
if [ -n "$EP6" ]; then
GW6=$(ip -6 route show default | awk '/default/ {print $3; exit}')
DEV=$(ip -6 route show default | awk '/default/ {print $5; exit}')
if [ -n "${GW6}" ] && [ -n "${DEV}" ]; then
ip -6 route replace "${EP6}/128" via "$GW6" dev "$DEV" onlink
fi

⚠️ Potential issue | 🟠 Major

Resolve the endpoint route from the endpoint, not the first default route.

ip -6 route show default | awk ... exit just picks the first default route. On multi-homed or policy-routed hosts, that can install the endpoint_v6/128 route through the wrong underlay and break the tunnel. Use ip -6 route get "$EP6" and parse the selected via/dev.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/templates/scripts/warp-v6-policy.sh` around lines 50 - 56, The
current logic uses `ip -6 route show default` to find GW6/DEV which can pick the
wrong underlay; instead run `ip -6 route get "$EP6"` (after `EP6=$(get_ep6)`)
and parse the returned fields to extract the correct `via` (GW6) and `dev` (DEV)
chosen for that specific destination, then use those values in the `ip -6 route
replace "${EP6}/128" via "$GW6" dev "$DEV" onlink` call; ensure you handle cases
where `ip -6 route get` returns no `via` (directly reachable) by using the
reported `dev`, and gracefully skip replacing the route if `ip -6 route get
"$EP6"` fails to produce usable `dev`/`via`.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (1)
nullforge/models/dns.py (1)

208-231: Consider table-driven dispatch to reduce branching duplication.

The nested match + if/elif is correct, but repetitive. A mapping-based dispatcher would make future provider/protocol additions less error-prone.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/models/dns.py` around lines 208 - 231, Replace the nested match +
if/elif block that dispatches based on DnsProvider and DnsProtocol with a
table-driven dispatcher: build a mapping (e.g. a dict) keyed by tuples like
(DnsProvider.CLOUDFLARE, DnsProtocol.DOH) -> DnsProviders.cloudflare_doh, etc.,
and look up the factory for (provider, protocol) to call with the appropriate
args (pass ipv6 and include ecs for Quad9 factories); if lookup fails, raise the
same ValueError(s) for unsupported protocol/provider. Update the code that
currently uses DnsProvider, DnsProtocol and DnsProviders in that match block to
use this mapping so additions require only updating the table.
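
A minimal sketch of the table-driven shape, with stub factories standing in for the real `DnsProviders.*` classmethods:

```python
from enum import Enum


class DnsProvider(Enum):  # stand-ins for the module's enums
    CLOUDFLARE = "cloudflare"
    GOOGLE = "google"
    QUAD9 = "quad9"


class DnsProtocol(Enum):
    DOH = "doh"
    DOT = "dot"


# Stub factories; the real table would reference DnsProviders.* helpers.
_UPSTREAMS = {
    (DnsProvider.CLOUDFLARE, DnsProtocol.DOH): lambda ipv6, ecs: ["cf-doh"],
    (DnsProvider.CLOUDFLARE, DnsProtocol.DOT): lambda ipv6, ecs: ["cf-dot"],
    (DnsProvider.GOOGLE, DnsProtocol.DOH): lambda ipv6, ecs: ["g-doh"],
    (DnsProvider.GOOGLE, DnsProtocol.DOT): lambda ipv6, ecs: ["g-dot"],
    (DnsProvider.QUAD9, DnsProtocol.DOH): lambda ipv6, ecs: ["q9-doh", ecs],
    (DnsProvider.QUAD9, DnsProtocol.DOT): lambda ipv6, ecs: ["q9-dot", ecs],
}


def get_upstreams(provider, protocol, ipv6, ecs=False):
    try:
        factory = _UPSTREAMS[(provider, protocol)]
    except KeyError:
        raise ValueError(f"Unsupported {protocol} for {provider}") from None
    return factory(ipv6, ecs)
```

Adding a provider or protocol then means adding one table entry instead of another branch.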
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nullforge/models/dns.py`:
- Around line 200-231: The get_upstreams function accepts an ecs flag but
currently only uses it for DnsProvider.QUAD9, which silently ignores operator
intent for other providers; add a guard at the start of get_upstreams to
validate ecs: if ecs is True and provider is not DnsProvider.QUAD9, raise a
ValueError (include provider and mention ECS is only supported for Quad9) so
callers like deploy_dns_configuration that pass dns_opts.ecs get a clear error
instead of silent no-op; reference the function get_upstreams and the enums
DnsProvider, DnsProtocol and the DnsProviders.* helpers when making the change.

In `@nullforge/molds/mtproto.py`:
- Around line 17-42: The model currently allows invalid combos (empty MTG
secret, empty TeleMT tls_domain/users) and an unbounded port; add validation on
the Mtproto model: set port Field(..., ge=1, le=65535) to bound it, add a
root_validator or field validators that check provider and required
provider-specific fields (when provider == MtprotoProvider.MTG ensure secret is
non-empty; when provider == MtprotoProvider.TELEMT ensure tls_domain is
non-empty and users is non-empty), and validate each users value against a
32-hex pattern (regex ^[0-9a-fA-F]{32}$) raising ValueError with clear messages;
reference the existing symbols provider, port, secret, tls_domain, users, and
use_middle_proxy when adding these validators.

In `@nullforge/runes/mtproto.py`:
- Around line 154-158: The installers (e.g., _install_mtg) currently bail out if
MTG_EXEC exists, which prevents upgrades/downgrades; change the logic to
reconcile against the desired MtprotoMold version rather than mere file
existence: detect the installed binary's version (or a colocated version/sha
file) and compare to opts.version (or Versions pin) and only skip install when
they match; if they differ, remove/replace the binary and proceed with
download+install. Apply the same version-aware check to the other installer
function referenced around lines 232-236 so both use the desired-version
comparison instead of host.get_fact(File, MTG_EXEC) existence checks.
- Around line 222-229: The systemd.service invocation currently forces a restart
on every run via restarted=True; remove restarted=True (leaving running=True and
enabled=True) or gate the restart to config changes by adding
_if=service_template.did_change so the service only restarts when the
unit/template changed; update the systemd.service call(s) (the one with
name="Enable and start mtg service" and the similar block at lines 294-301) to
use one of these two approaches to make the operation idempotent.

In `@nullforge/runes/users.py`:
- Around line 36-40: The current post-create logic incorrectly uses a direct
Python if on user_created.did_succeed (which is a runtime callback) and
unconditionally configures passwordless sudo when user_opts.password is falsy;
change the post-create steps to attach operations using the resource/_action
`_if` parameter (pass the callable user_created.did_succeed without invoking it)
— e.g., move the password-setting and sudo-configuration into
server.shell/server.user calls that use _if=user_created.did_succeed — and only
call/_run _configure_passwordless_sudo when user_opts.sudo is truthy (i.e., use
an elif or guard so _configure_passwordless_sudo(user_opts) runs only if
user_opts.sudo is true).
- Around line 89-93: In copy_root_keys, don't use the current remote user's Home
fact or a pre-deploy Home fact for the new user; change the source home lookup
to host.get_fact(Home, user="root") (replace current_user_home assignment) and
stop using host.get_fact(Home, user=username) for the destination; instead
hard-code the destination home path as f"/home/{username}" (assign to
new_user_home) and compute ssh_keys from the root source path (ssh_keys should
read from the root home’s .ssh/authorized_keys), keeping references to username,
current_user_home, new_user_home, ssh_keys and host.get_fact(Home, ...) to
locate the edits.
- Around line 57-63: The current code uses server.shell with chpasswd and embeds
plaintext (see escaped_username/escaped_password and server.shell call), which
is non-idempotent and insecure; instead compute an encrypted password from
UserMold.password (e.g., use crypt.crypt() with a salt) and replace the
server.shell invocation with the declarative Pyinfra operation server.user(...,
name=opts.name, password=<encrypted_password>), ensuring you pass the encrypted
value to server.user so the operation is idempotent and no plaintext password is
embedded.

In `@nullforge/smithy/versions.py`:
- Around line 105-116: eza_tar() currently always uses "releases/latest" and
ignores any configured version in host.data["versions"]["eza"]; change eza_tar
to check host.data.get("versions", {}).get("eza") (or DEFAULT_VERSIONS fallback)
and if present construct the download URL using that exact version/tag (e.g. use
"{base_url}/download/{version}/..." or the appropriate GitHub release asset
path) and only fall back to "releases/latest" when no version is configured;
update the function (referencing eza_tar and arch_id) to respect and use the
configured version string for the generated tarball URL and raise the same
ValueError for unsupported arch.
- Around line 84-90: The telemt_tar helper currently falls back to raw arch_id()
output and can emit a bogus URL; update telemt_tar to only accept known arches
and reject others: call arch_id(self.host), map it via the existing uname_arch
mapping, and if the arch isn't present, raise a clear exception (e.g.,
ValueError or a custom error) instead of building the URL; reference the
telemt_tar method and arch_id helper so the error surfaces immediately where
telemt_tar is called rather than later during download.
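
Both versions.py items reduce to the same shape: map the arch through an allow-list and honor a configured pin. A sketch under assumptions (the default pin and release asset naming here are illustrative, not the repository's actual values):

```python
DEFAULT_EZA_VERSION = "v0.18.0"  # hypothetical fallback pin

UNAME_TO_RELEASE_ARCH = {
    "x86_64": "x86_64-unknown-linux-gnu",
    "aarch64": "aarch64-unknown-linux-gnu",
}


def eza_tar_url(uname_arch, configured_version=None):
    """Build a release tarball URL, honoring a pinned version.

    Raises ValueError for arches with no published asset instead of
    emitting a bogus URL that only fails later, at download time.
    """
    try:
        arch = UNAME_TO_RELEASE_ARCH[uname_arch]
    except KeyError:
        raise ValueError(f"unsupported architecture: {uname_arch}") from None
    version = configured_version or DEFAULT_EZA_VERSION
    base = "https://github.com/eza-community/eza/releases"
    return f"{base}/download/{version}/eza_{arch}.tar.gz"
```

In the smithy, `configured_version` would come from `host.data.get("versions", {}).get("eza")`, with `releases/latest` only as the unpinned fallback.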

In `@nullforge/templates/telemt/telemt.toml.j2`:
- Around line 20-22: The template emits usernames as bare TOML keys which breaks
for characters like dots, spaces, or @; update telemt.toml.j2 where the USERS
loop iterates (the {% for username, secret in USERS.items() %} block) to render
the key as a quoted TOML string (and escape any embedded quotes) instead of a
bare key — e.g., output a quoted key around {{ username }} (with proper
escaping) so any arbitrary string from USERS is a valid TOML key.
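
The quoting rule for TOML keys can be checked in isolation; `toml_key` is a hypothetical helper one could register as a Jinja filter or apply before rendering the template:

```python
import re

# TOML bare keys only allow ASCII letters, digits, '_' and '-'.
_BARE_KEY = re.compile(r"^[A-Za-z0-9_-]+$")


def toml_key(name):
    """Return name unchanged if it is a valid bare key, else a
    basic (double-quoted) string with backslash and quote escaped."""
    if _BARE_KEY.fullmatch(name):
        return name
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```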

---

Nitpick comments:
In `@nullforge/models/dns.py`:
- Around line 208-231: Replace the nested match + if/elif block that dispatches
based on DnsProvider and DnsProtocol with a table-driven dispatcher: build a
mapping (e.g. a dict) keyed by tuples like (DnsProvider.CLOUDFLARE,
DnsProtocol.DOH) -> DnsProviders.cloudflare_doh, etc., and look up the factory
for (provider, protocol) to call with the appropriate args (pass ipv6 and
include ecs for Quad9 factories); if lookup fails, raise the same ValueError(s)
for unsupported protocol/provider. Update the code that currently uses
DnsProvider, DnsProtocol and DnsProviders in that match block to use this
mapping so additions require only updating the table.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ad66dbc4-7e43-4d19-8143-2800c6ed5691

📥 Commits

Reviewing files that changed from the base of the PR and between ff4af3c and 071baed.

📒 Files selected for processing (13)
  • nullforge/foundry/full_cast.py
  • nullforge/models/dns.py
  • nullforge/models/mtproto.py
  • nullforge/molds/__init__.py
  • nullforge/molds/features.py
  • nullforge/molds/mtproto.py
  • nullforge/molds/utils.py
  • nullforge/runes/mtproto.py
  • nullforge/runes/users.py
  • nullforge/smithy/versions.py
  • nullforge/templates/__init__.py
  • nullforge/templates/systemd/telemt.service.j2
  • nullforge/templates/telemt/telemt.toml.j2
✅ Files skipped from review due to trivial changes (1)
  • nullforge/templates/systemd/telemt.service.j2
🚧 Files skipped from review as they are similar to previous changes (5)
  • nullforge/foundry/full_cast.py
  • nullforge/molds/__init__.py
  • nullforge/molds/features.py
  • nullforge/templates/__init__.py
  • nullforge/molds/utils.py

Comment thread nullforge/models/dns.py
Comment on lines +200 to +231
def get_upstreams(
provider: DnsProvider,
protocol: DnsProtocol,
ipv6: bool,
ecs: bool = False,
) -> list[DnsServer]:
"""Get upstream DNS servers for a provider and protocol."""

match provider:
case DnsProvider.CLOUDFLARE:
if protocol == DnsProtocol.DOH:
return DnsProviders.cloudflare_doh(ipv6)
elif protocol == DnsProtocol.DOT:
return DnsProviders.cloudflare_dot(ipv6)
else:
raise ValueError(f"Unsupported protocol {protocol} for Cloudflare")
case DnsProvider.GOOGLE:
if protocol == DnsProtocol.DOH:
return DnsProviders.google_doh(ipv6)
elif protocol == DnsProtocol.DOT:
return DnsProviders.google_dot(ipv6)
else:
raise ValueError(f"Unsupported protocol {protocol} for Google")
case DnsProvider.QUAD9:
if protocol == DnsProtocol.DOH:
return DnsProviders.quad9_doh(ipv6, ecs)
elif protocol == DnsProtocol.DOT:
return DnsProviders.quad9_dot(ipv6, ecs)
else:
raise ValueError(f"Unsupported protocol {protocol} for Quad9")
case _:
raise ValueError(f"Unknown provider: {provider}")

⚠️ Potential issue | 🟡 Minor

ECS flag is silently ignored for non-Quad9 providers.

ecs is accepted on Line 204 for all providers, but in this function it only affects DnsProvider.QUAD9. Since deploy_dns_configuration always passes dns_opts.ecs, this can hide invalid operator intent for Cloudflare/Google.

Proposed guard
 `@staticmethod`
 def get_upstreams(
     provider: DnsProvider,
     protocol: DnsProtocol,
     ipv6: bool,
     ecs: bool = False,
 ) -> list[DnsServer]:
     """Get upstream DNS servers for a provider and protocol."""
+    if ecs and provider != DnsProvider.QUAD9:
+        raise ValueError("ECS is supported only with Quad9 upstreams")
 
     match provider:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/models/dns.py` around lines 200 - 231, The get_upstreams function
accepts an ecs flag but currently only uses it for DnsProvider.QUAD9, which
silently ignores operator intent for other providers; add a guard at the start
of get_upstreams to validate ecs: if ecs is True and provider is not
DnsProvider.QUAD9, raise a ValueError (include provider and mention ECS is only
supported for Quad9) so callers like deploy_dns_configuration that pass
dns_opts.ecs get a clear error instead of silent no-op; reference the function
get_upstreams and the enums DnsProvider, DnsProtocol and the DnsProviders.*
helpers when making the change.

Comment thread nullforge/molds/mtproto.py Outdated
Comment on lines +17 to +42
provider: MtprotoProvider = Field(
default=MtprotoProvider.MTG,
description="MTProto proxy provider",
)
port: int = Field(
default=443,
description="The port the proxy listens on",
)
# mtg provider fields
secret: str = Field(
default="",
description="MTProto secret for mtg",
)
# telemt provider fields
tls_domain: str = Field(
default="",
description="TLS masking domain for telemt",
)
users: dict[str, str] = Field(
default_factory=dict,
description="Username→hex-secret mapping for telemt (32 hex chars each)",
)
use_middle_proxy: bool = Field(
default=False,
description="Enable middle proxy mode for telemt",
)

⚠️ Potential issue | 🟠 Major

Validate provider-specific MTProto settings in the mold.

These defaults allow provider=mtg with secret="" and provider=telemt with empty tls_domain / users. nullforge/runes/mtproto.py, Lines 30-43 consume those values directly, so the bad state only shows up after the deploy starts mutating the host. Please also bound port to 1..65535 and enforce the documented 32-hex format for TeleMT user secrets here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/mtproto.py` around lines 17 - 42, The model currently allows
invalid combos (empty MTG secret, empty TeleMT tls_domain/users) and an
unbounded port; add validation on the Mtproto model: set port Field(..., ge=1,
le=65535) to bound it, add a root_validator or field validators that check
provider and required provider-specific fields (when provider ==
MtprotoProvider.MTG ensure secret is non-empty; when provider ==
MtprotoProvider.TELEMT ensure tls_domain is non-empty and users is non-empty),
and validate each users value against a 32-hex pattern (regex ^[0-9a-fA-F]{32}$)
raising ValueError with clear messages; reference the existing symbols provider,
port, secret, tls_domain, users, and use_middle_proxy when adding these
validators.
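
The provider-specific checks can be expressed as a plain function first; in the real mold they would live in pydantic validators (`Field(ge=1, le=65535)` plus a model validator), and the string provider values here stand in for `MtprotoProvider` members:

```python
import re

_HEX32 = re.compile(r"^[0-9a-fA-F]{32}$")


def validate_mtproto(provider, port, secret="", tls_domain="", users=None):
    """Reject invalid provider/field combinations before deploy time."""
    users = users or {}
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be in 1..65535, got {port}")
    if provider == "mtg":
        if not secret:
            raise ValueError("mtg provider requires a non-empty secret")
    elif provider == "telemt":
        if not tls_domain:
            raise ValueError("telemt provider requires tls_domain")
        if not users:
            raise ValueError("telemt provider requires at least one user")
        for name, sec in users.items():
            if not _HEX32.fullmatch(sec):
                raise ValueError(f"user {name!r}: secret must be 32 hex chars")
    else:
        raise ValueError(f"unknown provider: {provider}")
```

Failing here, during model construction, keeps bad input from surfacing only after the host has already been partially mutated.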

Comment on lines +154 to +158
def _install_mtg(opts: MtprotoMold) -> None:
"""Download and install mtg binary from GitHub release tarball."""

if host.get_fact(File, MTG_EXEC):
return

⚠️ Potential issue | 🟠 Major

Reconcile installed binaries against the desired version, not just file existence.

Both installers return as soon as the target path exists, so changing Versions pins or per-host overrides never updates an already-provisioned host. That turns upgrades and downgrades into a manual step and leaves long-lived nodes drifting from declared state.

Also applies to: 232-236

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 154 - 158, The installers (e.g.,
_install_mtg) currently bail out if MTG_EXEC exists, which prevents
upgrades/downgrades; change the logic to reconcile against the desired
MtprotoMold version rather than mere file existence: detect the installed
binary's version (or a colocated version/sha file) and compare to opts.version
(or Versions pin) and only skip install when they match; if they differ,
remove/replace the binary and proceed with download+install. Apply the same
version-aware check to the other installer function referenced around lines
232-236 so both use the desired-version comparison instead of
host.get_fact(File, MTG_EXEC) existence checks.
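
One way to make the skip condition version-aware is a colocated version marker written at install time; `needs_install` and the `.version` suffix are assumptions for illustration, not existing NullForge helpers:

```python
from pathlib import Path


def needs_install(exec_path, desired_version, version_file=None):
    """Decide whether to (re)install a pinned binary.

    Compares a colocated version marker against the desired pin,
    rather than treating mere existence of the binary as 'done'.
    """
    exec_path = Path(exec_path)
    if not exec_path.exists():
        return True
    marker = Path(version_file) if version_file else exec_path.with_suffix(".version")
    if not marker.exists():
        return True  # unknown provenance: reinstall to reconcile
    return marker.read_text().strip() != desired_version
```

The installer would then write the marker after a successful download, so changing the Versions pin flips `needs_install` to True on the next run.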

Comment on lines +222 to +229
systemd.service(
name="Enable and start mtg service",
service="mtg",
running=True,
restarted=True,
enabled=True,
_sudo=True,
)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the file
fd -a "mtproto.py" -type f

Repository: wlix13/NullForge

Length of output: 230


🏁 Script executed:

# Read the full file to understand context
wc -l nullforge/runes/mtproto.py

Repository: wlix13/NullForge

Length of output: 90


🏁 Script executed:

# Read the specific lines mentioned in the review (222-229)
sed -n '220,235p' nullforge/runes/mtproto.py

Repository: wlix13/NullForge

Length of output: 401


🏁 Script executed:

# Also read lines 294-301
sed -n '292,305p' nullforge/runes/mtproto.py

Repository: wlix13/NullForge

Length of output: 274


🏁 Script executed:

# Get broader context around both sections to understand conditions
sed -n '200,240p' nullforge/runes/mtproto.py

Repository: wlix13/NullForge

Length of output: 1035


🏁 Script executed:

# Check the 294-301 context
sed -n '280,310p' nullforge/runes/mtproto.py

Repository: wlix13/NullForge

Length of output: 611


🌐 Web query:

pyinfra systemd.service restarted parameter behavior idempotent

💡 Result:

In pyinfra systemd.service, the restarted=True argument is not “restart only if something changed”—it means “restart the unit (if it’s already running)”.

Behavior (pyinfra 3.x/2.x docs & code):

  • systemd.service(..., restarted=True, ...) will emit a systemctl restart <unit> only when pyinfra believes the unit is currently running (based on the SystemdStatus fact). If the unit is not running, pyinfra will not restart it (it may start it if running=True). [1][2][3]
  • Because a restart command is generated whenever the service is running, restarted=True is effectively non-idempotent across repeated runs (it keeps restarting on every run, as long as the unit is running). This is consistent with the underlying helper: “only restart if already running”. [3]

How to make “restart only when something changed”:

  • Gate the restart on the upstream operation’s change status, e.g.:
    • op = files.put(...)
    • if op.changed: systemd.service(..., restarted=True) (pattern described for pyinfra operations generally). [4]

Sources: pyinfra systemd operation docs [1], pyinfra systemd.service implementation [2], shared handle_service_control logic that triggers restart when restarted and is_running [3], and the standard “use .changed” pattern [4].

[1] (docs.pyinfra.com)
[2] (raw.githubusercontent.com)
[3] (raw.githubusercontent.com)
[4] (stackoverflow.com)



🌐 Web query:

pyinfra restarted=True non-idempotent restart service every run

💡 Result:

In pyinfra, restarted=True on server.service(...) / systemd.service(...) means “restart this service” (an action), so it will be executed every deploy run and is therefore non-idempotent by design. The default is restarted=False. [1] [2]

To make it idempotent, restart only when something changed by wiring the service operation to the change status of the operation(s) that update config/files:

from pyinfra.operations import files, systemd

cfg = files.template(
    name="Render app config",
    src="templates/myapp.conf.j2",
    dest="/etc/myapp/myapp.conf",
)

systemd.service(
    name="Restart myapp only if config changed",
    service="myapp.service",
    running=True,
    restarted=True,
    _if=cfg.did_change,   # only restart when template actually changed
)

pyinfra exposes OperationMeta.did_change() and the _if=... global argument specifically for this kind of change-gated execution. [3]

If you don’t want any “restart each run” behavior at all, just remove restarted=True and keep running=True (and enabled=True if desired). [1] [2]

Sources
[1] https://docs.pyinfra.com/en/2.x/operations/server.html
[2] https://docs.pyinfra.com/en/2.x/operations/systemd.html
[3] https://docs.pyinfra.com/en/3.x/using-operations.html


Remove unconditional service restarts or gate restart to config changes.

restarted=True executes on every pyinfra run while the service is running, making this non-idempotent and dropping active MTProto connections on each apply even when the unit and rendered config are unchanged. Either remove restarted=True (keeping only running=True and enabled=True), or conditionally restart by adding _if=service_template.did_change to match the daemon_reload pattern above it.

Also applies to: 294-301

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 222 - 229, The systemd.service
invocation currently forces a restart on every run via restarted=True; remove
restarted=True (leaving running=True and enabled=True) or gate the restart to
config changes by adding _if=service_template.did_change so the service only
restarts when the unit/template changed; update the systemd.service call(s) (the
one with name="Enable and start mtg service" and the similar block at lines
294-301) to use one of these two approaches to make the operation idempotent.

Comment thread nullforge/runes/users.py
Comment on lines +36 to +40
if user_created.did_succeed:
if user_opts.password:
_set_user_password(user_opts)
else:
_configure_passwordless_sudo(user_opts)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure
find . -type f -name "*.py" | grep -E "(users|mold)" | head -20

Repository: wlix13/NullForge

Length of output: 571


🏁 Script executed:

# Read the users.py file to understand the full context
cat -n nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 4743


🏁 Script executed:

# Find the _configure_passwordless_sudo implementation
rg -n "_configure_passwordless_sudo" --type py -A 10

Repository: wlix13/NullForge

Length of output: 1412


🏁 Script executed:

# Check how user_opts is structured
rg -n "user_opts\|UserOptions\|class.*User" nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 42


🏁 Script executed:

# Check the molds/user.py file mentioned in the review
cat -n nullforge/molds/user.py

Repository: wlix13/NullForge

Length of output: 2156


🌐 Web query:

pyinfra did_succeed callback execution timing Python if statement

💡 Result:

In pyinfra, did_succeed only becomes meaningful after the operation has executed. If you call it in a normal Python if while your deploy file is being loaded/prepared, it will usually raise because the operation isn’t complete yet. The implementation explicitly raises "Cannot evaluate operation result before execution" unless you opt out. [1]

What to do instead

1) Use _if=some_op.did_succeed (recommended for chaining ops)
Pass the callable (don’t call it). pyinfra evaluates _if during the execution phase, right before running the later operation. This is why it works for conditions like did_change/did_succeed. [2]
Internally, pyinfra only evaluates _if when state.is_executing is true. [1]

create_user = server.user(...)

server.shell(
    name="Only run if create_user succeeded",
    commands=["echo ok"],
    _if=create_user.did_succeed,   # note: no ()
)

2) If you really need to query it without raising
Call did_succeed(_raise_if_not_complete=False). Before execution it will return False (because _success is still None), and after execution it will return True/False. [1]

if result.did_succeed(_raise_if_not_complete=False):
    ...

3) If you need “after execution, then run Python logic”
Use python.call (a runtime callback) rather than a top-level if, because pyinfra doesn’t execute operations immediately and runtime values (like stdout/status) aren’t available during prepare. [2]

Sources
[1] pyinfra OperationMeta.did_succeed + _if evaluation in command_generator (source code) (raw.githubusercontent.com)
[2] pyinfra docs: _if=...did_succeed usage + “pyinfra doesn’t immediately execute operations… callbacks at runtime” (docs.pyinfra.com)



🌐 Web query:

pyinfra _if parameter operation execution condition

💡 Result:

In pyinfra, operations accept a global argument _if to conditionally execute the operation.

What _if expects

  • A callable that returns True/False (pyinfra will call it at runtime), commonly using an earlier operation’s “meta” condition functions like:
    • some_op.did_change
    • some_op.did_not_change
    • some_op.did_succeed
    • some_op.did_error [1]

Examples

Run only if a previous operation changed something

create_user = server.user(...)

server.shell(
    name="Bootstrap user",
    commands=["..."],
    _if=create_user.did_change,
)

[1]

Require multiple conditions (AND)

server.shell(
    commands=["echo both changed"],
    _if=[op1.did_change, op2.did_change],
)

[1]

Custom logic (OR)

server.shell(
    commands=["echo op1 or op2 changed"],
    _if=lambda: op1.did_change() or op2.did_change(),
)

[1]

Helpers

from pyinfra.operations.util import any_changed, all_changed

server.shell(commands=["..."], _if=any_changed(op1, op2))
server.shell(commands=["..."], _if=all_changed(op1, op2))

[1]

Source: pyinfra docs (“Using Operations” → “Change Detection”). [1]


Use _if for the post-create steps, and only write sudoers when sudo=True.

did_succeed is a callback evaluated during the execution phase, not available during deploy file preparation. In a normal Python if statement, this will raise "Cannot evaluate operation result before execution" or silently return False. Use the _if parameter instead, passing the callable without invoking it:

user_created = server.user(...)

server.shell(
    name="Set password for user",
    commands=[...],
    _if=user_created.did_succeed,
)

Additionally, _configure_passwordless_sudo() is called unconditionally when password is falsy (line 40), regardless of the user_opts.sudo setting. This grants NOPASSWD:ALL privileges even when the user should not have sudo access. Guard the passwordless sudo configuration on user_opts.sudo:

if user_created.did_succeed:
    if user_opts.password:
        _set_user_password(user_opts)
    elif user_opts.sudo:
        _configure_passwordless_sudo(user_opts)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/users.py` around lines 36 - 40, The current post-create logic
incorrectly uses a direct Python if on user_created.did_succeed (which is a
runtime callback) and unconditionally configures passwordless sudo when
user_opts.password is falsy; change the post-create steps to attach operations
using the resource/_action `_if` parameter (pass the callable
user_created.did_succeed without invoking it) — e.g., move the password-setting
and sudo-configuration into server.shell/server.user calls that use
_if=user_created.did_succeed — and only call/_run _configure_passwordless_sudo
when user_opts.sudo is truthy (i.e., use an elif or guard so
_configure_passwordless_sudo(user_opts) runs only if user_opts.sudo is true).

Comment thread nullforge/runes/users.py
Comment on lines +57 to +63
escaped_username = opts.name.replace("'", "'\\''")
escaped_password = opts.password.replace("'", "'\\''") # ty:ignore[unresolved-attribute]
server.shell(
name=f"Set password for user {opts.name}",
commands=[
f"printf '%s:%s\\n' '{escaped_username}' '{escaped_password}' | chpasswd",
],

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the actual function context
cd nullforge && git ls-files | head -20

Repository: wlix13/NullForge

Length of output: 415


🏁 Script executed:

# Look for the file in question
find . -name "users.py" -type f

Repository: wlix13/NullForge

Length of output: 114


🏁 Script executed:

# Read the relevant file
wc -l nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 88


🏁 Script executed:

# Check the function context around lines 57-63
sed -n '45,75p' nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 992


🏁 Script executed:

# Check imports to see what server is
head -20 nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 553


🌐 Web query:

pyinfra server.shell stateless idempotent behavior

💡 Result:

In pyinfra, server.shell is explicitly a stateless operation: it always executes the provided command(s) and therefore is not idempotent (re-running the deploy will run them again, even if “nothing needs changing”).[1]

This contrasts with most pyinfra operations, which are designed to be declarative/idempotent (they diff/check state and only apply changes when needed).[2][3]

Implications / how to handle it

  • Prefer a dedicated idempotent operation (eg files.*, apt.*, systemd.*) when possible.[2][3]
  • If you must use server.shell, make the shell itself safe to re-run (eg test -f ... || do_the_thing, “create-if-missing”, etc.), and/or gate execution with pyinfra’s runtime condition mechanism like the _if global argument.[2]

Example (guarded shell)

from pyinfra.operations import server

server.shell(
    name="Create marker once",
    commands=["test -f /opt/.done || (do_work && touch /opt/.done)"],
)

Sources
[1] https://docs.pyinfra.com/en/3.x/operations/server.html
[2] https://docs.pyinfra.com/en/3.x/using-operations.html
[3] https://pyinfra.com/


🌐 Web query:

pyinfra server.user password parameter documentation

💡 Result:

In pyinfra, the password parameter on server.user is for setting the user’s encrypted (hashed) system password, not an SSH password.

  • Docs (operation signature + description): server.user(..., password=None) where password is “set the encrypted password for the user”. [1]
  • This argument was added in pyinfra v2.9 (“Add password argument to server.user operation”). [2]

If what you meant is the password used to connect over SSH, that’s inventory/connector config (ssh_password), documented under the SSH connector. [3]

Sources: [1] pyinfra “Server Operations → server.user” docs, [2] pyinfra v2.9 release notes, [3] pyinfra SSH connector docs.


🏁 Script executed:

# Check how _set_user_password is called and where opts.password comes from
grep -n "_set_user_password\|opts.password" nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 286


🏁 Script executed:

# Check pyinfra version requirement
find . -name "*.txt" -o -name "*.py" -o -name "*.toml" | xargs grep -l "pyinfra" 2>/dev/null | head -5

Repository: wlix13/NullForge

Length of output: 216


🏁 Script executed:

# Check requirements/dependencies
if [ -f requirements.txt ]; then cat requirements.txt | grep -i pyinfra; fi
if [ -f setup.py ]; then grep -A 20 "install_requires" setup.py | grep -i pyinfra; fi
if [ -f pyproject.toml ]; then cat pyproject.toml | grep -A 5 -B 5 pyinfra; fi

Repository: wlix13/NullForge

Length of output: 238


🏁 Script executed:

# See if server.user is used elsewhere with password parameter
rg "server\.user.*password" nullforge/

Repository: wlix13/NullForge

Length of output: 42


🏁 Script executed:

# Check the UserMold model to understand what password is
cat nullforge/models/users.py | head -60

Repository: wlix13/NullForge

Length of output: 224


🏁 Script executed:

# Check if there's any password hashing logic in the codebase
rg "hash|crypt|encrypt" nullforge/ --type py

Repository: wlix13/NullForge

Length of output: 278


🏁 Script executed:

# Check the full UserMold definition to understand password field
cat nullforge/models/users.py

Repository: wlix13/NullForge

Length of output: 224


🏁 Script executed:

# Look for how UserMold is instantiated/configured
rg "UserMold\|users:" nullforge/ --type py -A 3 | head -40

Repository: wlix13/NullForge

Length of output: 42


🏁 Script executed:

# Check if there's any documentation or comments about password format expected
rg -i "password.*hash|hash.*password|plaintext.*password" nullforge/ --type py

Repository: wlix13/NullForge

Length of output: 42


🏁 Script executed:

# Get the full models/users.py file
cat nullforge/models/users.py

Repository: wlix13/NullForge

Length of output: 224


🏁 Script executed:

# Look for where users configuration is defined
rg "class UserMold|class FeaturesMold" nullforge/ --type py -A 10

Repository: wlix13/NullForge

Length of output: 1452


🏁 Script executed:

# Check the entire deploy_user_management function to see context
sed -n '1,50p' nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 1383


🏁 Script executed:

# Get the full UserMold class definition
cat nullforge/molds/user.py

Repository: wlix13/NullForge

Length of output: 1736


🏁 Script executed:

# Double-check the exact pyinfra version or constraints
cat pyproject.toml | grep -A 5 -B 5 dependencies

Repository: wlix13/NullForge

Length of output: 271


🏁 Script executed:

# Check if there's any password hashing being done elsewhere in the codebase
rg "password" nullforge/runes/users.py -B 2 -A 2

Repository: wlix13/NullForge

Length of output: 1017


🏁 Script executed:

# Look for any use of crypt or similar password hashing
rg "import.*crypt|from.*crypt|hashlib" nullforge/

Repository: wlix13/NullForge

Length of output: 182


🏁 Script executed:

# Check if pyinfra is pinned to a specific version
cat pyproject.toml | grep -i pyinfra

Repository: wlix13/NullForge

Length of output: 72


🏁 Script executed:

# Final check: confirm server.user is actually available in current codebase usage
rg "server\.user" nullforge/runes/users.py -B 1 -A 5

Repository: wlix13/NullForge

Length of output: 652


Use server.user(password=...) with an encrypted password instead of server.shell with chpasswd.

The current approach has two problems: server.shell is stateless and executes on every deploy (non-idempotent), and it embeds the plaintext password directly in the command. Pyinfra's server.user operation accepts a password parameter for encrypted passwords and is declarative/idempotent. To migrate, hash the plaintext password from UserMold.password before passing it to server.user() — e.g., with crypt.crypt() (note the crypt module is deprecated since Python 3.11) or an equivalent sha512-crypt implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/users.py` around lines 57 - 63, The current code uses
server.shell with chpasswd and embeds plaintext (see
escaped_username/escaped_password and server.shell call), which is
non-idempotent and insecure; instead compute an encrypted password from
UserMold.password (e.g., use crypt.crypt() with a salt) and replace the
server.shell invocation with the declarative Pyinfra operation server.user(...,
name=opts.name, password=<encrypted_password>), ensuring you pass the encrypted
value to server.user so the operation is idempotent and no plaintext password is
embedded.
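For reference, the single-quote escaping used by the current chpasswd approach behaves as below (pure-Python sketch; `sq_quote` is a hypothetical helper name). Note that even with correct escaping, the review's point stands: the plaintext password still ends up in the command line.

```python
def sq_quote(value: str) -> str:
    """Wrap value in single quotes for POSIX shells, escaping embedded quotes.

    Inside single quotes nothing is special, so an embedded ' is rendered as
    '\\'' : close the quote, emit an escaped quote, reopen the quote.
    """
    return "'" + value.replace("'", "'\\''") + "'"

username = "alice"
password = "pa'ss"  # embedded single quote

command = f"printf '%s:%s\\n' {sq_quote(username)} {sq_quote(password)} | chpasswd"
print(command)  # printf '%s:%s\n' 'alice' 'pa'\''ss' | chpasswd
```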

Comment thread nullforge/runes/users.py
Comment on lines +89 to +93
username = opts.name
current_user_home = host.get_fact(Home)
new_user_home = host.get_fact(Home, user=username)

ssh_keys = host.get_fact(FileContents, f"{current_user_home}/.ssh/authorized_keys") or []

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure and locate the relevant files
find . -name "users.py" -o -name "user.py" | head -20

Repository: wlix13/NullForge

Length of output: 140


🏁 Script executed:

# Read the context around lines 89-93 in nullforge/runes/users.py
wc -l nullforge/runes/users.py

Repository: wlix13/NullForge

Length of output: 88


🏁 Script executed:

# Read the relevant section with more context
head -100 nullforge/runes/users.py | tail -20

Repository: wlix13/NullForge

Length of output: 667


🏁 Script executed:

# Get the full context of the _copy_ssh_keys function
cat -n nullforge/runes/users.py | sed -n '75,105p'

Repository: wlix13/NullForge

Length of output: 1220


🏁 Script executed:

# Find the copy_root_keys definition
find . -name "*.py" -type f -exec grep -l "copy_root_keys" {} \;

Repository: wlix13/NullForge

Length of output: 112


🏁 Script executed:

# Search for the server.user call to understand home parameter
rg "server\.user" nullforge/ -A 3 -B 1

Repository: wlix13/NullForge

Length of output: 1816


🏁 Script executed:

# Check how Home fact is used in context
rg "get_fact.*Home" nullforge/ -B 2 -A 2

Repository: wlix13/NullForge

Length of output: 429


🏁 Script executed:

# Read nullforge/molds/user.py to find copy_root_keys
cat -n nullforge/molds/user.py

Repository: wlix13/NullForge

Length of output: 2156


🏁 Script executed:

# Get broader context around user creation in nullforge/runes/users.py
cat -n nullforge/runes/users.py | sed -n '1,120p'

Repository: wlix13/NullForge

Length of output: 4288


🏁 Script executed:

# Search for where _copy_ssh_keys is called
rg "_copy_ssh_keys" nullforge/ -B 3 -A 3

Repository: wlix13/NullForge

Length of output: 730


🌐 Web query:

pyinfra facts timing when are facts gathered execution order

💡 Result:

When pyinfra gathers facts (timing)

  • Deploy code runs first, before any changes are executed. This “prepare / ordering” step is what lets pyinfra determine a consistent operation order across many hosts. If you call host.get_fact(...) in deploy code (eg inside if ...:), that fact is evaluated during this pre-execution phase. [1]
  • In pyinfra v3 docs, facts are “populated at the beginning of the deploy” and are treated as a read-only snapshot used to diff current vs desired state. [2]
  • A fact is implemented by running its command on the target host and processing the output into Python data. [3]
  • Facts are not automatically refreshed mid-deploy. The docs explicitly warn that if a command/operation changes something a fact would report, “the fact would not be updated since facts are only run at the start of a deploy.” [4]

Execution order (what runs when)

  • High-level lifecycle is: setup → connect → prepare (detect changes + determine operation order) → execute → disconnect/cleanup. [1]
  • Operations are ordered first, then executed. Conceptually, pyinfra aims to run operations sequentially by operation, while running each operation across applicable hosts in parallel (so all hosts finish operation N before moving to operation N+1). [1]

Practical implication

If you branch in deploy code based on a fact that could change due to earlier operations in the same deploy, you can get surprising results because that fact was read from the initial snapshot. Prefer expressing desired state with operations (and using runtime guards like _if where appropriate) rather than if host.get_fact(...) on mutable state. [1][4]

Sources: [1] [2] [3] [4]
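The snapshot behavior described above can be modeled with a few lines of plain Python (a toy illustration, not pyinfra; all names here are invented):

```python
# Toy model of "facts are a pre-deploy snapshot" (illustrative, not pyinfra).
server_state = {"users": ["root"]}

# Prepare phase: facts are gathered once, before any operation executes.
fact_snapshot = {"users": list(server_state["users"])}

def get_fact(name):
    """Read from the pre-deploy snapshot, never from live state."""
    return fact_snapshot[name]

# An operation later creates a user on the "server"...
server_state["users"].append("alice")

# ...but deploy-time branching still sees the stale snapshot:
print("alice" in get_fact("users"))      # False - stale snapshot
print("alice" in server_state["users"])  # True  - live state
```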


🌐 Web query:

pyinfra Home fact user parameter documentation

💡 Result:

In pyinfra, the Home fact is server.Home, and it accepts a user parameter to return that user’s home directory.

  • Python usage (via host.get_fact):
    host.get_fact(Home, user='') → returns the home directory for the given user; if user is empty/not provided, it returns the current user’s home directory. [1]

  • CLI usage (passing fact arguments as key=value):
    pyinfra myhost fact server.Home user=someuser
    (pyinfra facts accept arguments in the CLI as key=value.) [2]

Docs:

  • server.Home fact reference (shows the user parameter): [1]
  • Facts CLI argument format (key=value): [2]

References:
[1] https://docs.pyinfra.com/en/3.x/facts/server.html
[2] https://docs.pyinfra.com/en/3.x/facts.html


🏁 Script executed:

# Check if there's any documentation or comments about pyinfra in the codebase
rg "pyinfra" nullforge/ -i | head -20

Repository: wlix13/NullForge

Length of output: 1481


Use user="root" for the source home and hard-code the destination home path to avoid unreliable fact resolution.

copy_root_keys is intended to copy the root user's SSH keys, but line 90 queries the current remote user's home instead of root's. Additionally, pyinfra gathers facts before any operations execute; host.get_fact(Home, user=username) on line 91 will read from the pre-deploy snapshot and won't reflect the user just created by server.user() on line 26. Replace line 90 with host.get_fact(Home, user="root") and hard-code the destination path (e.g., f"/home/{username}") instead of another Home fact.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/users.py` around lines 89 - 93, In copy_root_keys, don't use
the current remote user's Home fact or a pre-deploy Home fact for the new user;
change the source home lookup to host.get_fact(Home, user="root") (replace
current_user_home assignment) and stop using host.get_fact(Home, user=username)
for the destination; instead hard-code the destination home path as
f"/home/{username}" (assign to new_user_home) and compute ssh_keys from the root
source path (ssh_keys should read from the root home’s .ssh/authorized_keys),
keeping references to username, current_user_home, new_user_home, ssh_keys and
host.get_fact(Home, ...) to locate the edits.

Comment on lines +84 to +90
def telemt_tar(self) -> str:
"""telemt tarball URL (latest release from GitHub)."""

arch = arch_id(self.host)
uname_arch = {"x86_64": "x86_64", "arm64": "aarch64"}.get(arch, arch)

return f"https://github.com/telemt/telemt/releases/latest/download/telemt-{uname_arch}-linux-gnu.tar.gz"

⚠️ Potential issue | 🟠 Major

Reject unsupported TeleMT architectures here instead of emitting a broken URL.

Unlike the other URL helpers, this one falls back to the raw arch_id() output. nullforge/smithy/arch.py, Line 25 returns unknown values unchanged, so this builds a bogus release URL and the failure only surfaces later in nullforge/runes/mtproto.py, Lines 238-243 during download.

🩹 Proposed fix
     def telemt_tar(self) -> str:
         """telemt tarball URL (latest release from GitHub)."""

         arch = arch_id(self.host)
-        uname_arch = {"x86_64": "x86_64", "arm64": "aarch64"}.get(arch, arch)
+        match arch:
+            case "x86_64":
+                uname_arch = "x86_64"
+            case "arm64":
+                uname_arch = "aarch64"
+            case _:
+                raise ValueError(f"Unsupported architecture: {arch}")

         return f"https://github.com/telemt/telemt/releases/latest/download/telemt-{uname_arch}-linux-gnu.tar.gz"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` around lines 84 - 90, The telemt_tar helper
currently falls back to raw arch_id() output and can emit a bogus URL; update
telemt_tar to only accept known arches and reject others: call
arch_id(self.host), map it via the existing uname_arch mapping, and if the arch
isn't present, raise a clear exception (e.g., ValueError or a custom error)
instead of building the URL; reference the telemt_tar method and arch_id helper
so the error surfaces immediately where telemt_tar is called rather than later
during download.
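The fail-fast mapping can be sketched as a standalone function (the name `telemt_asset_url` is illustrative; the real code is a method on the versions class):

```python
def telemt_asset_url(arch: str) -> str:
    """Map an arch id to a telemt release asset URL, rejecting anything
    outside the published asset matrix instead of emitting a broken URL."""
    uname_arch = {"x86_64": "x86_64", "arm64": "aarch64"}.get(arch)
    if uname_arch is None:
        raise ValueError(f"Unsupported architecture: {arch}")
    return (
        "https://github.com/telemt/telemt/releases/latest/download/"
        f"telemt-{uname_arch}-linux-gnu.tar.gz"
    )
```

With `.get(arch)` returning None for unknown values (instead of the `.get(arch, arch)` fallback), the error surfaces at URL construction rather than at download time.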

Comment on lines +105 to +116
def eza_tar(self) -> str:
"""eza tarball URL."""

base_url = "https://github.com/eza-community/eza/releases/latest/download"
arch = arch_id(self.host)
match arch:
case "x86_64":
return f"{base_url}/eza_x86_64-unknown-linux-gnu.tar.gz"
case "arm64":
return f"{base_url}/eza_aarch64-unknown-linux-gnu.tar.gz"
case _:
raise ValueError(f"Unsupported architecture: {arch}")

⚠️ Potential issue | 🟠 Major

Honor the configured eza version.

DEFAULT_VERSIONS advertises per-host overrides, but eza_tar() always downloads releases/latest. Any host.data["versions"]["eza"] value is ignored, and fresh installs can drift as upstream publishes new releases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` around lines 105 - 116, eza_tar() currently
always uses "releases/latest" and ignores any configured version in
host.data["versions"]["eza"]; change eza_tar to check host.data.get("versions",
{}).get("eza") (or DEFAULT_VERSIONS fallback) and if present construct the
download URL using that exact version/tag (e.g. use
"{base_url}/download/{version}/..." or the appropriate GitHub release asset
path) and only fall back to "releases/latest" when no version is configured;
update the function (referencing eza_tar and arch_id) to respect and use the
configured version string for the generated tarball URL and raise the same
ValueError for unsupported arch.

Comment on lines +20 to +22
[access.users]
{% for username, secret in USERS.items() %}{{ username }} = "{{ secret }}"
{% endfor %}

⚠️ Potential issue | 🟠 Major

Render TeleMT usernames as quoted TOML keys.

username is emitted as a bare key here, so values like alice.dev, alice@example.com, or alice smith will either change the parsed structure or make the file invalid. nullforge/molds/mtproto.py, Lines 35-38 currently allows arbitrary string keys, so the template should not assume TOML bare-key syntax.

🩹 Proposed fix
 [access.users]
-{% for username, secret in USERS.items() %}{{ username }} = "{{ secret }}"
+{% for username, secret in USERS.items() %}"{{ username }}" = "{{ secret }}"
 {% endfor %}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/templates/telemt/telemt.toml.j2` around lines 20 - 22, The template
emits usernames as bare TOML keys which breaks for characters like dots, spaces,
or @; update telemt.toml.j2 where the USERS loop iterates (the {% for username,
secret in USERS.items() %} block) to render the key as a quoted TOML string (and
escape any embedded quotes) instead of a bare key — e.g., output a quoted key
around {{ username }} (with proper escaping) so any arbitrary string from USERS
is a valid TOML key.

@wlix13 wlix13 force-pushed the feature/initial-version branch from 071baed to cb67837 Compare April 8, 2026 19:19

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
nullforge/smithy/versions.py (2)

84-90: ⚠️ Potential issue | 🟠 Major

Reject unsupported TeleMT architectures here.

arch_id() returns unknown architecture facts unchanged, so Lines 87-90 are currently the only architecture-specific helper that can silently build a bogus asset URL instead of failing fast like the rest of this class. Raise the same ValueError here before _install_telemt() tries to download it.

🩹 Suggested fix
     def telemt_tar(self) -> str:
         """telemt tarball URL (latest release from GitHub)."""
 
         arch = arch_id(self.host)
-        uname_arch = {"x86_64": "x86_64", "arm64": "aarch64"}.get(arch, arch)
+        uname_arch = {"x86_64": "x86_64", "arm64": "aarch64"}.get(arch)
+        if uname_arch is None:
+            raise ValueError(f"Unsupported architecture: {arch}")
 
         return f"https://github.com/telemt/telemt/releases/latest/download/telemt-{uname_arch}-linux-gnu.tar.gz"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` around lines 84 - 90, The telemt_tar method
currently builds a URL for any arch returned by arch_id(self.host) which can
produce invalid URLs; modify telemt_tar to validate the architecture and raise
ValueError for unsupported values (matching the class’s behavior elsewhere)
before returning the URL—use the same mapping logic in telemt_tar (uname_arch =
{"x86_64":"x86_64","arm64":"aarch64"}) and if arch not in that mapping raise
ValueError with a clear message so callers like _install_telemt() fail fast
instead of attempting to download a bogus asset.

105-116: ⚠️ Potential issue | 🟠 Major

Honor the configured eza version.

This helper always points at releases/latest/download, so host.data["versions"]["eza"] can never affect the artifact. That breaks per-host overrides and makes fresh installs drift from the declared version map.

🩹 Suggested fix
     def eza_tar(self) -> str:
         """eza tarball URL."""
 
-        base_url = "https://github.com/eza-community/eza/releases/latest/download"
+        version = self.versions["eza"]
+        base_url = (
+            "https://github.com/eza-community/eza/releases/latest/download"
+            if version == "latest"
+            else f"https://github.com/eza-community/eza/releases/download/{version}"
+        )
         arch = arch_id(self.host)
         match arch:
             case "x86_64":
                 return f"{base_url}/eza_x86_64-unknown-linux-gnu.tar.gz"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` around lines 105 - 116, The eza_tar helper
always uses "releases/latest/download" so per-host overrides are ignored; change
eza_tar to read the configured version from self.host.data["versions"]["eza"]
(fallback to "latest" only if absent) and build the GitHub releases URL using
that version (i.e.
"https://github.com/eza-community/eza/releases/download/{version}") instead of
hardcoding "latest", keeping the existing arch logic in eza_tar to pick the
correct filename for "x86_64" and "arm64" and raising on unsupported
architectures.
nullforge/runes/mtproto.py (2)

154-158: ⚠️ Potential issue | 🟠 Major

Don't treat “file exists” as “state is current.”

Lines 157-158 and 235-236 short-circuit as soon as the target path exists, so pin changes and per-host overrides never reconcile on existing hosts. For TeleMT, that also means the deploy has no stable way to converge because the download helper still targets releases/latest.

Also applies to: 232-239

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 154 - 158, The early return in
_install_mtg that stops when host.get_fact(File, MTG_EXEC) is truthy incorrectly
treats "file exists" as "correct state"; remove that short-circuit and instead
verify the installed mtg matches the desired release (use the MtprotoMold opts
version/checksum) before skipping. Concretely, in _install_mtg and the similar
block around lines referencing MTG_EXEC (also applies to the other install path
at ~232-239), replace the existence check with a validation step that reads the
installed binary/version or checksum and only skips download if it matches
opts.version (or opts.checksum); otherwise download the correct release (avoid
always hitting `releases/latest`) and atomically install/replace the binary.
Ensure you reference MTG_EXEC, _install_mtg, and the download helper when making
the change.

222-229: ⚠️ Potential issue | 🟠 Major

Make the service restart conditional.

In pyinfra, systemd.service(..., restarted=True) issues a restart whenever the unit is already running, so these two blocks bounce the proxy on every apply instead of only when the config or unit changed. Keep running=True/enabled=True, but gate the restart on the relevant template operations changing instead of forcing it every run. (docs.pyinfra.com)

Also applies to: 294-301

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 222 - 229, Remove the unconditional
restarted=True from the systemd.service(...) calls (keep running=True and
enabled=True) and instead trigger the restart only when the related config/unit
templates change by attaching a restart operation as an on_change/on_changed
handler to the template operations that render the unit/config (e.g., call
systemd.service(..., restarted=True) via the template's on_change/on_changed
argument or equivalent callback). Update both occurrences currently using
restarted=True (the systemd.service calls shown and the duplicate at ~294-301)
so the service is only restarted when the template or unit actually changes.
🧹 Nitpick comments (2)
nullforge/molds/features.py (2)

18-31: Docstring placement is unconventional.

The string literal on line 31 is a standalone expression, not associated with ALLOWED_FEATURES_LAYERS by documentation tools. Consider using a comment above the constant instead.

📝 Suggested fix
+# Allowed features layers for the FeaturesMold.
 ALLOWED_FEATURES_LAYERS = (
     ContainersMold,
     DnsMold,
     HaproxyMold,
     MtprotoMold,
     NetSecMold,
     ProfilesMold,
     TorMold,
     UserMold,
     WarpMold,
     XrayCoreMold,
     ZeroTrustTunnelMold,
 )
-"""Allowed features layers for the FeaturesMold."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/features.py` around lines 18 - 31, The trailing triple-quoted
string after ALLOWED_FEATURES_LAYERS is a standalone expression and not a
docstring; replace it with a normal inline or block comment placed above the
ALLOWED_FEATURES_LAYERS constant so documentation tools pick it up (update the
comment to mention ALLOWED_FEATURES_LAYERS and/or FeaturesMold to keep context).
Locate ALLOWED_FEATURES_LAYERS in nullforge/molds/features.py and remove the
standalone string literal, then add a concise comment immediately above the
constant definition describing its purpose.

37-47: Consider using default_factory for mutable defaults.

While Pydantic v2 safely copies mutable defaults, using default_factory is the idiomatic pattern that avoids instantiating objects at class definition time. This aligns with SystemMold.swap which uses default_factory=SwapMold (per context snippet 2).

♻️ Suggested pattern
-    warp: WarpMold = Field(default=WarpMold())
-    zerotrust: ZeroTrustTunnelMold = Field(default=ZeroTrustTunnelMold())
-    containers: ContainersMold = Field(default=ContainersMold())
+    warp: WarpMold = Field(default_factory=WarpMold)
+    zerotrust: ZeroTrustTunnelMold = Field(default_factory=ZeroTrustTunnelMold)
+    containers: ContainersMold = Field(default_factory=ContainersMold)
     # ... same pattern for remaining fields
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/features.py` around lines 37 - 47, Replace the current
Field(default=...) usage for mutable feature fields (warp, zerotrust,
containers, dns, haproxy, mtproto, netsec, profiles, tor, users, xray) in
nullforge/molds/features.py with Field(default_factory=...) to avoid
instantiating those molds at class definition time (use the corresponding class
names WarpMold, ZeroTrustTunnelMold, ContainersMold, DnsMold, HaproxyMold,
MtprotoMold, NetSecMold, ProfilesMold, TorMold, UserMold, XrayCoreMold as the
factories), mirroring the pattern used by SystemMold.swap which uses
default_factory=SwapMold.
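The same pattern exists in stdlib dataclasses, which makes the distinction easy to see in isolation (the mold classes and fields below are simplified stand-ins, not the project's actual models):

```python
from dataclasses import dataclass, field

@dataclass
class SwapMold:
    size_mb: int = 1024

@dataclass
class SystemMold:
    # default_factory builds a fresh SwapMold per instance; nothing is
    # instantiated at class-definition time.
    swap: SwapMold = field(default_factory=SwapMold)

a, b = SystemMold(), SystemMold()
print(a.swap == b.swap)   # True  - equal values
print(a.swap is b.swap)   # False - distinct objects
```

Pydantic v2 copies mutable defaults either way, so this is stylistic there, but `default_factory` also keeps class import side-effect free.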
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nullforge/runes/mtproto.py`:
- Around line 264-275: The template variable USE_MIDDLE_PROXY passed from the
files.template call (in the "Deploy telemt config" block that uses
get_telemt_template) is currently unused, so toggling
MtprotoMold.use_middle_proxy has no effect; update
nullforge/templates/telemt/telemt.toml.j2 to consume that variable (reference
USE_MIDDLE_PROXY) and conditionally render the middle-proxy-related config (or
set a boolean/key in the file) using Jinja2 conditionals so the rendered
telemt.toml reflects MtprotoMold.use_middle_proxy, ensuring the variable name
matches the one passed from the files.template call.
- Around line 33-43: Before tearing down the other provider, validate the
selected provider's configuration (the MtprotoMold fields like secret,
tls_domain, and users) and abort with a clear error if required settings are
missing/invalid; move or add validation checks immediately after reading
opts.provider and before calling _cleanup_telemt or _cleanup_mtg. Specifically,
when opts.provider == MtprotoProvider.MTG, verify MtprotoMold.secret and
MtprotoMold.tls_domain are non-empty and users is non-empty (or otherwise valid)
before calling _cleanup_telemt and then proceed with _ensure_user, _install_mtg,
_deploy_mtg_service; similarly, when opts.provider == MtprotoProvider.TELEMT
validate TELEMT-specific settings before calling _cleanup_mtg and before running
_ensure_user, _install_telemt, _deploy_telemt_service so you never tear down a
working proxy due to an invalid replacement config.

---

Duplicate comments:
In `@nullforge/runes/mtproto.py`:
- Around line 154-158: The early return in _install_mtg that stops when
host.get_fact(File, MTG_EXEC) is truthy incorrectly treats "file exists" as
"correct state"; remove that short-circuit and instead verify the installed mtg
matches the desired release (use the MtprotoMold opts version/checksum) before
skipping. Concretely, in _install_mtg and the similar block around lines
referencing MTG_EXEC (also applies to the other install path at ~232-239),
replace the existence check with a validation step that reads the installed
binary/version or checksum and only skips download if it matches opts.version
(or opts.checksum); otherwise download the correct release (avoid always hitting
`releases/latest`) and atomically install/replace the binary. Ensure you
reference MTG_EXEC, _install_mtg, and the download helper when making the
change.
- Around line 222-229: Remove the unconditional restarted=True from the
systemd.service(...) calls (keep running=True and enabled=True) and instead
trigger the restart only when the related config/unit templates change by
attaching a restart operation as an on_change/on_changed handler to the template
operations that render the unit/config (e.g., call systemd.service(...,
restarted=True) via the template's on_change/on_changed argument or equivalent
callback). Update both occurrences currently using restarted=True (the
systemd.service calls shown and the duplicate at ~294-301) so the service is
only restarted when the template or unit actually changes.

In `@nullforge/smithy/versions.py`:
- Around line 84-90: The telemt_tar method currently builds a URL for any arch
returned by arch_id(self.host) which can produce invalid URLs; modify telemt_tar
to validate the architecture and raise ValueError for unsupported values
(matching the class’s behavior elsewhere) before returning the URL—use the same
mapping logic in telemt_tar (uname_arch = {"x86_64":"x86_64","arm64":"aarch64"})
and if arch not in that mapping raise ValueError with a clear message so callers
like _install_telemt() fail fast instead of attempting to download a bogus
asset.
- Around line 105-116: The eza_tar helper always uses "releases/latest/download"
so per-host overrides are ignored; change eza_tar to read the configured version
from self.host.data["versions"]["eza"] (fallback to "latest" only if absent) and
build the GitHub releases URL using that version (i.e.
"https://github.com/eza-community/eza/releases/download/{version}") instead of
hardcoding "latest", keeping the existing arch logic in eza_tar to pick the
correct filename for "x86_64" and "arm64" and raising on unsupported
architectures.

---

Nitpick comments:
In `@nullforge/molds/features.py`:
- Around line 18-31: The trailing triple-quoted string after
ALLOWED_FEATURES_LAYERS is a standalone expression and not a docstring; replace
it with a normal inline or block comment placed above the
ALLOWED_FEATURES_LAYERS constant so documentation tools pick it up (update the
comment to mention ALLOWED_FEATURES_LAYERS and/or FeaturesMold to keep context).
Locate ALLOWED_FEATURES_LAYERS in nullforge/molds/features.py and remove the
standalone string literal, then add a concise comment immediately above the
constant definition describing its purpose.
- Around line 37-47: Replace the current Field(default=...) usage for mutable
feature fields (warp, zerotrust, containers, dns, haproxy, mtproto, netsec,
profiles, tor, users, xray) in nullforge/molds/features.py with
Field(default_factory=...) to avoid instantiating those molds at class
definition time (use the corresponding class names WarpMold,
ZeroTrustTunnelMold, ContainersMold, DnsMold, HaproxyMold, MtprotoMold,
NetSecMold, ProfilesMold, TorMold, UserMold, XrayCoreMold as the factories),
mirroring the pattern used by SystemMold.swap which uses
default_factory=SwapMold.
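The hazard this nitpick describes can be reproduced with stdlib dataclasses, which share a single default object exactly the way pydantic's `Field(default=...)` does; the mold names below are illustrative stand-ins, not the real classes:

```python
from dataclasses import dataclass, field


@dataclass
class SwapMold:
    size_mb: int = 0


@dataclass
class BadFeatures:
    # One SwapMold is built at class-definition time and then shared by
    # every BadFeatures instance -- the pattern flagged above.
    swap: SwapMold = SwapMold()


@dataclass
class GoodFeatures:
    # default_factory builds a fresh SwapMold per instance.
    swap: SwapMold = field(default_factory=SwapMold)


a, b = BadFeatures(), BadFeatures()
a.swap.size_mb = 512  # leaks into b.swap as well

c, d = GoodFeatures(), GoodFeatures()
c.swap.size_mb = 512  # d.swap is unaffected
```

The same swap from `default=` to `default_factory=` applies verbatim to pydantic models, which is why the comment points at the existing `SystemMold.swap` as the pattern to copy.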

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f45197a0-c9e0-49fc-8c16-d5ed07d10ca8

📥 Commits

Reviewing files that changed from the base of the PR and between 071baed and cb67837.

📒 Files selected for processing (11)
  • nullforge/foundry/full_cast.py
  • nullforge/models/mtproto.py
  • nullforge/molds/__init__.py
  • nullforge/molds/features.py
  • nullforge/molds/mtproto.py
  • nullforge/molds/utils.py
  • nullforge/runes/mtproto.py
  • nullforge/smithy/versions.py
  • nullforge/templates/__init__.py
  • nullforge/templates/systemd/telemt.service.j2
  • nullforge/templates/telemt/telemt.toml.j2
✅ Files skipped from review due to trivial changes (4)
  • nullforge/models/mtproto.py
  • nullforge/templates/telemt/telemt.toml.j2
  • nullforge/templates/systemd/telemt.service.j2
  • nullforge/molds/mtproto.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • nullforge/foundry/full_cast.py
  • nullforge/molds/__init__.py
  • nullforge/templates/__init__.py

Comment on lines +33 to +43
match opts.provider:
case MtprotoProvider.MTG:
_cleanup_telemt()
_ensure_user(MTG_USER, MTG_GROUP, MTG_CONFIG_DIR)
_install_mtg(opts)
_deploy_mtg_service(opts)
case MtprotoProvider.TELEMT:
_cleanup_mtg()
_ensure_user(TELEMT_USER, TELEMT_GROUP, TELEMT_CONFIG_DIR)
_install_telemt()
_deploy_telemt_service(opts)

⚠️ Potential issue | 🟠 Major

Validate provider-specific settings before cleanup.

MtprotoMold still defaults secret="", tls_domain="", and users={} in nullforge/molds/mtproto.py:10-42, and Lines 35-42 tear down the other provider before checking any of them. A bad inventory entry can remove a working proxy and then render an unusable replacement.

🩹 Suggested guard
     match opts.provider:
         case MtprotoProvider.MTG:
+            if not opts.secret:
+                raise ValueError("features.mtproto.secret is required when provider=mtg")
             _cleanup_telemt()
             _ensure_user(MTG_USER, MTG_GROUP, MTG_CONFIG_DIR)
             _install_mtg(opts)
             _deploy_mtg_service(opts)
         case MtprotoProvider.TELEMT:
+            if not opts.tls_domain or not opts.users:
+                raise ValueError(
+                    "features.mtproto.tls_domain and at least one features.mtproto.users entry are required when provider=telemt"
+                )
             _cleanup_mtg()
             _ensure_user(TELEMT_USER, TELEMT_GROUP, TELEMT_CONFIG_DIR)
             _install_telemt()
             _deploy_telemt_service(opts)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 33 - 43, Before tearing down the
other provider, validate the selected provider's configuration (the MtprotoMold
fields like secret, tls_domain, and users) and abort with a clear error if
required settings are missing/invalid; move or add validation checks immediately
after reading opts.provider and before calling _cleanup_telemt or _cleanup_mtg.
Specifically, when opts.provider == MtprotoProvider.MTG, verify
MtprotoMold.secret and MtprotoMold.tls_domain are non-empty and users is
non-empty (or otherwise valid) before calling _cleanup_telemt and then proceed
with _ensure_user, _install_mtg, _deploy_mtg_service; similarly, when
opts.provider == MtprotoProvider.TELEMT validate TELEMT-specific settings before
calling _cleanup_mtg and before running _ensure_user, _install_telemt,
_deploy_telemt_service so you never tear down a working proxy due to an invalid
replacement config.

Comment on lines +264 to +275
files.template(
name="Deploy telemt config",
src=get_telemt_template("telemt.toml.j2"),
dest=config_path,
user=TELEMT_USER,
group=TELEMT_GROUP,
mode="0640",
PORT=opts.port,
TLS_DOMAIN=opts.tls_domain,
USERS=opts.users,
USE_MIDDLE_PROXY=opts.use_middle_proxy,
_sudo=True,

⚠️ Potential issue | 🟠 Major

use_middle_proxy is a no-op right now.

Line 274 passes USE_MIDDLE_PROXY, but nullforge/templates/telemt/telemt.toml.j2:1-31 never references that variable. Toggling MtprotoMold.use_middle_proxy therefore doesn't change the rendered config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 264 - 275, The template variable
USE_MIDDLE_PROXY passed from the files.template call (in the "Deploy telemt
config" block that uses get_telemt_template) is currently unused, so toggling
MtprotoMold.use_middle_proxy has no effect; update
nullforge/templates/telemt/telemt.toml.j2 to consume that variable (reference
USE_MIDDLE_PROXY) and conditionally render the middle-proxy-related config (or
set a boolean/key in the file) using Jinja2 conditionals so the rendered
telemt.toml reflects MtprotoMold.use_middle_proxy, ensuring the variable name
matches the one passed from the files.template call.
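The intended behaviour can be mimicked in plain Python (the actual fix belongs in telemt.toml.j2 as a `{% if USE_MIDDLE_PROXY %}` block); the section and key names below are illustrative, not the real template's:

```python
def render_telemt_config(port: int, tls_domain: str, use_middle_proxy: bool) -> str:
    """Mimic the Jinja template: emit the middle-proxy block only when asked."""
    lines = [
        "[server]",
        f'listen = "0.0.0.0:{port}"',
        f'tls_domain = "{tls_domain}"',
    ]
    if use_middle_proxy:
        # Corresponds to a {% if USE_MIDDLE_PROXY %} ... {% endif %} guard
        # in the Jinja template.
        lines += ["", "[middle_proxy]", "enabled = true"]
    return "\n".join(lines) + "\n"
```

Once the template consumes the variable this way, toggling `MtprotoMold.use_middle_proxy` actually changes the rendered config.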

@wlix13 wlix13 force-pushed the feature/initial-version branch from cb67837 to 0dfef69 Compare April 8, 2026 20:42
@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (5)
nullforge/runes/mtproto.py (2)

154-158: ⚠️ Potential issue | 🟠 Major

Don't treat “binary exists” as “desired artifact installed.”

These guards return before comparing the installed binary with the artifact selected by Versions(host). MTG pin changes—and any future TeleMT artifact changes—never reconcile on already-provisioned machines, so upgrades and downgrades stay manual.

Also applies to: 232-236

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 154 - 158, The current guard in
_install_mtg (checking host.get_fact(File, MTG_EXEC)) treats "file exists" as
"correct artifact installed" so changes to Versions(host) never trigger
reinstall; update _install_mtg to compare the installed binary's
version/metadata against the desired artifact from Versions(host) and only
return early if they match: retrieve the desired artifact/version via
Versions(host) (or the same lookup used elsewhere), inspect the existing
MTG_EXEC (e.g., checksum, embedded version, or stored installed-artifact marker)
and if it differs, proceed to download/install the selected release; apply the
same fix to the analogous check in the other block referenced (lines 232-236) so
both installation paths reconcile with Versions(host).
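One way to reconcile, sketched with a SHA-256 comparison; `needs_install` and the pinned digest are hypothetical, standing in for whatever marker `Versions(host)` actually exposes:

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Digest of the currently installed binary."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_install(installed: Path, desired_sha256: str) -> bool:
    """Reinstall when the binary is absent *or* differs from the pinned artifact."""
    if not installed.exists():
        return True
    return file_sha256(installed) != desired_sha256
```

With this shape, changing the pin in `Versions(host)` flips `needs_install` to `True` on already-provisioned machines, so upgrades and downgrades reconcile on the next apply instead of staying manual.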

191-202: ⚠️ Potential issue | 🟠 Major

Restart MTProto services only when the config or unit actually changed.

systemd.service(..., restarted=True) here is non-idempotent in pyinfra: once the service is running, every apply will bounce it. That needlessly drops active proxy connections; capture the config template operations and gate the restart on config/service template changes instead of restarting unconditionally.

In pyinfra 3.x, does `systemd.service(..., restarted=True)` restart a running service on every deploy, and what is the recommended way to restart only when a `files.template(...)` operation changed?

Also applies to: 222-229, 264-275, 293-300

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/runes/mtproto.py` around lines 191 - 202, The files.template calls
(e.g., Deploy mtg config using get_mtg_template("config.toml.j2") with vars
SECRET/PORT/ENABLE_IPV6) are currently followed by systemd.service(...,
restarted=True) which unconditionally restarts the unit on every apply; change
the flow to capture the result of files.template (assign to a var like
config_result) and only call systemd.service with restarted=True when that
result indicates a change (i.e., config_result.changed is true), repeating this
pattern for the other template/unit pairs referenced (lines around 222-229,
264-275, 293-300) so the service is restarted idempotently only when the
template or unit actually changed.
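The gating pattern can be sketched standalone; `OperationMeta` here is a stub for pyinfra's operation result (whether `.changed` can be read inline this way depends on the pyinfra version, which is what the embedded question above is probing):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OperationMeta:
    changed: bool  # stub for the result pyinfra's files.template returns


def apply_config(template_result: OperationMeta, restart: Callable[[], None]) -> bool:
    """Bounce the service only when the rendered config actually changed."""
    if template_result.changed:
        restart()
        return True
    return False
```

Repeating this for each config/unit template pair keeps `running=True` and `enabled=True` unconditional while making the restart itself idempotent.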
nullforge/smithy/versions.py (2)

84-90: ⚠️ Potential issue | 🟠 Major

Reject unsupported TeleMT architectures here instead of emitting a bogus URL.

arch_id() in nullforge/smithy/arch.py, Lines 12-28, returns unknown values unchanged. Line 88 reuses that raw value, so _install_telemt() later gets a fabricated download URL instead of an immediate, clear error for unsupported hosts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` around lines 84 - 90, telemt_tar currently uses
arch_id(self.host) and falls back to the raw arch value which produces an
invalid URL for unsupported architectures; change telemt_tar to validate the
returned arch_id against the known mapping (e.g., keys "x86_64" and "arm64") and
raise a clear exception (ValueError or a specific UnsupportedArchError) when
arch_id returns an unsupported value so that _install_telemt fails immediately
with a clear error rather than attempting to download a bogus URL.

105-116: ⚠️ Potential issue | 🟠 Major

Honor the configured eza version.

Line 56 merges per-host version overrides, but this helper never reads self.versions["eza"] and always downloads releases/latest. Any declared eza version is ignored, so new hosts drift from inventory as upstream publishes new releases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/smithy/versions.py` around lines 105 - 116, The eza_tar helper
always uses the GitHub "latest" URL and ignores host configuration; change it to
read the configured version via self.versions.get("eza", "latest") (or similar)
and build the release URL using that version (use "releases/download/{version}"
instead of "releases/latest/download"); keep the existing arch mapping in
eza_tar and substitute the new base URL variable into the returned f-strings in
eza_tar so declared eza versions are honored.
nullforge/molds/mtproto.py (1)

21-38: ⚠️ Potential issue | 🟠 Major

Validate provider-specific MTProto settings in the mold.

Line 21 still leaves port unbounded, and Lines 26-38 still allow provider=mtg with an empty secret or provider=telemt with empty tls_domain / users. The users values are also unconstrained even though the field description says they must be 32 hex chars. Because nullforge/runes/mtproto.py, Lines 33-43, starts tearing down the other provider before these values are used, a bad inventory entry can still turn a healthy host into a failed deploy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nullforge/molds/mtproto.py` around lines 21 - 38, The mold currently allows
invalid MTProto configs: constrain port to valid TCP range (1–65535) by updating
the port Field, enforce non-empty secret when provider == "mtg" and enforce
non-empty tls_domain and non-empty users when provider == "telemt", and validate
each users value to be exactly 32 hex chars (e.g., via a regex
/^[0-9a-fA-F]{32}$/) instead of unconstrained dict entries; implement these
checks in the mtproto mold using Field constraints and a pydantic
validator/root_validator (referencing the port, secret, tls_domain, users fields
and add a validator like validate_provider) so bad inventory entries are
rejected early.
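The constraints read naturally as model-level validation; below is a plain-Python sketch (the real mold would express this with pydantic `Field(ge=1, le=65535)` plus a model validator), with the 32-hex rule taken from the field description:

```python
import re

HEX32 = re.compile(r"^[0-9a-fA-F]{32}$")


def validate_mtproto_mold(
    provider: str, port: int, secret: str, tls_domain: str, users: dict
) -> None:
    """Reject invalid inventory entries before any deploy logic runs."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be in 1-65535, got {port}")
    if provider == "mtg" and not secret:
        raise ValueError("secret is required when provider=mtg")
    if provider == "telemt":
        if not tls_domain or not users:
            raise ValueError("tls_domain and at least one user are required when provider=telemt")
        for name, user_secret in users.items():
            if not HEX32.fullmatch(user_secret):
                raise ValueError(f"user {name!r}: secret must be exactly 32 hex chars")
```

Because this runs at model-construction time, the teardown-before-validation hazard in nullforge/runes/mtproto.py never gets a chance to fire.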
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@nullforge/molds/mtproto.py`:
- Around line 21-38: The mold currently allows invalid MTProto configs:
constrain port to valid TCP range (1–65535) by updating the port Field, enforce
non-empty secret when provider == "mtg" and enforce non-empty tls_domain and
non-empty users when provider == "telemt", and validate each users value to be
exactly 32 hex chars (e.g., via a regex /^[0-9a-fA-F]{32}$/) instead of
unconstrained dict entries; implement these checks in the mtproto mold using
Field constraints and a pydantic validator/root_validator (referencing the port,
secret, tls_domain, users fields and add a validator like validate_provider) so
bad inventory entries are rejected early.

In `@nullforge/runes/mtproto.py`:
- Around line 154-158: The current guard in _install_mtg (checking
host.get_fact(File, MTG_EXEC)) treats "file exists" as "correct artifact
installed" so changes to Versions(host) never trigger reinstall; update
_install_mtg to compare the installed binary's version/metadata against the
desired artifact from Versions(host) and only return early if they match:
retrieve the desired artifact/version via Versions(host) (or the same lookup
used elsewhere), inspect the existing MTG_EXEC (e.g., checksum, embedded
version, or stored installed-artifact marker) and if it differs, proceed to
download/install the selected release; apply the same fix to the analogous check
in the other block referenced (lines 232-236) so both installation paths
reconcile with Versions(host).
- Around line 191-202: The files.template calls (e.g., Deploy mtg config using
get_mtg_template("config.toml.j2") with vars SECRET/PORT/ENABLE_IPV6) are
currently followed by systemd.service(..., restarted=True) which unconditionally
restarts the unit on every apply; change the flow to capture the result of
files.template (assign to a var like config_result) and only call
systemd.service with restarted=True when that result indicates a change (i.e.,
config_result.changed is true), repeating this pattern for the other
template/unit pairs referenced (lines around 222-229, 264-275, 293-300) so the
service is restarted idempotently only when the template or unit actually
changed.

In `@nullforge/smithy/versions.py`:
- Around line 84-90: telemt_tar currently uses arch_id(self.host) and falls back
to the raw arch value which produces an invalid URL for unsupported
architectures; change telemt_tar to validate the returned arch_id against the
known mapping (e.g., keys "x86_64" and "arm64") and raise a clear exception
(ValueError or a specific UnsupportedArchError) when arch_id returns an
unsupported value so that _install_telemt fails immediately with a clear error
rather than attempting to download a bogus URL.
- Around line 105-116: The eza_tar helper always uses the GitHub "latest" URL
and ignores host configuration; change it to read the configured version via
self.versions.get("eza", "latest") (or similar) and build the release URL using
that version (use "releases/download/{version}" instead of
"releases/latest/download"); keep the existing arch mapping in eza_tar and
substitute the new base URL variable into the returned f-strings in eza_tar so
declared eza versions are honored.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 193a3dd2-68a1-44d5-a166-386311c92d97

📥 Commits

Reviewing files that changed from the base of the PR and between cb67837 and 0dfef69.

📒 Files selected for processing (11)
  • nullforge/foundry/full_cast.py
  • nullforge/models/mtproto.py
  • nullforge/molds/__init__.py
  • nullforge/molds/features.py
  • nullforge/molds/mtproto.py
  • nullforge/molds/utils.py
  • nullforge/runes/mtproto.py
  • nullforge/smithy/versions.py
  • nullforge/templates/__init__.py
  • nullforge/templates/systemd/telemt.service.j2
  • nullforge/templates/telemt/telemt.toml.j2
✅ Files skipped from review due to trivial changes (4)
  • nullforge/models/mtproto.py
  • nullforge/templates/systemd/telemt.service.j2
  • nullforge/templates/telemt/telemt.toml.j2
  • nullforge/templates/__init__.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • nullforge/molds/__init__.py
  • nullforge/molds/features.py
  • nullforge/foundry/full_cast.py
  • nullforge/molds/utils.py


Labels

feature New feature or request needs review Pull request needs code review priority: high High priority issue or PR
