Installer reports failure on macOS when WeasyPrint deps missing (false positive masks successful install) #29

@localevolution

Environment

  • macOS 15.5 (Darwin 25.5.0, arm64)
  • Python 3.14.5
  • Installer ref: v1.9.6-codex.5 (default)
  • System libs glib/pango/cairo not present (typical fresh Mac)

Symptom

bash install.sh exits non-zero with a NoneType traceback from the diagnostic helper:

[INFO] Bootstrapping Python runtime...
[ERROR] Codex SEO runtime bootstrap failed.
Traceback (most recent call last):
  File "<string>", line 10, in <module>
    notes = payload.get("verification", {}).get("notes", [])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'

The install actually completed: venv built, core/visual/google/ocr requirements installed, Playwright Chromium installed. Only WeasyPrint (optional PDF reports) failed because libgobject is missing. The installer just can't tell.

Root cause (three stacked bugs)

1. scripts/verify_environment.py:49 — WeasyPrint pollutes stdout at import time

DEPENDENCIES includes ("weasyprint", "weasyprint"). check_dependency calls importlib.import_module("weasyprint"). On macOS without GLib, WeasyPrint writes a multi-line warning to stdout as an import side effect, e.g.:

-----

WeasyPrint could not import some external libraries. Please carefully follow the installation steps before reporting an issue:
https://doc.courtbouillon.org/weasyprint/stable/first_steps.html#installation
https://doc.courtbouillon.org/weasyprint/stable/first_steps.html#troubleshooting 

-----

That text is emitted before the script prints its JSON, so the captured stdout is <weasyprint warning> + <json>.
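The effect on the parser can be reproduced without WeasyPrint installed. This is a minimal sketch: the warning text is abridged and the JSON payload is a stand-in, not the verifier's real output.

```python
import json

# Abridged stand-in for the text WeasyPrint prints at import time.
warning = (
    "-----\n\n"
    "WeasyPrint could not import some external libraries.\n\n"
    "-----\n"
)

# Hypothetical verifier report; the real payload has more fields.
report = json.dumps({"core_ready": True, "full_ready": False})

# What the bootstrap step captures: warning first, JSON second.
captured_stdout = warning + report

try:
    json.loads(captured_stdout)
    parsed_ok = True
except json.JSONDecodeError:
    # The leading non-JSON text makes the whole string unparseable.
    parsed_ok = False

print(parsed_ok)
```

The JSON itself is valid; only the text in front of it causes the failure.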

2. scripts/bootstrap_environment.py:91-98: parse_json_stdout can't handle mixed output

def parse_json_stdout(step: dict[str, Any]) -> dict[str, Any] | None:
    if not step["ok"] or not step["stdout"].strip():
        return None
    try:
        return json.loads(step["stdout"])
    except json.JSONDecodeError:
        return None

json.loads raises JSONDecodeError on the polluted stdout, so parse_json_stdout returns None. bootstrap_environment then sets verification = None, computes core_ready = bool(None and ...) = False, and reports ok: False even though every required step succeeded.

3. install.sh:52: payload.get("verification", {}) returns None, not {}

notes = payload.get("verification", {}).get("notes", [])

The default in dict.get(key, default) only fires when the key is missing. The bootstrap payload sets "verification": None explicitly, so payload.get("verification", {}) returns None and the next .get crashes, masking the real diagnostics the helper was supposed to surface.
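A short reproduction of the difference between a missing key and a key set to None. The two payload dicts are illustrative stand-ins, not the real bootstrap payload.

```python
# Key absent: the default is used.
payload_missing = {}
default_used = payload_missing.get("verification", {})

# Key present with an explicit None: the default is ignored.
payload_none = {"verification": None}
default_ignored = payload_none.get("verification", {})

# Chaining a second .get on the None value reproduces the crash.
try:
    default_ignored.get("notes", [])
    error = None
except AttributeError as exc:
    error = str(exc)

print(default_used, default_ignored, error)
```

The captured message matches the AttributeError in the traceback above.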

Suggested fixes

Fix 1 — Stop importing WeasyPrint inside the verifier (or silence its stdout)

Option A: Move WeasyPrint out of the import-loop check and probe it with a subprocess, redirecting stdout/stderr.

Option B: Wrap the WeasyPrint import in a contextlib.redirect_stdout(io.StringIO()) block inside check_dependency for that package.

Option C: Defer the WeasyPrint dependency check to a dedicated function that captures both streams.

Minimal Option B patch:

import contextlib
import importlib
import io
from typing import Any

def check_dependency(module_name: str, package_name: str) -> dict[str, Any]:
    try:
        # Swallow anything the module prints at import time so it cannot precede the JSON report.
        with contextlib.redirect_stdout(io.StringIO()), contextlib.redirect_stderr(io.StringIO()):
            module = importlib.import_module(module_name)
        version = getattr(module, "__version__", None)
        return {"package": package_name, "module": module_name, "ok": True, "version": version}
    except Exception as exc:
        return {"package": package_name, "module": module_name, "ok": False, "error": str(exc)}
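For comparison, Option A could look roughly like the sketch below. The helper name probe_import and the timeout value are assumptions, not part of the existing scripts. Running the import in a child interpreter keeps its output away from the verifier's stdout however it is written, including writes that bypass sys.stdout and that redirect_stdout would not catch.

```python
import subprocess
import sys


def probe_import(module_name: str, timeout: float = 60.0) -> bool:
    """Return True if `module_name` imports cleanly in a child interpreter.

    Hypothetical helper: the name and the timeout default are illustrative.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module_name}"],
            stdout=subprocess.DEVNULL,  # discard import-time noise
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,  # a failed import is a normal result, not an error
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

The trade-off is one extra interpreter start-up per probed module, and the version string is not available unless the child prints it back.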

Fix 2 — Make parse_json_stdout tolerant of leading noise

def parse_json_stdout(step: dict[str, Any]) -> dict[str, Any] | None:
    if not step["ok"] or not step["stdout"].strip():
        return None
    text = step["stdout"]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back: try each "{" in turn and parse from there to the end of the output.
    start = text.find("{")
    while start != -1:
        try:
            return json.loads(text[start:])
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    return None

Fix 3 — Defensive .get chain in install.sh

dict.get(k, default) returns default only when k is absent. The bootstrap payload sets verification: None explicitly, so guard with or {}:

# install.sh — print_bootstrap_diagnostics
notes = (payload.get("verification") or {}).get("notes", [])

Same pattern applies to any other .get(..., {}).get(...) chains where the value could be None.

Verification

After applying the three fixes, bash install.sh on a fresh Mac without GLib should:

  1. Complete bootstrap and report [OK] Codex SEO installed successfully!
  2. Emit [WARN] Optional bootstrap groups failed: report (or similar) to flag the missing PDF capability
  3. Leave a working venv that passes verify_environment.py --json with core_ready: true, full_ready: false

Workaround for users hitting this today

The install actually works. Run:

~/.codex/skills/seo/.venv/bin/python ~/.codex/skills/seo/scripts/verify_environment.py

If core_ready: YES, the skill is usable. For PDF reports: brew install pango glib cairo libffi gdk-pixbuf then reinstall weasyprint in the venv.
