forge-content: multi-platform content shortcode replacing github-content #7

Merged: Arty2 merged 1 commit into master from claude/forge-content on Apr 29, 2026
Arty2 commented Apr 29, 2026

Why

forge-meta (PR #2) generalised the GitHub metadata block to multi-forge. The matching content-fetching shortcode (github-content.html) hadn't been migrated — it was still GitHub-only, and unlike its renderer-side cousin it had no HTML sanitisation. A README from a public repo could ship <script> / <iframe> / onerror= straight into the consumer site's published HTML because Goldmark's unsafe = true (configured for inline-HTML support in shortcodes) passes raw HTML through.

This PR replaces it with a multi-platform shortcode and adds two new security layers on top of the existing shortcode-delimiter neutralisation.

What

Shared resolver. New layouts/partials/forge-resolve.html owns parsing of Forge / legacy Github values, host detection, platform auto-detection (github.com / gitlab.com / codeberg.org / params.forgeContent.gitlabHosts), and label resolution. Returns a dict via return. forge-meta.html is refactored to call it; the new shortcode calls it the same way. The two templates can no longer drift on detection logic.
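
The resolution rules described above can be sketched in Python. This is an illustration only, not the Hugo partial itself: the names `resolve_forge`, `Forge`, and `KNOWN_PLATFORMS` are hypothetical, the mapping of codeberg.org to the `forgejo` platform is an assumption, and so are the segment-count test for legacy values and the error raised for unknown hosts.

```python
from typing import NamedTuple, Optional, Sequence

# Hosts named in the PR description. Mapping codeberg.org to "forgejo"
# is an assumption made for this sketch.
KNOWN_PLATFORMS = {
    "github.com": "github",
    "gitlab.com": "gitlab",
    "codeberg.org": "forgejo",
}


class Forge(NamedTuple):
    host: str
    owner: str
    repo: str
    platform: str


def resolve_forge(value: str,
                  platform: Optional[str] = None,
                  gitlab_hosts: Sequence[str] = ()) -> Forge:
    """Resolve a unified or legacy repository value (hypothetical helper)."""
    parts = value.strip("/").split("/")
    if len(parts) == 2:
        # Legacy form "owner/repo": GitHub is assumed.
        host, owner, repo = "github.com", parts[0], parts[1]
    elif len(parts) >= 3:
        # Unified form "host/owner/repo". Treating extra middle segments
        # as a nested owner path is an assumption of this sketch.
        host, owner, repo = parts[0], "/".join(parts[1:-1]), parts[-1]
    else:
        raise ValueError(f"cannot parse repository value: {value!r}")

    if platform is None:
        # gitlab_hosts stands in for params.forgeContent.gitlabHosts.
        platform = "gitlab" if host in gitlab_hosts else KNOWN_PLATFORMS.get(host)
    if platform is None:
        # How the real partial treats unknown hosts is not stated in the PR.
        raise ValueError(f"unknown host {host!r}; pass platform explicitly")
    return Forge(host, owner, repo, platform)
```

Under these assumptions, `resolve_forge("owner/repo")` yields a GitHub result, while a self-hosted GitLab instance is recognised only when its host is listed in `gitlab_hosts` or the platform is given explicitly.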

{{< forge-content >}} (new shortcode) accepts:

| param | default | effect |
| --- | --- | --- |
| repository | (required) | host/owner/repo (unified) or owner/repo (legacy GitHub). |
| branch | master | Git ref. |
| path | (empty) | File path. Empty: GitHub uses /readme endpoint; GitLab + Forgejo probe README.md, README, readme.md in order. |
| platform | auto-detect | Override (github / gitlab / forgejo). |
| unsafe | false | Allow <svg> / <math> from trusted sources. |

Three-layer security before markdownify:

  1. Shortcode delimiter neutralisation (carried forward verbatim from the legacy shortcode). {{<, {{%, >}}, %}} replaced with full-width lookalikes — prevents server-side template injection.
  2. HTML tag denylist (new). A single replaceRE entity-escapes opening + closing forms of: script style iframe frame frameset noframes object embed applet form input button textarea select option optgroup fieldset legend link meta base noscript, plus svg / math unless unsafe="true". Disallowed tags render as visible escaped text — they cannot execute.
  3. Dangerous attribute strip (new). Event handlers (on*=...), javascript: URIs in href / src / xlink:href, and IE-era style="...expression(..." are stripped wherever they appear; the containing tag survives but loses the attack vector.

This is a denylist, not a strict GFM allowlist — sufficient for typical READMEs, conservative on dangerous primitives. Documented in the README's new "Forge content" section.
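
To make the three passes concrete, here is a minimal Python sketch of the same idea. It is not the shortcode's code: the real implementation uses Hugo's `replaceRE`, and every regular expression, function name, and the exact full-width replacement characters below are assumptions chosen for illustration.

```python
import re

# Tag list copied from the description above.
DENIED_TAGS = ("script style iframe frame frameset noframes object embed "
               "applet form input button textarea select option optgroup "
               "fieldset legend link meta base noscript").split()

# Hypothetical patterns; the shortcode's real expressions may differ.
EVENT_HANDLER = re.compile(
    r"""\s+on[a-z]+\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)
JS_URI = re.compile(
    r"""\s+(?:href|src|xlink:href)\s*=\s*(?:"\s*javascript:[^"]*"|'\s*javascript:[^']*'|javascript:[^\s>]*)""",
    re.IGNORECASE)
CSS_EXPRESSION = re.compile(
    r"""\s+style\s*=\s*("[^"]*expression\([^"]*"|'[^']*expression\([^']*')""",
    re.IGNORECASE)


def neutralise_delimiters(text: str) -> str:
    # Pass 1. Assumption: braces become full-width U+FF5B / U+FF5D.
    for old, new in (("{{<", "｛｛<"), ("{{%", "｛｛%"),
                     (">}}", ">｝｝"), ("%}}", "%｝｝")):
        text = text.replace(old, new)
    return text


def escape_denied_tags(text: str, unsafe: bool = False) -> str:
    # Pass 2. Entity-escape opening and closing forms of denied tags.
    tags = DENIED_TAGS if unsafe else DENIED_TAGS + ["svg", "math"]
    pattern = re.compile(r"<(/?(?:%s)\b[^>]*)>" % "|".join(tags), re.IGNORECASE)
    return pattern.sub(r"&lt;\1&gt;", text)


def sanitise(markdown: str, unsafe: bool = False) -> str:
    text = neutralise_delimiters(markdown)
    text = escape_denied_tags(text, unsafe)
    # Pass 3. Strip dangerous attributes; the containing tag survives.
    for pattern in (EVENT_HANDLER, JS_URI, CSS_EXPRESSION):
        text = pattern.sub("", text)
    return text


print(sanitise('<img src="x.png" onerror="alert(1)">'))
print(sanitise("<script>alert(1)</script>"))
```

Because the passes are plain regular expressions over text rather than an HTML parser, they share the limits of any denylist: unusual quoting or malformed markup may slip past patterns like these.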

Hard break (per user direction). github-content.html is deleted rather than wrapped. Sites using {{< github-content >}} must rename to {{< forge-content >}} and prefix the host (e.g. repository="github.com/owner/repo"). Migration is one find-and-replace.
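
For sites with many content files, the rename and host prefix could be scripted. The following Python sketch is one hypothetical way to do it; it assumes the legacy calls use a named `repository="owner/repo"` parameter and the `{{< ... >}}` delimiter form, and the function name `migrate` is invented for this example.

```python
import re

# Matches a legacy shortcode call and captures its argument string.
SHORTCODE = re.compile(r"\{\{<\s*github-content\b(?P<args>.*?)>\}\}", re.DOTALL)
# Matches only two-segment values, so already-prefixed values are left alone.
REPOSITORY = re.compile(r'repository="(?P<value>[^"/]+/[^"/]+)"')


def migrate(source: str) -> str:
    """Rename github-content to forge-content and prefix github.com."""
    def rewrite(match: re.Match) -> str:
        args = REPOSITORY.sub(
            lambda m: 'repository="github.com/%s"' % m.group("value"),
            match.group("args"),
        )
        return "{{< forge-content" + args + ">}}"

    return SHORTCODE.sub(rewrite, source)


legacy = '{{< github-content repository="owner/repo" branch="main" >}}'
print(migrate(legacy))
```

Running it over a copy of the content directory first, and reviewing the diff, is a sensible precaution before committing the result.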

Other changes:

  • i18n/{en,el}.yaml: github-error → forge-error (no GitHub-specific text).
  • README.md: new "Forge content" section; shortcode list updated.
  • exampleSite/content/posts/github-demo/index.md: migrated as the live smoke test.

Test plan

  • cd exampleSite && hugo --gc builds cleanly; no DEPRECATION warnings; demo post renders.
  • When the API fetch fails (sandbox 403), shortcode emits the forge-error fallback link instead of erroring out.
  • Refactored forge-meta.html is byte-equivalent to master on pages with Forge: front-matter (same code path, just extracted).
  • CI verification — once GitHub Actions runs in a network-enabled environment, the demo post fetches the upstream README; inspect rendered HTML to confirm:
    • shortcode delimiters in fetched content (if any) appear as full-width characters
    • any <script> / <iframe> / <form> etc. in the README appear as &lt;script&gt; etc.
    • any onerror= / javascript: URIs are absent from the rendered output
  • Manual: try a GitLab repo (repository="gitlab.com/group/project") and a Codeberg repo to exercise the probe.
  • Migrate any downstream sites that used {{< github-content >}} (rename + add host prefix).

Independent of PR #5 (polish-pass) and PR #6 (print-stylesheet) — branches off master.


Generated by Claude Code

Pairs with `forge-meta`. The README-inlining responsibility that was
single-platform (`{{< github-content >}}`) is now multi-platform and
hardened against XSS as well as the prior shortcode-injection vector.

Shared platform resolution
--------------------------
`layouts/partials/forge-resolve.html` (NEW) owns the unified-vs-legacy
forge value parsing, host detection, platform auto-detection
(github.com / gitlab.com / codeberg.org / configured `gitlabHosts`),
and label resolution. Returns a dict via `return`. `forge-meta.html`
is refactored to call it (its sections 1–4 collapse to one partial
invocation); `forge-content.html` calls it the same way. Now the two
templates can never drift on platform detection.

forge-content shortcode
-----------------------
`layouts/shortcodes/forge-content.html` (NEW) accepts:

  repository  required: "host/owner/repo" (unified) or "owner/repo"
              (legacy, GitHub assumed)
  branch      default "master"
  path        optional file path; empty fetches README
  platform    optional override
  unsafe      bool; allows <svg>/<math> from trusted sources

Per-platform fetch:
  - GitHub: dedicated `/readme` endpoint when `path` is empty
  - GitLab + Forgejo: probe README.md, README, readme.md in order
    via the standard `/contents/{name}` endpoint; first hit wins.
    Hugo's daily cache key is reused so the probe costs at most 3
    requests per repo per build day.
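
The probe order can be sketched as a small Python loop. This is an illustration of the first-hit-wins behaviour only: `probe_readme` and the injected `fetch` callable are hypothetical, and the real shortcode performs the requests through Hugo's remote resource fetching and cache rather than anything shown here.

```python
from typing import Callable, Optional, Tuple

# Candidate names, in the order given above.
README_CANDIDATES = ("README.md", "README", "readme.md")


def probe_readme(fetch: Callable[[str], Optional[str]]) -> Optional[Tuple[str, str]]:
    """Return (name, body) for the first candidate that exists, else None.

    `fetch` is a stand-in for the per-platform request: it takes a file
    name and returns the file body, or None when the file is missing.
    """
    for name in README_CANDIDATES:
        body = fetch(name)
        if body is not None:
            return name, body  # first hit wins; later names are not requested
    return None


# Usage with an in-memory stand-in for a repository.
files = {"README": "plain readme"}
print(probe_readme(files.get))
```

In the worst case the loop issues one request per candidate, which matches the stated ceiling of three requests per repo.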

Three-layer security
--------------------
Untrusted remote markdown passes through three filters before
`markdownify`:

  1. Hugo shortcode delimiter neutralisation (carried forward
     verbatim from the legacy shortcode): `{{<` / `{{%` / `>}}` / `%}}`
     replaced with full-width lookalikes `｛｛` / `｝｝`. Prevents
     server-side template injection against the consumer site.

  2. HTML tag denylist (NEW). `replaceRE` entity-escapes opening and
     closing forms of:
       script style iframe frame frameset noframes object embed
       applet form input button textarea select option optgroup
       fieldset legend link meta base noscript
     Plus svg / math unless `unsafe="true"` (allowlisted only for
     trusted sources). Disallowed tags render as visible escaped
     text in the published HTML — they cannot execute or load
     remote resources.

  3. Dangerous attribute strip (NEW). Event handlers (`on*=...`),
     `javascript:` URIs in `href` / `src` / `xlink:href`, and IE-era
     `style="...expression(..."` are stripped wherever they appear.
     The containing tag survives but loses the attack vector.

This is a denylist, not a strict GFM allowlist. It blocks the
dangerous tags GFM rejects but does not enforce GFM's full
allowlist. Sufficient for typical READMEs from trusted maintainers;
for syndicating attacker-controlled content, run a dedicated
sanitiser as a build step.

Compatibility
-------------
Per user direction, this is a hard break — `github-content.html` is
deleted rather than wrapped. Sites still using `{{< github-content >}}`
will fail to build until they migrate (rename + add host prefix to
`repository`). The exampleSite demo post is migrated.

Other changes
-------------
- `i18n/en.yaml` + `i18n/el.yaml`: `github-error` → `forge-error`
  (more generic message, no GitHub-specific text).
- `README.md`: new "Forge content" section under the modifier-docs
  cluster, documenting all parameters, the per-platform behaviour
  (GitHub auto-readme vs. probe), and the three security passes;
  shortcode list updated.
- `exampleSite/content/posts/github-demo/index.md`: title +
  description updated; shortcode call migrated to forge-content.
Arty2 merged commit eb599c4 into master on Apr 29, 2026
2 checks passed