Merged
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ hardcodes `robertdelanghe.dev`, `bounded.tools`, an account, or an email.

```
integrity/ verify-site · verify (sigstore) · gen-sitemanifest · gen-provenance · structure-audit · http-probe
-gates/ sbom (gen + completeness) · shacl-runner · seo-gate · axe-gate (axe-core a11y) · vuln-gate (npm audit) · readability-gate · commonmark-runner · semantic (lone)
+gates/ sbom (gen + completeness) · shacl-runner · seo-gate · axe-gate (axe-core a11y) · vuln-gate (npm audit) · html-validator-gate (vnu) · readability-gate · commonmark-runner · semantic (lone)
gates/conformance/ conformance-report — lone's conformance() projection (Node port of jsr:@bounded-systems/lone@0.4) + a generic HTML renderer
generators/ gen-cid (IPFS UnixFS) · gen-identity (did:web + VC) · openapi (static-API helper core)
emitters/ reprDigest (RFC 9530) · securityTxt (RFC 9116) · webManifest · markdown-sibling headers
@@ -68,6 +68,7 @@ in-process verifier). The Deno semantic runner pins its imports in
| `seo-gate.mjs` | `node …/seo-gate.mjs [distDir]` | `$DIST`. Optional `$SEO_ERROR_PAGE`, `$SEO_DEPLOY_SIDECARS`. Enforces canonical/title/description uniqueness + self-consistency, robots.txt (RFC 9309), sitemap, internal links. |
| `axe-gate.mjs` | `node …/axe-gate.mjs [distDir]` | `$DIST`. Optional `$AXE_PAGES` (comma list, default: every `*.html` in dist), `$AXE_TAGS` (default `wcag2a,wcag2aa,wcag21a,wcag21aa,wcag22aa`), `$AXE_IMPACT_THRESHOLD` (`minor`/`moderate`/`serious`/`critical`, default `serious`), `$AXE_RUNNER` (`playwright` (CI, needs `playwright` + `@axe-core/playwright` + `npx playwright install chromium`) \| `tezcatl` (macOS WebKit, local)), `$AXE_REPORT` (write the JSON report). Serves dist over an ephemeral origin (so assets resolve), runs **axe-core** per page, and **fails closed** on any violation at/above the threshold. The emitted report's `axe: { serious, critical }` envelope is exactly what `conformance-report`'s `a11y.axe-serious-critical` criterion consumes — a clean run is what lets a site honestly assert it. |
| `vuln-gate.mjs` | `node …/vuln-gate.mjs [projectDir]` | `$VULN_ROOT` (lockfile lives here, default `.`). Optional `$VULN_OMIT_DEV` (`true`→production deps only, default `true`), `$VULN_THRESHOLD` (highest tolerated known critical/high, default `0`), `$VULN_REPORT` (write the JSON report). Runs **`npm audit`** and **fails closed** when the known critical/high count exceeds the threshold. The report's `vulns: { knownCriticalOrHighVulns }` envelope is what `conformance-report`'s `security.no-critical-vulns` criterion consumes. |
| `html-validator-gate.mjs` | `node …/html-validator-gate.mjs [distDir]` | `$HTML_DIST`. Optional `$HTML_PAGES` (comma list, default: every `*.html`), `$HTML_THRESHOLD` (default `0`), `$HTML_REPORT`. Runs **vnu** (the Nu Html Checker, a self-contained Java jar — needs a JRE) `--errors-only` over the built pages and **fails closed** above the threshold. The report's `htmlValidator: { errors }` envelope is what `conformance-report`'s `html.validator-clean` criterion consumes. |
| `readability-gate.mjs` | `node …/readability-gate.mjs <corpus.json> [--strict]` | **The corpus is an input** the site assembles from its copy: a JSON array of `{id,text}` or an `{id:text}` map. Optional `$READABILITY_THRESHOLDS`, `$READABILITY_MIN_WORDS`, `$READABILITY_KNOWN_ACRONYMS`. WARN-only unless `--strict`. |
| `commonmark-runner.mjs` | `node …/commonmark-runner.mjs <renderer.mjs> [fixtures.json]` | **The site's markdown renderer module** (export `renderMarkdown`, or set `$COMMONMARK_RENDER_EXPORT`). Default fixtures pin a safe CommonMark subset + 4 hostile-HTML escapes; a site with a different renderer supplies its own `fixtures.json`. |
| `semantic/gate.ts` | `deno run --allow-read --allow-net …/gate.ts` | Built HTML in `$SEMANTIC_DIR` (default `dist/blog`); `$SEMANTIC_SELECTOR` (subject node, default `article`). Imports `jsr:@bounded-systems/lone`; any error-severity finding fails CI. |
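
The `html-validator-gate.mjs` row above takes all of its inputs from the command line and the environment. As a rough sketch of how those inputs could be normalized before the gate runs, the helper below works on a plain `env` object. `readHtmlGateConfig` is a hypothetical name used only for illustration and is not part of the kit; the defaults mirror the ones documented in the table.

```javascript
// Hypothetical helper (not part of the kit): normalize the gate's documented
// inputs into one config object. Defaults mirror the README table.
function readHtmlGateConfig(env = {}, argvDist) {
  // A positional distDir wins over $HTML_DIST; "dist" is the fallback.
  const dist = argvDist || env.HTML_DIST || "dist";

  // $HTML_PAGES is a comma list; an empty list means "every *.html under dist".
  const pages = (env.HTML_PAGES || "")
    .split(",")
    .map((s) => s.trim().replace(/^\//, ""))
    .filter(Boolean);

  // $HTML_THRESHOLD must be a whole number >= 0; anything else is a config error.
  const raw = (env.HTML_THRESHOLD ?? "0").trim();
  if (!/^\d+$/.test(raw)) {
    throw new Error(`$HTML_THRESHOLD must be an integer >= 0 (got "${raw}")`);
  }

  return { dist, pages, threshold: Number(raw), report: env.HTML_REPORT || null };
}

console.log(readHtmlGateConfig({ HTML_PAGES: "/index.html, blog/post.html", HTML_THRESHOLD: "2" }));
```

Validating the threshold with a digits-only pattern rejects values such as `1.5` or `3abc` outright instead of truncating them, which keeps a mistyped setting from quietly loosening the gate.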
17 changes: 17 additions & 0 deletions fixtures/html/bad.html
@@ -0,0 +1,17 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Bad fixture</title>
  </head>
  <body>
    <main>
      <h1>Bad fixture</h1>
      <span>
        <ul>
          <li>a list is not allowed as a child of span; vnu errors here</li>
        </ul>
      </span>
    </main>
  </body>
</html>
13 changes: 13 additions & 0 deletions fixtures/html/good.html
@@ -0,0 +1,13 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Good fixture</title>
  </head>
  <body>
    <main>
      <h1>Good fixture</h1>
      <p>A conformant HTML5 page: zero Nu Html Checker errors.</p>
    </main>
  </body>
</html>
120 changes: 120 additions & 0 deletions gates/html-validator-gate.mjs
@@ -0,0 +1,120 @@
#!/usr/bin/env node
// HTML-validity gate — turns "the Nu HTML Checker passed once" into a
// CONTINUOUSLY-ENFORCED member of the conformance contract. It runs vnu (the Nu
// Html Checker, the reference HTML conformance checker, as a self-contained Java
// jar — headless, no browser, no network) over a project's BUILT pages and FAILS
// CLOSED (exit 1) when the error count exceeds a configurable threshold (default 0).
// The machine-readable result is exactly the shape lone's conformance() model
// consumes for `html.validator-clean` (`{ errors }`), so a clean run lets a site
// honestly assert that criterion — and a regression turns CI red.
//
// node gates/html-validator-gate.mjs [distDir] # build gate (exit 1 over threshold)
//
// Everything is config-driven; NOTHING about any one site is hard-coded:
//   argv[2] / $HTML_DIST   built output dir                     (default: "dist")
//   $HTML_PAGES            comma list of page paths under dist  (default: every *.html)
//   $HTML_THRESHOLD        highest tolerated error count        (default: 0)
//   $HTML_REPORT           path to write the JSON report        (default: none)
//
// Requires a JRE on PATH (CI: actions/setup-java; the jar ships with `vnu-jar`).
// The pure parse/evaluation functions are exported for unit testing without Java.
import { writeFile, access, readdir } from "node:fs/promises";
import { resolve, join } from "node:path";
import { createRequire } from "node:module";
import { spawnSync } from "node:child_process";

// ── Pure core (Java-free; unit-testable) ─────────────────────────────────────

/** Extract error-type messages from a vnu `--format json` payload (string or object). */
export function parseVnu(payload) {
  const json = typeof payload === "string" ? JSON.parse(payload || '{"messages":[]}') : (payload || {});
  const messages = Array.isArray(json.messages) ? json.messages : [];
  return messages.filter((m) => m && m.type === "error");
}

/** Evaluate parsed errors against the threshold. Pure: (errors[], threshold) → report. */
export function evaluateHtml(errors, threshold = 0) {
  const count = errors.length;
  return {
    passed: count <= threshold,
    threshold,
    errors: count,
    // The envelope lone's conformance() consumes for `html.validator-clean`.
    htmlValidator: { errors: count },
    // Cap the detail list at 20 entries so a badly broken build stays readable.
    detail: errors.slice(0, 20).map((e) => ({
      page: (e.url || "").replace(/^file:/, ""),
      line: e.lastLine,
      message: e.message,
    })),
  };
}

// ── Impure runner ────────────────────────────────────────────────────────────

const require = createRequire(import.meta.url);

/** Recursively collect every `*.html` file under `dir`. */
async function walkHtml(dir) {
  const out = [];
  for (const e of await readdir(dir, { withFileTypes: true })) {
    const p = join(dir, e.name);
    if (e.isDirectory()) out.push(...await walkHtml(p));
    else if (e.name.endsWith(".html")) out.push(p);
  }
  return out;
}

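/** Run vnu over the given files; returns the error-type messages. vnu writes its
 *  JSON report to stderr and exits non-zero when errors exist, so we read stderr
 *  regardless of exit code. */
export function runVnu(files) {
  const jar = String(require("vnu-jar"));
  const res = spawnSync("java", ["-jar", jar, "--errors-only", "--format", "json", ...files], {
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024,
  });
  if (res.error) throw new Error(`cannot run vnu (${res.error.message}). Is a JRE on PATH?`);
  const stderr = (res.stderr || "").trim();
  let errors;
  try {
    errors = parseVnu(stderr);
  } catch {
    // stderr was not a JSON report (for example a JVM launch failure).
    throw new Error(`vnu did not emit a JSON report (exit ${res.status}): ${stderr.slice(0, 500)}`);
  }
  // Fail closed: a non-zero exit with no error-type messages means vnu itself failed, not that the pages are clean.
  if (res.status !== 0 && errors.length === 0) {
    throw new Error(`vnu exited with status ${res.status} but reported no HTML errors: ${stderr.slice(0, 500)}`);
  }
  return errors;
}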

/** Walk → vnu → evaluate → report. Exposed for programmatic use and the kit's test. */
export async function runHtmlGate({ dist, pages, threshold = 0 }) {
  const files = pages && pages.length
    ? pages.map((p) => resolve(dist, p))
    : (await walkHtml(resolve(dist))).sort();
  const report = evaluateHtml(runVnu(files), threshold);
  report.pages = files.length;
  return report;
}

// ── CLI ──────────────────────────────────────────────────────────────────────

async function main() {
  const dist = resolve(process.argv[2] && !process.argv[2].startsWith("--") ? process.argv[2] : process.env.HTML_DIST || "dist");
  const exists = async (p) => { try { await access(p); return true; } catch { return false; } };
  if (!(await exists(dist))) { console.error(`✗ html-validator-gate: ${dist} not found; build first.`); process.exit(2); }

  // Accept only a whole number ≥ 0; parseInt alone would silently accept "1.5" or "3abc".
  const rawThreshold = (process.env.HTML_THRESHOLD ?? "0").trim();
  const threshold = /^\d+$/.test(rawThreshold) ? Number(rawThreshold) : NaN;
  if (!Number.isInteger(threshold)) {
    console.error(`✗ html-validator-gate: $HTML_THRESHOLD must be an integer ≥ 0 (got "${process.env.HTML_THRESHOLD}")`);
    process.exit(2);
  }
  const pages = (process.env.HTML_PAGES || "").split(",").map((s) => s.trim().replace(/^\//, "")).filter(Boolean);

  const report = await runHtmlGate({ dist, pages, threshold });
  if (process.env.HTML_REPORT) {
    await writeFile(resolve(process.env.HTML_REPORT), JSON.stringify(report, null, 2) + "\n");
  }

  const line = `html-validator-gate: ${report.errors} Nu HTML Checker error(s) over ${report.pages} built page(s) · threshold ${threshold}`;
  if (!report.passed) {
    console.error(`✗ ${line}`);
    for (const d of report.detail) console.error(`  ${d.page} L${d.line}: ${d.message}`);
    process.exit(1);
  }
  console.log(`✓ ${line}`);
}

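// Only run the CLI when invoked directly (not when imported by a test). Compare
// real paths as file URLs so a symlinked bin shim or a path with spaces still matches.
// (Static imports are hoisted; these two sit next to their only use.)
import { realpathSync } from "node:fs";
import { pathToFileURL } from "node:url";
if (process.argv[1] && import.meta.url === pathToFileURL(realpathSync(process.argv[1])).href) {
  main().catch((e) => { console.error("✗ html-validator-gate: error:", e.stack || e.message); process.exit(1); });
}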
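
Because `parseVnu` and `evaluateHtml` are pure, the threshold logic can be exercised without Java. The sketch below uses condensed copies of those two functions together with a hand-written payload. The message text, URLs, and line numbers in `samplePayload` are illustrative assumptions, not captured vnu output.

```javascript
// Condensed copies of the gate's pure core, so this sketch runs on its own.
function parseVnu(payload) {
  const json = typeof payload === "string" ? JSON.parse(payload || '{"messages":[]}') : (payload || {});
  const messages = Array.isArray(json.messages) ? json.messages : [];
  return messages.filter((m) => m && m.type === "error");
}

function evaluateHtml(errors, threshold = 0) {
  const count = errors.length;
  return { passed: count <= threshold, threshold, errors: count, htmlValidator: { errors: count } };
}

// Hand-written payload in the general shape of a `--format json` report
// (an assumption for illustration; real vnu messages carry more fields).
const samplePayload = JSON.stringify({
  messages: [
    { type: "error", url: "file:/dist/bad.html", lastLine: 10, message: "illustrative error" },
    { type: "info", subType: "warning", url: "file:/dist/bad.html", lastLine: 4, message: "illustrative warning" },
  ],
});

const errors = parseVnu(samplePayload);   // only the error-type message survives
const strict = evaluateHtml(errors, 0);   // default policy: any error fails
const lenient = evaluateHtml(errors, 1);  // one tolerated error passes

console.log(strict.passed, lenient.passed, strict.htmlValidator);
```

The same payload fails at the default threshold and passes at a threshold of 1, while the `htmlValidator` envelope reports the raw error count in both cases.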
16 changes: 15 additions & 1 deletion package-lock.json


4 changes: 3 additions & 1 deletion package.json
@@ -20,6 +20,7 @@
"ck-seo-gate": "./gates/seo-gate.mjs",
"ck-axe-gate": "./gates/axe-gate.mjs",
"ck-vuln-gate": "./gates/vuln-gate.mjs",
"ck-html-validator-gate": "./gates/html-validator-gate.mjs",
"ck-readability-gate": "./gates/readability-gate.mjs",
"ck-commonmark-runner": "./gates/commonmark-runner.mjs",
"ck-gen-cid": "./generators/gen-cid.mjs",
@@ -36,6 +37,7 @@
"linkedom": "^0.18.0",
"n3": "^1.17.3",
"rdf-validate-shacl": "^0.5.10",
"sigstore": "^5.0.0"
"sigstore": "^5.0.0",
"vnu-jar": "^26.6.24"
}
}
37 changes: 37 additions & 0 deletions test/run.mjs
@@ -351,6 +351,43 @@ await test("gates/vuln-gate: parse + evaluate, e2e via npm audit", async () => {
}
});

// 15. html-validator-gate: pure parse/evaluate, then a best-effort e2e via real vnu.
await test("gates/html-validator-gate: parse + evaluate, e2e on fixtures", async () => {
  const { parseVnu, evaluateHtml, runHtmlGate } = await import(join(KIT, "gates", "html-validator-gate.mjs"));

  // (a) pure parse over vnu --format json payloads (errors-only filtering).
  const errs = parseVnu({ messages: [
    { type: "error", message: "boom", url: "file:/p.html", lastLine: 9 },
    { type: "info", subType: "warning", message: "meh" },
    { type: "error", message: "bang", url: "file:/q.html", lastLine: 3 },
  ] });
  if (errs.length !== 2) throw new Error(`expected 2 error messages (info dropped), got ${errs.length}`);
  if (parseVnu('{"messages":[]}').length !== 0) throw new Error("empty payload must parse to 0");

  // (b) pure threshold evaluation + the lone evidence envelope.
  const okEval = evaluateHtml([], 0);
  if (!okEval.passed || okEval.htmlValidator.errors !== 0) throw new Error("0 errors must pass with htmlValidator {0}");
  const badEval = evaluateHtml(errs, 0);
  if (badEval.passed || badEval.htmlValidator.errors !== 2) throw new Error("2 errors at threshold 0 must fail");

  // (c) best-effort e2e on the good/bad fixtures with real vnu. A missing JRE is a
  // tolerated skip (the pure logic above is the deterministic assertion).
  const hasJava = spawnSync("java", ["-version"], { stdio: "ignore" }).status === 0;
  const fixDir = join(FIX, "html");
  try {
    if (!hasJava) throw new Error("no JRE on PATH");
    const bad = await runHtmlGate({ dist: fixDir, pages: ["bad.html"], threshold: 0 });
    if (bad.passed || bad.errors < 1) throw new Error("known-bad fixture must fail (≥1 vnu error)");
    const good = await runHtmlGate({ dist: fixDir, pages: ["good.html"], threshold: 0 });
    if (!good.passed || good.errors !== 0) throw new Error("known-good fixture must pass (0 vnu errors)");
    ok("gates/html-validator-gate: parse + evaluate, e2e on fixtures",
      `pure logic asserted · e2e (vnu): bad=${bad.errors} error(s), good=clean`);
  } catch (e) {
    // Rethrow genuine assertion failures; anything else (no JRE, vnu not runnable) is a skip.
    if (/must (pass|fail)|expected|envelope/.test(e.message)) throw e;
    ok("gates/html-validator-gate: parse + evaluate, e2e on fixtures", `pure logic asserted · e2e SKIPPED (${e.message.split("\n")[0]})`);
  }
});

await rm(work, { recursive: true, force: true });
console.log(`\n${failed ? "✗" : "✓"} conformance-kit tests: ${passed} passed, ${failed} failed`);
process.exit(failed ? 1 : 0);
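
The e2e part of the html-validator-gate test treats a missing JRE as a skip but still rethrows genuine assertion failures, and it tells the two apart by matching the error message against a regular expression. A minimal sketch of that classification follows. `classifyE2eError` is a hypothetical name introduced here; the pattern itself is copied from the test.

```javascript
// Pattern copied from the test: it matches messages raised by the test's own assertions.
const ASSERTION_PATTERN = /must (pass|fail)|expected|envelope/;

// Hypothetical helper: decide whether a caught e2e error fails the test or skips it.
function classifyE2eError(error) {
  const message = String((error && error.message) || error);
  return ASSERTION_PATTERN.test(message) ? "fail" : "skip";
}

console.log(classifyE2eError(new Error("no JRE on PATH")));                             // skip
console.log(classifyE2eError(new Error("known-bad fixture must fail (≥1 vnu error)"))); // fail
```

One consequence of matching on message text is that an unrelated runtime error whose message does not hit the pattern is also reported as a skip. Matching on an error class or code would be stricter, at the cost of a little more ceremony in the test.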