chore: clean public repo surfaces#62
Conversation
There was a problem hiding this comment.
Code Review
This pull request updates documentation links in the README, removes an obsolete blog post, and introduces automated tests to ensure the integrity of the public documentation surface. New tests verify that local markdown links resolve correctly and that all files in the docs/posts directory are properly referenced. Feedback focused on improving the robustness of the link extraction logic to handle titles and angle brackets, correctly resolving repository-root-relative paths, and replacing simple string inclusion checks with a more precise link-to-file mapping to prevent false positives.
| return [...new Set( | ||
| [...matches] | ||
| .map((match) => match[1].trim()) | ||
| .map((target) => target.replace(/^<|>$/g, "")) | ||
| .map((target) => target.split("#")[0]?.split("?")[0] ?? "") | ||
| .filter(Boolean), | ||
| )].sort(); |
There was a problem hiding this comment.
The current link extraction logic does not account for Markdown links that include titles (e.g., [text](url "title")) or those wrapped in angle brackets (e.g., [text](<url>)). The match[1] will capture the entire string inside the parentheses, which will cause access() to fail when verifying the path. Additionally, it's safer to extract the URL part before splitting by fragments or queries.
return [...new Set(
[...matches]
.map((match) => {
const raw = match[1].trim();
// Handle links with titles or wrapped in angle brackets
return raw.startsWith("<") ? raw.split(">")[0].slice(1) : raw.split(/\s+/)[0];
})
.map((target) => target.split("#")[0]?.split("?")[0] ?? "")
.filter(Boolean),
)].sort();| const targets = extractLocalMarkdownLinks(contents); | ||
|
|
||
| for (const target of targets) { | ||
| const resolvedPath = path.resolve(path.dirname(path.join(ROOT_DIR, relativePath)), target); |
There was a problem hiding this comment.
Links starting with / (repo-root-relative) are not handled correctly by path.resolve when combined with a directory path. In Node.js, path.resolve(dir, '/path') resolves to the system root, not the repository root. These should be resolved relative to ROOT_DIR.
const resolvedPath = target.startsWith("/")
? path.join(ROOT_DIR, target)
: path.resolve(path.dirname(path.join(ROOT_DIR, relativePath)), target);| const surfaceFiles = await listPublicMarkdownSurfaces(); | ||
| const surfaceContents = await Promise.all(surfaceFiles.map((relativePath) => readRepoFile(relativePath))); | ||
| const referenceText = surfaceContents.join("\n"); | ||
|
|
||
| for (const postFile of postFiles) { | ||
| const linked = | ||
| referenceText.includes(`docs/posts/${postFile}`) || | ||
| referenceText.includes(`./docs/posts/${postFile}`) || | ||
| referenceText.includes(postFile); | ||
|
|
||
| assert.equal(linked, true, `docs/posts/${postFile} must be linked from a public markdown surface`); | ||
| } |
There was a problem hiding this comment.
The current check for unlinked posts uses a simple String.prototype.includes() on a concatenated string of all surface contents. This is prone to false positives (e.g., matching a filename in a sentence or a code block) and is inefficient as it re-reads and joins all files. It is more robust to reuse the link extraction and resolution logic to verify that each post file is actually the target of a resolved link.
const surfaceFiles = await listPublicMarkdownSurfaces();
const allResolvedTargets = new Set();
for (const file of surfaceFiles) {
const contents = await readRepoFile(file);
const targets = extractLocalMarkdownLinks(contents);
for (const target of targets) {
const resolvedPath = target.startsWith("/")
? path.join(ROOT_DIR, target)
: path.resolve(path.dirname(path.join(ROOT_DIR, file)), target);
allResolvedTargets.add(resolvedPath);
}
}
for (const postFile of postFiles) {
const fullPostPath = path.join(postDir, postFile);
assert.ok(allResolvedTargets.has(fullPostPath), `docs/posts/${postFile} must be linked from a public markdown surface`);
}
Summary
Test Plan