Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 60 additions & 1 deletion packages/control-plane/src/auth/github-app.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
*/

import type { InstallationRepository } from "@open-inspect/shared";
import { createLogger } from "../logger";

/** Timeout for individual GitHub API requests (ms). */
export const GITHUB_FETCH_TIMEOUT_MS = 60_000;
Expand All @@ -24,6 +25,7 @@ export const INSTALLATION_TOKEN_MIN_REMAINING_MS = 5 * 60 * 1000;
export const INSTALLATION_TOKEN_CACHE_MAX_TTL_SECONDS = 3600;

const INSTALLATION_TOKEN_CACHE_KEY_PREFIX = "github:installation-token:v1";
const logger = createLogger("github-app");

interface InstallationTokenCacheBindings {
REPOS_CACHE?: KVNamespace;
Expand Down Expand Up @@ -490,10 +492,41 @@ export async function listInstallationRepositories(
}
}

const roundedTokenGenerationMs = Math.round(tokenGenerationMs * 100) / 100;
const sampleRepos = allRepos.slice(0, 5).map((repo) => repo.fullName);
const installationContext = {
app_id: config.appId,
installation_id: config.installationId,
};

logger.info("github.installation_repos.listed", {
...installationContext,
repository_selection: first.data.repository_selection,
total_count: totalCount,
total_pages: totalPages,
total_repos: allRepos.length,
token_generation_ms: roundedTokenGenerationMs,
page_timings: pageTiming,
sample_repos: sampleRepos,
Comment on lines +495 to +510
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Do not log private repo or branch names by default.

sample_repos and sample_branches can contain confidential names from private GitHub installations. Counts/timings are useful without exposing names; gate samples behind an explicit debug flag or emit hashes/counts only.

🔒 Proposed safer diagnostics
-  const sampleRepos = allRepos.slice(0, 5).map((repo) => repo.fullName);
+  const sampleRepoCount = Math.min(allRepos.length, 5);
@@
-    sample_repos: sampleRepos,
+    sample_repo_count: sampleRepoCount,
@@
-  const branchSample = branches.slice(0, 10).map((branch) => branch.name);
+  const sampleBranchCount = Math.min(branches.length, 10);
@@
-    sample_branches: branchSample,
+    sample_branch_count: sampleBranchCount,

Also applies to: 646-655

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/control-plane/src/auth/github-app.ts` around lines 495 - 510, The
log currently emits plain sample_repos (and similarly sample_branches elsewhere)
which can leak private names; change the logger.info call in
github.installation_repos.listed to avoid raw names by default: remove
sample_repos/sample_branches and instead emit counts (e.g., total_repos_sampled:
sampleRepos.length) and/or deterministic non-reversible fingerprints (e.g.,
sha256 hashes) only when an explicit opt-in debug flag is set (e.g.,
config.emitRepoSamples or logger.isDebugEnabled). Update the code that builds
sampleRepos/sampleBranches (used around the logger.info block and the block at
lines ~646-655) to compute either hashed values or skip populating them unless
the debug flag is true, and ensure the public keys in installationContext remain
unchanged.

});

if (allRepos.length === 0) {
logger.warn("github.installation_repos.empty", {
...installationContext,
repository_selection: first.data.repository_selection,
total_count: totalCount,
total_pages: totalPages,
diagnosis:
first.data.repository_selection === "selected"
? "installation_has_no_selected_repositories_or_wrong_installation_id"
: "installation_has_no_accessible_repositories",
});
}

return {
repos: allRepos,
timing: {
tokenGenerationMs: Math.round(tokenGenerationMs * 100) / 100,
tokenGenerationMs: roundedTokenGenerationMs,
pages: pageTiming,
totalPages,
totalRepos: allRepos.length,
Expand Down Expand Up @@ -576,9 +609,11 @@ export async function listRepositoryBranches(
const token = await getCachedInstallationToken(config, env);
const branches: { name: string }[] = [];
let page = 1;
const pageTimings: Array<{ page: number; fetchMs: number; branchCount: number }> = [];

// Paginate through branches (100 per page, cap at 500)
while (branches.length < 500) {
const pageStart = performance.now();
const response = await fetchWithTimeout(
`https://api.github.com/repos/${owner}/${repo}/branches?per_page=100&page=${page}`,
{
Expand All @@ -598,11 +633,35 @@ export async function listRepositoryBranches(

const data = (await response.json()) as { name: string }[];
branches.push(...data.map((b) => ({ name: b.name })));
pageTimings.push({
page,
fetchMs: Math.round((performance.now() - pageStart) * 100) / 100,
branchCount: data.length,
});

if (data.length < 100) break;
page++;
}

const branchSample = branches.slice(0, 10).map((branch) => branch.name);
logger.info("github.repository_branches.listed", {
app_id: config.appId,
installation_id: config.installationId,
repo_owner: owner,
repo_name: repo,
total_branches: branches.length,
page_timings: pageTimings,
sample_branches: branchSample,
});
if (branches.length === 0) {
logger.warn("github.repository_branches.empty", {
app_id: config.appId,
installation_id: config.installationId,
repo_owner: owner,
repo_name: repo,
});
}

return branches;
}

Expand Down
57 changes: 56 additions & 1 deletion packages/control-plane/src/routes/repos.ts
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,13 @@ async function refreshReposCache(env: Env, traceId?: string): Promise<void> {
logger.info("Repo fetch completed", {
trace_id: traceId,
total_repos: repos.length,
sample_repos: repos.slice(0, 5).map((repo) => repo.fullName),
});
if (repos.length === 0) {
logger.warn("Repo fetch returned no repositories", {
trace_id: traceId,
});
}
Comment on lines 51 to +60
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid duplicating private repo and branch names in route logs.

The new sample_repos / sample_branches fields can leak private repository and branch names to centralized logs. Since counts and cache state already diagnose empty/stale behavior, prefer sample counts or hashes unless explicit debug logging is enabled.

🔒 Proposed safer fields
-      sample_repos: repos.slice(0, 5).map((repo) => repo.fullName),
+      sample_repo_count: Math.min(repos.length, 5),
@@
-      sample_repos: cached.repos.slice(0, 5).map((repo) => repo.fullName),
+      sample_repo_count: Math.min(cached.repos.length, 5),
@@
-    sample_repos: repos.slice(0, 5).map((repo) => repo.fullName),
+    sample_repo_count: Math.min(repos.length, 5),
@@
-    sample_repos: enrichedRepos.slice(0, 5).map((repo) => repo.fullName),
+    sample_repo_count: Math.min(enrichedRepos.length, 5),
@@
-      sample_branches: branches.slice(0, 10).map((branch) => branch.name),
+      sample_branch_count: Math.min(branches.length, 10),

Also applies to: 145-158, 192-201, 236-247, 352-365

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/control-plane/src/routes/repos.ts` around lines 51 - 60, The logs
currently include sensitive repo and branch names via the sample_repos and
sample_branches fields (see logger.info usages that map
repos.slice(0,5).map(repo => repo.fullName) and similar branch mappings); change
these logger payloads to avoid plain names by emitting only counts, boolean
cache/stale flags, or non-reversible hashes (or indices) instead, and ensure any
name-level sampling is gated behind an explicit debug flag (e.g., if (isDebug) {
... }) so production logging of logger.info with traceId and repos/branches uses
safe fields like total_repos, sample_count, or hashed_sample_ids rather than
repo.fullName or branch names.

} catch (e) {
if (e instanceof SourceControlProviderError && e.errorType === "permanent" && !e.httpStatus) {
logger.warn("SCM provider not configured, skipping repo refresh", {
Expand Down Expand Up @@ -136,6 +142,21 @@ async function handleListRepos(
if (cached) {
const isFresh = cached.freshUntil && Date.now() < cached.freshUntil;

logger.info("Serving repos cache", {
trace_id: ctx.trace_id,
cache_state: isFresh ? "fresh" : "stale",
cached_at: cached.cachedAt,
repo_count: cached.repos.length,
sample_repos: cached.repos.slice(0, 5).map((repo) => repo.fullName),
});
if (cached.repos.length === 0) {
logger.warn("Repos cache returned no repositories", {
trace_id: ctx.trace_id,
cache_state: isFresh ? "fresh" : "stale",
cached_at: cached.cachedAt,
});
}

if (!isFresh && ctx.executionCtx) {
// Stale — serve immediately but refresh in background
logger.info("Serving stale repos cache, refreshing in background", {
Expand Down Expand Up @@ -171,7 +192,13 @@ async function handleListRepos(
logger.info("Repo fetch completed", {
trace_id: ctx.trace_id,
total_repos: repos.length,
sample_repos: repos.slice(0, 5).map((repo) => repo.fullName),
});
if (repos.length === 0) {
logger.warn("Repo fetch returned no repositories", {
trace_id: ctx.trace_id,
});
}

const metadataStore = new RepoMetadataStore(env.DB);
let metadataMap: Map<string, RepoMetadata>;
Expand Down Expand Up @@ -206,6 +233,19 @@ async function handleListRepos(
logger.warn("Failed to cache repos list", { error: e instanceof Error ? e : String(e) });
}

logger.info("Returning repos response", {
trace_id: ctx.trace_id,
cached: false,
repo_count: enrichedRepos.length,
sample_repos: enrichedRepos.slice(0, 5).map((repo) => repo.fullName),
});
if (enrichedRepos.length === 0) {
logger.warn("Repos response contains no repositories", {
trace_id: ctx.trace_id,
cached: false,
});
}

return json({
repos: enrichedRepos,
cached: false,
Expand Down Expand Up @@ -300,7 +340,7 @@ async function handleListBranches(
_request: Request,
env: Env,
match: RegExpMatchArray,
_ctx: RequestContext
ctx: RequestContext
): Promise<Response> {
const params = extractRepoParams(match);
if (params instanceof Response) return params;
Expand All @@ -309,6 +349,20 @@ async function handleListBranches(
try {
const provider = createRouteSourceControlProvider(env);
const branches = await provider.listBranches({ owner, name });
logger.info("Branches fetched", {
trace_id: ctx.trace_id,
repo_owner: owner,
repo_name: name,
branch_count: branches.length,
sample_branches: branches.slice(0, 10).map((branch) => branch.name),
});
if (branches.length === 0) {
logger.warn("Branch fetch returned no branches", {
trace_id: ctx.trace_id,
repo_owner: owner,
repo_name: name,
});
}
return json({ branches });
} catch (e) {
if (e instanceof SourceControlProviderError && e.errorType === "permanent" && !e.httpStatus) {
Expand All @@ -318,6 +372,7 @@ async function handleListBranches(
error: e instanceof Error ? e : String(e),
repo_owner: owner,
repo_name: name,
trace_id: ctx.trace_id,
});
return error("Failed to list branches", 500);
}
Expand Down
44 changes: 43 additions & 1 deletion packages/web/src/app/api/repos/[owner]/[name]/branches/route.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,19 +10,61 @@ export async function GET(
) {
const session = await getServerSession(authOptions);
if (!session?.user) {
console.warn("[branches] unauthorized branch list request");
return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
}

const { owner, name } = await params;
const requester = session.user.login ?? session.user.email ?? "unknown";

console.info("[branches] fetching branches from control plane", {
requester,
repo: `${owner}/${name}`,
});
Comment on lines +18 to +23
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Redact requester and branch-name samples from web logs.

requester can fall back to a full email, and sampleBranches exposes private branch names. Keep the useful diagnostics while logging requester type/counts instead of raw identifiers and names.

🔒 Proposed safer payloads
-  const requester = session.user.login ?? session.user.email ?? "unknown";
+  const requesterType = session.user.login ? "github_login" : session.user.email ? "email" : "unknown";
@@
-    requester,
+    requesterType,
     repo: `${owner}/${name}`,
@@
-        requester,
+        requesterType,
         repo: `${owner}/${name}`,
         branchCount: branches.length,
-        sampleBranches: branches.slice(0, 10).map((branch) => branch.name),
+        sampleBranchCount: Math.min(branches.length, 10),
@@
-          requester,
+          requesterType,
           repo: `${owner}/${name}`,
@@
-        requester,
+        requesterType,
         repo: `${owner}/${name}`,

Also applies to: 41-59

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/app/api/repos/`[owner]/[name]/branches/route.ts around lines
18 - 23, The logs currently emit raw identifiers (requester) and branch names
(sampleBranches); update the logging in the branches route to redact those
values: derive a requester type label from session.user (e.g., "github_user",
"email_user", "unknown") instead of logging the actual login/email, and when
logging branch info from sampleBranches/log payloads only emit safe diagnostics
such as branch count and either masked branch-name samples or a
hash/first-character-and-length summary (no full names). Locate and change the
logging around the requester variable and any uses of sampleBranches in the
route handler (the requester variable and the branch-sampling/logging block
referenced in the file) to implement these redactions while preserving counts
and context.


try {
const response = await controlPlaneFetch(
`/repos/${encodeURIComponent(owner)}/${encodeURIComponent(name)}/branches`
);
console.info("[branches] control plane response received", {
requester,
repo: `${owner}/${name}`,
status: response.status,
ok: response.ok,
});
const data = await response.json();
if (response.ok) {
const branches =
typeof data === "object" && data !== null && "branches" in data
? ((data as { branches?: Array<{ name: string }> }).branches ?? [])
: [];
console.info("[branches] branches loaded", {
requester,
repo: `${owner}/${name}`,
branchCount: branches.length,
sampleBranches: branches.slice(0, 10).map((branch) => branch.name),
});
if (branches.length === 0) {
console.warn("[branches] branch list is empty", {
requester,
repo: `${owner}/${name}`,
});
}
} else {
console.error("[branches] control plane API error", {
requester,
repo: `${owner}/${name}`,
status: response.status,
data,
});
}
return NextResponse.json(data, { status: response.status });
} catch (error) {
console.error("Failed to fetch branches:", error);
console.error("[branches] unexpected error fetching branches", {
requester,
repo: `${owner}/${name}`,
error,
});
return NextResponse.json({ error: "Failed to fetch branches" }, { status: 500 });
}
}
39 changes: 37 additions & 2 deletions packages/web/src/app/api/repos/route.ts
Original file line number Diff line number Diff line change
Expand Up @@ -13,17 +13,34 @@ interface ControlPlaneReposResponse {
export async function GET() {
const session = await getServerSession(authOptions);
if (!session) {
console.warn("[repos] unauthorized repository list request");
return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
}

const requester = session.user?.login ?? session.user?.email ?? "unknown";

console.info("[repos] fetching repositories from control plane", {
requester,
});
Comment on lines +20 to +24
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid logging requester identifiers and private repo samples.

requester can be a full email, and sampleRepos can expose private repository names. Use requester type plus counts/cache metadata, or hash identifiers if correlation is required.

🔒 Proposed safer payloads
-  const requester = session.user?.login ?? session.user?.email ?? "unknown";
+  const requesterType = session.user?.login ? "github_login" : session.user?.email ? "email" : "unknown";
@@
-    requester,
+    requesterType,
@@
-      requester,
+      requesterType,
       repoCount: data.repos.length,
       cached: data.cached,
       cachedAt: data.cachedAt,
-      sampleRepos: data.repos.slice(0, 5).map((repo) => repo.fullName),
+      sampleRepoCount: Math.min(data.repos.length, 5),
@@
-        requester,
+        requesterType,
         cached: data.cached,

Also applies to: 52-65

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/app/api/repos/route.ts` around lines 20 - 24, The current
console.info calls log sensitive identifiers (requester) and repository names
(sampleRepos); update the logging in the route to avoid PII by replacing direct
requester values with a non-identifying attribute (e.g., requesterType or a
hashed/salted identifier) and by replacing sampleRepos payloads with safe
metadata (e.g., sampleRepos.length, cache status, and optionally anonymized or
hashed repo IDs) in the console.info statements that reference requester and
sampleRepos; locate and change the usages of the variables requester and
sampleRepos and any console.info/console.log that emits them so logs contain
only counts, cache metadata, or hashed identifiers instead of raw emails/repo
names.


try {
// Fetch repositories from control plane using GitHub App installation token.
// This ensures we only show repos the App has access to, not all repos the user can see.
const response = await controlPlaneFetch("/repos");

console.info("[repos] control plane response received", {
requester,
status: response.status,
ok: response.ok,
});

if (!response.ok) {
const error = await response.text();
console.error("Control plane API error:", error);
console.error("[repos] control plane API error", {
requester,
status: response.status,
error,
});
return NextResponse.json(
{ error: "Failed to fetch repositories" },
{ status: response.status }
Expand All @@ -32,10 +49,28 @@ export async function GET() {

const data: ControlPlaneReposResponse = await response.json();

console.info("[repos] repositories loaded", {
requester,
repoCount: data.repos.length,
cached: data.cached,
cachedAt: data.cachedAt,
sampleRepos: data.repos.slice(0, 5).map((repo) => repo.fullName),
});
if (data.repos.length === 0) {
console.warn("[repos] repository list is empty", {
requester,
cached: data.cached,
cachedAt: data.cachedAt,
});
}

// The control plane returns repos in the format we need
return NextResponse.json({ repos: data.repos });
} catch (error) {
console.error("Error fetching repos:", error);
console.error("[repos] unexpected error fetching repositories", {
requester,
error,
});
return NextResponse.json({ error: "Internal server error" }, { status: 500 });
}
}
28 changes: 28 additions & 0 deletions packages/web/src/lib/auth.ts
Original file line number Diff line number Diff line change
Expand Up @@ -47,12 +47,40 @@ export const authOptions: NextAuthOptions = {
};

const githubProfile = profile as { login?: string };
const githubUsername = githubProfile.login?.trim().toLowerCase();
const emailDomain = user.email?.split("@")[1]?.trim().toLowerCase();
const usernameMatch = githubUsername ? config.allowedUsers.includes(githubUsername) : false;
const domainMatch = emailDomain ? config.allowedDomains.includes(emailDomain) : false;
const allowAllMode =
config.allowedDomains.length === 0 &&
config.allowedUsers.length === 0 &&
config.unsafeAllowAllUsers;

const isAllowed = checkAccessAllowed(config, {
githubUsername: githubProfile.login,
email: user.email ?? undefined,
});

console.info("[auth] signIn access check", {
githubUsername: githubUsername ?? null,
emailDomain: emailDomain ?? null,
allowedUsers: config.allowedUsers,
allowedDomains: config.allowedDomains,
unsafeAllowAllUsers: config.unsafeAllowAllUsers,
usernameMatch,
domainMatch,
allowAllMode,
isAllowed,
});

if (!isAllowed) {
console.warn("[auth] signIn denied", {
githubUsername: githubUsername ?? null,
emailDomain: emailDomain ?? null,
allowedUsers: config.allowedUsers,
allowedDomains: config.allowedDomains,
unsafeAllowAllUsers: config.unsafeAllowAllUsers,
});
Comment on lines +64 to +83
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid logging user identifiers and allowlist contents.

These logs emit githubUsername, allowedUsers, and allowedDomains on every sign-in attempt. That exposes user identifiers and access-policy details to log sinks; keep the diagnostic booleans/counts instead, or hash identifiers before logging.

🔒 Proposed safer log payload
       console.info("[auth] signIn access check", {
-        githubUsername: githubUsername ?? null,
-        emailDomain: emailDomain ?? null,
-        allowedUsers: config.allowedUsers,
-        allowedDomains: config.allowedDomains,
+        hasGithubUsername: Boolean(githubUsername),
+        hasEmailDomain: Boolean(emailDomain),
+        allowedUserCount: config.allowedUsers.length,
+        allowedDomainCount: config.allowedDomains.length,
         unsafeAllowAllUsers: config.unsafeAllowAllUsers,
         usernameMatch,
         domainMatch,
@@
       if (!isAllowed) {
         console.warn("[auth] signIn denied", {
-          githubUsername: githubUsername ?? null,
-          emailDomain: emailDomain ?? null,
-          allowedUsers: config.allowedUsers,
-          allowedDomains: config.allowedDomains,
+          hasGithubUsername: Boolean(githubUsername),
+          hasEmailDomain: Boolean(emailDomain),
+          allowedUserCount: config.allowedUsers.length,
+          allowedDomainCount: config.allowedDomains.length,
           unsafeAllowAllUsers: config.unsafeAllowAllUsers,
         });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/lib/auth.ts` around lines 64 - 83, The logs in the sign-in
access check and denial paths in auth.ts are emitting sensitive identifiers
(githubUsername, emailDomain) and full allowlist contents (allowedUsers,
allowedDomains); update the signIn logging to avoid raw identifiers: replace
githubUsername/emailDomain with either boolean flags (e.g., usernameMatch,
domainMatch) or hashed/masked values and replace allowedUsers/allowedDomains
with counts (e.g., allowedUsersCount, allowedDomainsCount) or presence flags;
apply the same change to the warning block that logs sign-in denial and keep
other diagnostic fields (unsafeAllowAllUsers, allowAllMode, isAllowed) intact.
Use the variables seen in this diff (githubUsername, emailDomain, allowedUsers,
allowedDomains, unsafeAllowAllUsers, usernameMatch, domainMatch, allowAllMode,
isAllowed) as the targets to modify the logged payloads.

return false;
}
return true;
Expand Down
Loading