Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 8 additions & 8 deletions .apm/agents/auth-expert.agent.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,35 +23,35 @@ If a code change contradicts the mermaid diagram, the diagram (and matching doc

## Core Knowledge

- **Token prefixes**: Fine-grained PATs (`github_pat_`), classic PATs (`ghp_`), OAuth user-to-server (`ghu_` e.g. `gh auth login`), OAuth app (`gho_`), GitHub App install (`ghs_`), GitHub App refresh (`ghr_`)
- **EMU (Enterprise Managed Users)**: Use standard PAT prefixes (`ghp_`, `github_pat_`). There is NO special prefix for EMU it's a property of the account, not the token. EMU tokens are enterprise-scoped and cannot access public github.com repos. EMU orgs can exist on github.com or *.ghe.com.
- **Token prefixes**: Fine-grained PATs (`github_pat_`), classic PATs (`ghp_`), OAuth user-to-server (`ghu_` -- e.g. `gh auth login`), OAuth app (`gho_`), GitHub App install (`ghs_`), GitHub App refresh (`ghr_`)
- **EMU (Enterprise Managed Users)**: Use standard PAT prefixes (`ghp_`, `github_pat_`). There is NO special prefix for EMU -- it's a property of the account, not the token. EMU tokens are enterprise-scoped and cannot access public github.com repos. EMU orgs can exist on github.com or *.ghe.com.
- **Host classification**: github.com (public), *.ghe.com (no public repos), GHES (`GITHUB_HOST`), ADO
- **Git credential helpers**: macOS Keychain, Windows Credential Manager, `gh auth`, `git credential fill`
- **Rate limiting**: 60/hr unauthenticated, 5000/hr authenticated, primary (403) vs secondary (429)

## APM Architecture

- **AuthResolver** (`src/apm_cli/core/auth.py`): Single source of truth. Per-(host, org) resolution. Frozen `AuthContext` for thread safety.
- **Token precedence**: `GITHUB_APM_PAT_{ORG}` `GITHUB_APM_PAT` `GITHUB_TOKEN` `GH_TOKEN` `git credential fill`
- **Token precedence**: `GITHUB_APM_PAT_{ORG}` -> `GITHUB_APM_PAT` -> `GITHUB_TOKEN` -> `GH_TOKEN` -> `git credential fill`
- **Fallback chains**: unauth-first for validation (save rate limits), auth-first for download
- **GitHubTokenManager** (`src/apm_cli/core/token_manager.py`): Low-level token lookup, wrapped by AuthResolver

## Decision Framework

When reviewing or writing auth code:

1. **Every remote operation** must go through AuthResolver no direct `os.getenv()` for tokens
1. **Every remote operation** must go through AuthResolver -- no direct `os.getenv()` for tokens
2. **Per-dep resolution**: Use `resolve_for_dep(dep_ref)`, never `self.github_token` instance vars
3. **Host awareness**: Global env vars are checked for all hosts (no host-gating). `try_with_fallback()` retries with `git credential fill` if the token is rejected. HTTPS is the transport security boundary. *.ghe.com and ADO always require auth (no unauthenticated fallback).
4. **Error messages**: Always use `build_error_context()` never hardcode env var names
4. **Error messages**: Always use `build_error_context()` -- never hardcode env var names
5. **Thread safety**: AuthContext is resolved before `executor.submit()`, passed per-worker

## Common Pitfalls

- EMU PATs on public github.com repos will fail silently (you cannot detect EMU from prefix)
- EMU PATs on public github.com repos -> will fail silently (you cannot detect EMU from prefix)
- `git credential fill` only resolves per-host, not per-org
- `_build_repo_url` must accept token param, not use instance var
- Windows: `GIT_ASKPASS` must be `'echo'` not empty string
- Classic PATs (`ghp_`) work cross-org but are being deprecated prefer fine-grained
- ADO uses Basic auth with base64-encoded `:PAT` different from GitHub bearer token flow
- Classic PATs (`ghp_`) work cross-org but are being deprecated -- prefer fine-grained
- ADO uses Basic auth with base64-encoded `:PAT` -- different from GitHub bearer token flow
- ADO also supports AAD bearer tokens via `az account get-access-token` (resource `499b84ac-1321-427f-aa17-267ca6975798`); precedence is `ADO_APM_PAT` -> az bearer -> fail. Stale PATs (401) silently fall back to the bearer with a `[!]` warning. See the auth skill for the four diagnostic cases.
4 changes: 2 additions & 2 deletions .apm/skills/auth/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ name: auth
description: >
Activate when code touches token management, credential resolution, git auth
flows, GITHUB_APM_PAT, ADO_APM_PAT, AuthResolver, HostInfo, AuthContext, or
any remote host authentication even if 'auth' isn't mentioned explicitly.
any remote host authentication -- even if 'auth' isn't mentioned explicitly.
---

# Auth Skill
Expand Down Expand Up @@ -35,7 +35,7 @@ ADO hosts (`dev.azure.com`, `*.visualstudio.com`) resolve auth in this order:
2. AAD bearer via `az account get-access-token --resource 499b84ac-1321-427f-aa17-267ca6975798` if `az` is installed and `az account show` succeeds
3. Otherwise: auth-failed error from `build_error_context`

Token source constants live in `src/apm_cli/core/token_manager.py`: `ADO_APM_PAT = "ADO_APM_PAT"`, `ADO_BEARER_SOURCE = "AAD_BEARER_AZ_CLI"`.
`ADO_APM_PAT` is the env var name used by the auth flow. The AAD bearer source constant lives in `src/apm_cli/core/token_manager.py` as `GitHubTokenManager.ADO_BEARER_SOURCE = "AAD_BEARER_AZ_CLI"`.

**Stale-PAT silent fallback:** if `ADO_APM_PAT` is rejected with HTTP 401, APM retries with the az bearer and emits:

Expand Down
17 changes: 9 additions & 8 deletions .github/agents/auth-expert.agent.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,34 +23,35 @@ If a code change contradicts the mermaid diagram, the diagram (and matching doc

## Core Knowledge

- **Token prefixes**: Fine-grained PATs (`github_pat_`), classic PATs (`ghp_`), OAuth user-to-server (`ghu_` e.g. `gh auth login`), OAuth app (`gho_`), GitHub App install (`ghs_`), GitHub App refresh (`ghr_`)
- **EMU (Enterprise Managed Users)**: Use standard PAT prefixes (`ghp_`, `github_pat_`). There is NO special prefix for EMU it's a property of the account, not the token. EMU tokens are enterprise-scoped and cannot access public github.com repos. EMU orgs can exist on github.com or *.ghe.com.
- **Token prefixes**: Fine-grained PATs (`github_pat_`), classic PATs (`ghp_`), OAuth user-to-server (`ghu_` -- e.g. `gh auth login`), OAuth app (`gho_`), GitHub App install (`ghs_`), GitHub App refresh (`ghr_`)
- **EMU (Enterprise Managed Users)**: Use standard PAT prefixes (`ghp_`, `github_pat_`). There is NO special prefix for EMU -- it's a property of the account, not the token. EMU tokens are enterprise-scoped and cannot access public github.com repos. EMU orgs can exist on github.com or *.ghe.com.
- **Host classification**: github.com (public), *.ghe.com (no public repos), GHES (`GITHUB_HOST`), ADO
- **Git credential helpers**: macOS Keychain, Windows Credential Manager, `gh auth`, `git credential fill`
- **Rate limiting**: 60/hr unauthenticated, 5000/hr authenticated, primary (403) vs secondary (429)

## APM Architecture

- **AuthResolver** (`src/apm_cli/core/auth.py`): Single source of truth. Per-(host, org) resolution. Frozen `AuthContext` for thread safety.
- **Token precedence**: `GITHUB_APM_PAT_{ORG}` `GITHUB_APM_PAT` `GITHUB_TOKEN` `GH_TOKEN` `git credential fill`
- **Token precedence**: `GITHUB_APM_PAT_{ORG}` -> `GITHUB_APM_PAT` -> `GITHUB_TOKEN` -> `GH_TOKEN` -> `git credential fill`
- **Fallback chains**: unauth-first for validation (save rate limits), auth-first for download
- **GitHubTokenManager** (`src/apm_cli/core/token_manager.py`): Low-level token lookup, wrapped by AuthResolver

## Decision Framework

When reviewing or writing auth code:

1. **Every remote operation** must go through AuthResolver no direct `os.getenv()` for tokens
1. **Every remote operation** must go through AuthResolver -- no direct `os.getenv()` for tokens
2. **Per-dep resolution**: Use `resolve_for_dep(dep_ref)`, never `self.github_token` instance vars
3. **Host awareness**: Global env vars are checked for all hosts (no host-gating). `try_with_fallback()` retries with `git credential fill` if the token is rejected. HTTPS is the transport security boundary. *.ghe.com and ADO always require auth (no unauthenticated fallback).
4. **Error messages**: Always use `build_error_context()` never hardcode env var names
4. **Error messages**: Always use `build_error_context()` -- never hardcode env var names
5. **Thread safety**: AuthContext is resolved before `executor.submit()`, passed per-worker

## Common Pitfalls

- EMU PATs on public github.com repos will fail silently (you cannot detect EMU from prefix)
- EMU PATs on public github.com repos -> will fail silently (you cannot detect EMU from prefix)
- `git credential fill` only resolves per-host, not per-org
- `_build_repo_url` must accept token param, not use instance var
- Windows: `GIT_ASKPASS` must be `'echo'` not empty string
- Classic PATs (`ghp_`) work cross-org but are being deprecated — prefer fine-grained
- ADO uses Basic auth with base64-encoded `:PAT` — different from GitHub bearer token flow
- Classic PATs (`ghp_`) work cross-org but are being deprecated -- prefer fine-grained
- ADO uses Basic auth with base64-encoded `:PAT` -- different from GitHub bearer token flow
- ADO also supports AAD bearer tokens via `az account get-access-token` (resource `499b84ac-1321-427f-aa17-267ca6975798`); precedence is `ADO_APM_PAT` -> az bearer -> fail. Stale PATs (401) silently fall back to the bearer with a `[!]` warning. See the auth skill for the four diagnostic cases.
33 changes: 32 additions & 1 deletion .github/skills/auth/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ name: auth
description: >
Activate when code touches token management, credential resolution, git auth
flows, GITHUB_APM_PAT, ADO_APM_PAT, AuthResolver, HostInfo, AuthContext, or
any remote host authentication even if 'auth' isn't mentioned explicitly.
any remote host authentication -- even if 'auth' isn't mentioned explicitly.
---

# Auth Skill
Expand All @@ -26,3 +26,34 @@ All auth flows MUST go through `AuthResolver`. No direct `os.getenv()` for token
## Canonical reference

The full per-org -> global -> credential-fill -> fallback resolution flow is in [`docs/src/content/docs/getting-started/authentication.md`](../../../docs/src/content/docs/getting-started/authentication.md) (mermaid flowchart). Treat it as the single source of truth; if behavior diverges, fix the diagram in the same PR.

## Bearer-token authentication for ADO

ADO hosts (`dev.azure.com`, `*.visualstudio.com`) resolve auth in this order:

1. `ADO_APM_PAT` env var if set
2. AAD bearer via `az account get-access-token --resource 499b84ac-1321-427f-aa17-267ca6975798` if `az` is installed and `az account show` succeeds
3. Otherwise: auth-failed error from `build_error_context`

`ADO_APM_PAT` is the env var name used by the auth flow. The AAD bearer source constant lives in `src/apm_cli/core/token_manager.py` as `GitHubTokenManager.ADO_BEARER_SOURCE = "AAD_BEARER_AZ_CLI"`.

**Stale-PAT silent fallback:** if `ADO_APM_PAT` is rejected with HTTP 401, APM retries with the az bearer and emits:

```
[!] ADO_APM_PAT was rejected for {host} (HTTP 401); fell back to az cli bearer.
[!] Consider unsetting the stale variable.
```

**Verbose source line** (one per host, emitted under `--verbose`):

```
[i] dev.azure.com -- using bearer from az cli (source: AAD_BEARER_AZ_CLI)
[i] dev.azure.com -- token from ADO_APM_PAT
```

**Diagnostic cases** (`_emit_stale_pat_diagnostic` + `build_error_context` in `src/apm_cli/core/auth.py`):

1. No PAT, no `az`: `No ADO_APM_PAT was set and az CLI is not installed.` -> install `az`, run `az login --tenant <tenant>`, or set `ADO_APM_PAT`.
2. No PAT, `az` not signed in: `az CLI is installed but no active session was found.` -> run `az login --tenant <tenant>` against the tenant that owns the org, or set `ADO_APM_PAT`.
3. No PAT, wrong tenant: `az CLI returned a token but the org does not accept it (likely a tenant mismatch).` -> run `az login --tenant <correct-tenant>`, or set `ADO_APM_PAT`.
4. PAT 401, no `az` fallback: `ADO_APM_PAT was rejected (HTTP 401) and no az cli fallback was available.` -> rotate the PAT, or install `az` and run `az login --tenant <tenant>`.
42 changes: 42 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -96,3 +96,45 @@ jobs:
include-hidden-files: true
retention-days: 30
if-no-files-found: error

# Dogfood the two CI gates we ship and document to users:
# - Gate A (consumer-side): `apm audit --ci` -- lockfile / install fidelity.
# - Gate B (producer-side): regeneration drift -- did someone hand-edit
# a regenerated file under .github/ without updating canonical .apm/?
# See microsoft/apm#883 for context. Tier 1 (no secrets needed).
apm-self-check:
name: APM Self-Check
runs-on: ubuntu-24.04
permissions:
contents: read
steps:
- uses: actions/checkout@v4

# Installs the APM CLI (latest stable) and runs `apm install` against
# this repo's apm.yml. Auto-detects target from the existing .github/
# directory and re-integrates local .apm/ content, regenerating
# .github/instructions/, .github/agents/, .github/skills/, etc.
# Adds `apm` to PATH for subsequent steps.
- uses: microsoft/apm-action@v1

# Gate A: lockfile / install fidelity (consumer-side).
# Verifies every file in lockfile.deployed_files exists, ref consistency
# between apm.yml and apm.lock.yaml, no orphan packages, and
# content-integrity (hidden Unicode) on deployed package content.
# Does NOT verify deployed-file content vs lockfile (see #684).
- name: apm audit --ci
run: apm audit --ci

# Gate B: regeneration drift (producer-side).
# The action's `apm install` step re-integrated local .apm/ into
# .github/ via target auto-detection. If anything in the governed
# integration directories changed, someone edited the regenerated
# output without updating the canonical .apm/ source.
- name: Check APM integration drift
run: |
if [ -n "$(git status --porcelain -- .github/ .claude/ .cursor/ .opencode/)" ]; then
echo "::error::APM integration files are out of date."
echo "Run 'apm install' locally (with .github/ present) and commit the result."
git --no-pager diff -- .github/ .claude/ .cursor/ .opencode/
exit 1
fi
6 changes: 3 additions & 3 deletions .github/workflows/merge-gate.yml
Original file line number Diff line number Diff line change
Expand Up @@ -90,11 +90,11 @@ jobs:
SHA: ${{ steps.sha.outputs.sha }}
# All PR-time checks the gate aggregates. Keep this in sync with
# the underlying workflows. Currently only ci.yml emits PR-time
# checks ('Build & Test (Linux)'); ci-integration.yml is
# merge_group-only and is NOT polled here.
# checks ('Build & Test (Linux)', 'APM Self-Check');
# ci-integration.yml is merge_group-only and is NOT polled here.
# NOTE: 'gate' (this job) MUST NOT appear here -- it would
# deadlock waiting for itself.
EXPECTED_CHECKS: 'Build & Test (Linux)'
EXPECTED_CHECKS: 'Build & Test (Linux),APM Self-Check'
TIMEOUT_MIN: '30'
POLL_SEC: '30'
run: |
Expand Down
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- CI: add `APM Self-Check` to `ci.yml` for `apm audit --ci`, regeneration-drift validation, and `merge-gate.yml` `EXPECTED_CHECKS` coverage. (#885)

### Changed

- CI: smoke tests in `build-release.yml`'s `build-and-test` job (Linux x86_64, Linux arm64, Windows) are now gated to promotion boundaries (tag/schedule/dispatch) instead of running on every push to main. Push-time smoke duplicated the merge-time smoke gate in `ci-integration.yml` and burned ~15 redundant codex-binary downloads/day. Tag-cut releases still run smoke as a pre-ship gate; nightly catches upstream codex URL drift; merge-time still gates merges into main. (#878)
Expand Down
4 changes: 4 additions & 0 deletions docs/src/content/docs/integrations/ci-cd.md
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,10 @@ To ensure `.github/`, `.claude/`, `.cursor/`, and `.opencode/` integration files

This catches cases where a developer updates `apm.yml` but forgets to re-run `apm install`.

:::tip[We dogfood this]
APM's own repo uses the `APM Self-Check` job in [`microsoft/apm`'s `ci.yml`](https://github.com/microsoft/apm/blob/main/.github/workflows/ci.yml) as a reference implementation for installing APM, running CI validation commands such as `apm audit --ci`, and checking for drift with `git status --porcelain`. Use it as a practical example when wiring these checks into your own workflow.
:::

## Azure Pipelines

```yaml
Expand Down
Loading