Commit 690db38

placerda and Copilot authored
docs(skill): require Cognitive Services OpenAI User as prereq RBAC role (#203) (#204)
Foundry `azure_ai_evaluator` graders impersonate the OIDC principal to call OpenAI; without `Cognitive Services OpenAI User` on the underlying AI Services account the graders fail with a 401 PermissionDenied and every cloud eval metric returns null.

Verified end-to-end on placerda/agentops-prompt-quickstart: after granting the role, the first PR run goes green from scratch.

- agentops-workflow SKILL.md: pre-dispatch checks now list both Foundry User (Foundry project) AND Cognitive Services OpenAI User (AI Services account), with role ids and az role assignment create commands for each.
- tutorial-prompt-agent-quickstart.md: step 12's Copilot prompt and the workflow-skill walkthrough list both roles.
- tutorial-end-to-end.md: both workflow-skill prompts list both roles.
- docs/ci-github-actions.md: prerequisite section lists both roles with the OpenAI graders' failure mode spelled out.
- plugins/agentops/skills/agentops-workflow/SKILL.md: synced from src/.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 723f7a8 commit 690db38

6 files changed

Lines changed: 160 additions & 49 deletions

CHANGELOG.md

Lines changed: 14 additions & 0 deletions
@@ -5,6 +5,20 @@ This format follows [Keep a Changelog](https://keepachangelog.com/) and adheres

 ## [Unreleased]

+### Changed
+- **Skill + tutorial guidance now require `Cognitive Services OpenAI User` as a prerequisite RBAC role.**
+The `agentops-workflow` skill, `tutorial-prompt-agent-quickstart.md`,
+`tutorial-end-to-end.md`, and `docs/ci-github-actions.md` now instruct users
+to grant the OIDC/CI service principal **both** Foundry User on the Foundry
+project **and** Cognitive Services OpenAI User on the underlying Azure AI
+Services account that hosts the evaluator model deployment. Foundry
+`azure_ai_evaluator` graders impersonate the OIDC principal to call OpenAI;
+without the OpenAI User role they fail with a 401 `PermissionDenied` and
+every cloud eval metric returns `null`, blocking the first PR run. The skill
+now emits the matching `az role assignment create` commands for both roles
+(role ids `53ca6127-db72-4b80-b1b0-d745d6d5456d` and
+`5e0bd9bd-7b93-4f28-af87-19fc36ad61bd`) before dispatching the workflow.
+
 ### Fixed
 - **Cloud eval surfaces grader execution errors instead of silent nulls.**
 When a Foundry `azure_ai_evaluator` grader fails to execute (most

docs/ci-github-actions.md

Lines changed: 17 additions & 3 deletions
@@ -125,9 +125,23 @@ from GitHub Actions runs. See
 [Microsoft's WIF docs](https://learn.microsoft.com/azure/active-directory/workload-identities/workload-identity-federation-create-trust?pivots=identity-wif-apps-methods-azp).

 For Foundry prompt-agent gates, the same app registration / service principal
-also needs **Foundry User** on the Foundry project or Foundry resource. Azure
-`Reader` is not enough because the eval step calls Foundry data-plane APIs such
-as `Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+needs **two** Azure RBAC roles before the first workflow run. Both are required
+and the eval step fails silently (every metric returns `null`) if only one is
+in place:
+
+- **Foundry User** on the Foundry project or Foundry resource. Azure `Reader`
+is not enough because the eval step calls Foundry data-plane APIs such as
+`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+- **Cognitive Services OpenAI User** on the underlying Azure AI Services
+account that hosts the evaluator model deployment. Foundry `azure_ai_evaluator`
+graders impersonate the OIDC principal to call OpenAI; without this role
+they fail with a 401 `PermissionDenied` on
+`Microsoft.CognitiveServices/accounts/OpenAI/deployments/chat/completions/action`
+and every metric returns `null` in the cloud eval report. AgentOps lifts that
+error into `results.json` and the orchestrator's "0 usable metric scores"
+warning so you can see the cause in CI logs, but the workflow still fails the
+gate. The role ids are `53ca6127-db72-4b80-b1b0-d745d6d5456d` (Foundry User)
+and `5e0bd9bd-7b93-4f28-af87-19fc36ad61bd` (Cognitive Services OpenAI User).

 The generated eval and doctor workflows install AgentOps telemetry support.
 When `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT` is set, AgentOps first tries to
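
Reviewer sketch for the prerequisite above: a dry-run helper that prints the two `az role assignment create` commands so they can be reviewed before anyone runs them. The role ids and flags are the ones quoted in this commit; the function names and the placeholder arguments (`<sp-object-id>`, `<foundry-scope>`, `<ai-services-account-scope>`) are illustrative assumptions, not part of AgentOps. Nothing here calls Azure.

```shell
#!/bin/sh
# Dry-run sketch: prints the two role assignments without calling Azure.
# Function names and placeholder arguments are hypothetical.

FOUNDRY_USER_ROLE_ID="53ca6127-db72-4b80-b1b0-d745d6d5456d"   # Foundry User
OPENAI_USER_ROLE_ID="5e0bd9bd-7b93-4f28-af87-19fc36ad61bd"    # Cognitive Services OpenAI User

# Print one `az role assignment create` command for review.
# $1 = service principal object id, $2 = role id, $3 = scope
print_role_assignment() {
  printf 'az role assignment create --assignee-object-id %s --assignee-principal-type ServicePrincipal --role %s --scope %s\n' "$1" "$2" "$3"
}

# Print both required assignments, Foundry User first.
# $1 = sp object id, $2 = Foundry scope, $3 = AI Services account scope
print_required_assignments() {
  print_role_assignment "$1" "$FOUNDRY_USER_ROLE_ID" "$2"
  print_role_assignment "$1" "$OPENAI_USER_ROLE_ID" "$3"
}

print_required_assignments "<sp-object-id>" "<foundry-scope>" "<ai-services-account-scope>"
```

Printing rather than executing keeps the "ask before changing Azure RBAC" rule intact: the output can be pasted into the approval request as-is.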

docs/tutorial-end-to-end.md

Lines changed: 12 additions & 6 deletions
@@ -418,8 +418,11 @@ this Foundry prompt-agent repo.
 Create or connect the GitHub repo if needed, create the `dev` environment, wire
 Azure OIDC, set AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini as a GitHub `dev`
 environment variable or equivalent Azure DevOps pipeline variable, verify the
-OIDC principal has Foundry User access, and show me the plan before changing
-GitHub or Azure.
+OIDC principal has **both** Foundry User access on the dev Foundry project
+**and** Cognitive Services OpenAI User access on the underlying Azure AI
+Services account that hosts the evaluator model (both are required — without
+the OpenAI User role, every cloud eval metric returns null), and show me the
+plan before changing GitHub or Azure.
 ```

 That value is not an `agentops init` answer. It tells the Foundry cloud eval
@@ -568,10 +571,13 @@ workflows running for this Foundry agent repo.

 Extend the PR/dev setup if it already exists, wire Azure OIDC for the `qa` and
 `production` environments, confirm required Actions variables such as
-AZURE_OPENAI_DEPLOYMENT, verify the OIDC principals have Foundry User access,
-and keep deploy placeholders unless this repo already has an azd deployment
-path. Show me the plan before changing GitHub or Azure, and call out anything
-that needs owner/admin permission.
+AZURE_OPENAI_DEPLOYMENT, verify the OIDC principals have **both** Foundry User
+access on each Foundry project **and** Cognitive Services OpenAI User on the
+underlying AI Services account hosting the evaluator model (both are required
+— without the OpenAI User role, every cloud eval metric returns null), and
+keep deploy placeholders unless this repo already has an azd deployment path.
+Show me the plan before changing GitHub or Azure, and call out anything that
+needs owner/admin permission.
 ```

 Use this moment in the video to connect the four repos: Foundry Toolkit creates
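
Reviewer sketch for the second prompt above, which asks for the same two role checks on the `qa` and `production` principals. It prints, per environment, the two read-back commands using the `az role assignment list` form quoted in the workflow skill. The function name and every angle-bracket placeholder are assumptions for illustration; the sketch only prints text and does not query Azure.

```shell
#!/bin/sh
# Dry-run sketch: prints the read-back checks per environment.
# print_env_checks and all <...> placeholders are hypothetical.

# Print the two read-back checks for one environment's principal.
# $1 = environment, $2 = sp object id, $3 = Foundry scope, $4 = AI Services account scope
print_env_checks() {
  printf '# %s\n' "$1"
  printf 'az role assignment list --assignee %s --scope %s --include-inherited\n' "$2" "$3"
  printf 'az role assignment list --assignee %s --scope %s --include-inherited\n' "$2" "$4"
}

for env in qa production; do
  print_env_checks "$env" "<${env}-sp-object-id>" "<${env}-foundry-scope>" "<${env}-ai-services-account-scope>"
done
```

Each environment gets two checks because the two roles live on different scopes: the Foundry project (or resource) for Foundry User, and the AI Services account for Cognitive Services OpenAI User.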

docs/tutorial-prompt-agent-quickstart.md

Lines changed: 19 additions & 6 deletions
@@ -569,9 +569,12 @@ This may be a brand-new folder with no Git repo or GitHub remote yet.
 Keep the scope to the PR gate and dev deploy only: create or connect the
 GitHub repo if needed, wire Azure OIDC and required Actions
 variables/secrets, create only the `dev` environment, verify the OIDC
-principal has Foundry User access on the **dev** Foundry project, and
-do not set up `qa`, `production`, scheduled Doctor, or hosted
-deployment workflows yet.
+principal has **both** Foundry User access on the **dev** Foundry project
+**and** Cognitive Services OpenAI User on the underlying Azure AI Services
+account that hosts the evaluator model (both roles are required — without
+the OpenAI User role, the Foundry cloud graders fail with a 401 and every
+metric comes back null), and do not set up `qa`, `production`, scheduled
+Doctor, or hosted deployment workflows yet.

 The dev Foundry project endpoint is in `.azure/dev/.env`; the sandbox
 endpoint is local-only and must not be added to CI.
@@ -589,9 +592,19 @@ it skips:
 - Set Actions variables `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID`,
 `AZURE_CLIENT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT` (the dev
 endpoint), and `APPLICATIONINSIGHTS_CONNECTION_STRING` if available.
-- Verify the OIDC principal has **Foundry User** access on the dev
-Foundry project. Reader alone is not enough for the data-plane calls
-the prompt-agent staging and eval steps make.
+- Verify the OIDC principal has **two** Azure RBAC roles before the first
+run. Both are required and the eval step fails silently (every metric
+returns `null`) if only one is in place:
+- **Foundry User** on the dev Foundry project — Reader alone is not
+enough for the data-plane calls the prompt-agent staging and eval steps
+make.
+- **Cognitive Services OpenAI User** on the underlying Azure AI Services
+account that hosts the evaluator model deployment. Foundry
+`azure_ai_evaluator` graders impersonate the OIDC principal to call
+OpenAI; without this role they fail with a 401 `PermissionDenied`. The
+AgentOps cloud-results parser lifts that error into `results.json` so
+you can see the cause in the artifact, but the workflow still fails
+the gate.

 ## 13. First green PR → merge → dev deploy
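
Reviewer sketch for the "verify two roles" bullet above: one way to turn the read-back into a pass/fail check. It assumes the role definition ids arrive on stdin, one per line, for example from `az role assignment list ... --query "[].roleDefinitionId" -o tsv`, and that each id ends in the role GUID. The helper names `has_role` and `report_role` are hypothetical, and the demo input uses an all-zero placeholder subscription id rather than real data.

```shell
#!/bin/sh
# Sketch: check that a required role id appears in a list of role definition ids.
# has_role and report_role are hypothetical helper names.

FOUNDRY_USER_ROLE_ID="53ca6127-db72-4b80-b1b0-d745d6d5456d"   # Foundry User
OPENAI_USER_ROLE_ID="5e0bd9bd-7b93-4f28-af87-19fc36ad61bd"    # Cognitive Services OpenAI User

# Reads role definition ids on stdin (one per line); succeeds when any line
# ends with "/<role id>". $1 = role id (GUID)
has_role() {
  grep -qi -- "/$1\$"
}

# Prints a one-line status and returns non-zero when the role is missing.
# $1 = human-readable label, $2 = role id; role definition ids on stdin
report_role() {
  if has_role "$2"; then
    printf 'ok: %s\n' "$1"
  else
    printf 'MISSING: %s\n' "$1"
    return 1
  fi
}

# Demo with placeholder input (not real Azure output).
printf '%s\n' "/subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Authorization/roleDefinitions/$FOUNDRY_USER_ROLE_ID" \
  | report_role "Foundry User" "$FOUNDRY_USER_ROLE_ID"
```

In practice the check would run twice, once against the Foundry scope for Foundry User and once against the AI Services account scope for Cognitive Services OpenAI User, since the two roles are assigned at different scopes.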

plugins/agentops/skills/agentops-workflow/SKILL.md

Lines changed: 49 additions & 17 deletions
@@ -100,22 +100,40 @@ by discovering the whole Azure subscription.
 `repo:<owner>/<repo>:environment:dev`. Do not assume branch or
 `pull_request` subjects without reading the workflow.
 9. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app /
-service principal has Foundry data-plane access. It needs **Foundry User**
-(role id `53ca6127-db72-4b80-b1b0-d745d6d5456d`, formerly Azure AI User) at
-the Foundry project scope, or at the Foundry resource scope if that is the
-team's standard. Azure **Reader** is not enough; without this role the eval
-step fails on
-`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
-10. If the Foundry RBAC assignment is missing, do not run the workflow yet.
-Show the exact GitHub OIDC client ID / service principal, desired role, and
-target Foundry scope, then ask the user to approve the role assignment or
+service principal has **two** RBAC assignments. Both are required; the eval
+step fails silently (every metric returns `null`) if only one is in place.
+1. **Foundry User** on the Foundry project (or the Foundry resource scope
+if that is the team's standard). Role id
+`53ca6127-db72-4b80-b1b0-d745d6d5456d` (formerly Azure AI User). Without
+this the candidate-staging step fails on
+`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
+account that hosts the evaluator model deployment
+(typically the parent account of the Foundry project). Role id
+`5e0bd9bd-7b93-4f28-af87-19fc36ad61bd`. Without this the Foundry
+`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on
+`Microsoft.CognitiveServices/accounts/OpenAI/deployments/chat/completions/action`
+and every metric comes back `null` in the cloud eval report. AgentOps now
+lifts that error into `results.json` and the orchestrator's "0 usable
+metric scores" warning so the cause is visible in CI logs, but the
+workflow still fails the gate. Grant this role **before** the first run.
+Azure **Reader** is not enough for either step.
+10. If either RBAC assignment is missing, do not run the workflow yet.
+Show the exact GitHub OIDC client ID / service principal, desired role,
+target scope (project for Foundry User, AI Services account for Cognitive
+Services OpenAI User), then ask the user to approve the role assignment or
 get an Azure/Foundry admin to grant it. After assignment, read it back or ask
 the user to confirm before dispatching the workflow.
-When the user approves and you know the Foundry scope, use the role id to
-avoid rename drift:
+When the user approves and you know the scopes, use the role ids to avoid
+rename drift:
 - `az ad sp show --id <AZURE_CLIENT_ID> --query id -o tsv`
 - `az role assignment list --assignee <sp-object-id> --scope <foundry-scope> --include-inherited`
 - `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 53ca6127-db72-4b80-b1b0-d745d6d5456d --scope <foundry-scope>`
+- `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 5e0bd9bd-7b93-4f28-af87-19fc36ad61bd --scope <ai-services-account-scope>`
+The AI Services account scope looks like
+`/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<ai-account-name>`
+and can be derived from
+`az cognitiveservices account list --resource-group <foundry-project-rg> --query "[?kind=='AIServices'].id" -o tsv`.
 11. Ask before creating or updating GitHub repos, GitHub environments,
 variables/secrets, Entra app registrations/service principals, federated
 credentials, managed identities, or Azure RBAC assignments.
@@ -304,11 +322,21 @@ Then configure Workload Identity Federation on the Azure side
 environment** the workflows will run from. See
 `docs/ci-github-actions.md` for the exact `az` commands.

-Also grant the same app registration / service principal **Foundry User** on the
-Foundry project or Foundry resource before the first workflow run. The PR gate
-uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only proves
-ARM access and will still fail the eval step with
-`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+Also grant the same app registration / service principal **two** Azure
+RBAC roles before the first workflow run; both are required and the eval
+step fails silently (every metric returns `null`) if only one is in place:
+
+1. **Foundry User** on the Foundry project or Foundry resource. The PR gate
+uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only
+proves ARM access and will still fail the eval step with
+`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
+account that hosts the evaluator model deployment. Without this, Foundry
+`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on the
+OpenAI `chat/completions/action` data action and every metric returns
+`null` in the cloud eval report. AgentOps surfaces that error in
+`results.json` and the orchestrator's "0 usable metric scores" warning,
+but the workflow still fails the gate — fix the role before the run.

 Tell the user that CI evals emit `agentops.eval.*` telemetry and scheduled
 Doctor runs emit `agentops.agent.finding.*` telemetry when App Insights is
@@ -319,7 +347,11 @@ Monitor deep links.

 Already done in Step 2 - the `agentops-azure` service connection
 handles auth. Make sure the underlying service principal or managed
-identity has the **Foundry User** role on the Foundry project or resource.
+identity has **both** the **Foundry User** role on the Foundry project (or
+Foundry resource) **and** the **Cognitive Services OpenAI User** role on the
+underlying Azure AI Services account that hosts the evaluator model. Both
+are required; without the OpenAI User role the Foundry graders fail with a
+401 `PermissionDenied` and every cloud eval metric returns `null`.

 ## Step 4 - Use azd for deployment
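
Reviewer sketch for the scope guidance in step 10 above: two small string helpers for the `<ai-services-account-scope>` placeholder. The first assembles the scope from the pattern the skill quotes. The second rests on an assumption that is not stated in this commit: that a Foundry project resource id is its parent account id followed by `/projects/<name>`, which fits the skill's "typically the parent account" wording but should be confirmed against the `az cognitiveservices account list` output. Both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: derive the AI Services account scope as a plain string.
# ai_account_scope and parent_account_scope are hypothetical helper names.

# Assemble the scope from its parts, following the pattern quoted in the skill.
# $1 = subscription id, $2 = resource group, $3 = AI Services account name
ai_account_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.CognitiveServices/accounts/%s\n' "$1" "$2" "$3"
}

# ASSUMPTION: a Foundry project id is "<account id>/projects/<name>".
# Strips that suffix; ids without it are returned unchanged.
# $1 = Foundry project (or account) resource id
parent_account_scope() {
  printf '%s\n' "${1%/projects/*}"
}

ai_account_scope "<sub>" "<rg>" "<ai-account-name>"
parent_account_scope "$(ai_account_scope "<sub>" "<rg>" "<ai-account-name>")/projects/<project-name>"
```

Both calls print the same account scope. If the evaluator model deployment lives on a different account than the project's parent, the derived scope is wrong and the role must be granted on the hosting account instead.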

src/agentops/templates/skills/agentops-workflow/SKILL.md

Lines changed: 49 additions & 17 deletions
@@ -100,22 +100,40 @@ by discovering the whole Azure subscription.
 `repo:<owner>/<repo>:environment:dev`. Do not assume branch or
 `pull_request` subjects without reading the workflow.
 9. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app /
-service principal has Foundry data-plane access. It needs **Foundry User**
-(role id `53ca6127-db72-4b80-b1b0-d745d6d5456d`, formerly Azure AI User) at
-the Foundry project scope, or at the Foundry resource scope if that is the
-team's standard. Azure **Reader** is not enough; without this role the eval
-step fails on
-`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
-10. If the Foundry RBAC assignment is missing, do not run the workflow yet.
-Show the exact GitHub OIDC client ID / service principal, desired role, and
-target Foundry scope, then ask the user to approve the role assignment or
+service principal has **two** RBAC assignments. Both are required; the eval
+step fails silently (every metric returns `null`) if only one is in place.
+1. **Foundry User** on the Foundry project (or the Foundry resource scope
+if that is the team's standard). Role id
+`53ca6127-db72-4b80-b1b0-d745d6d5456d` (formerly Azure AI User). Without
+this the candidate-staging step fails on
+`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
+account that hosts the evaluator model deployment
+(typically the parent account of the Foundry project). Role id
+`5e0bd9bd-7b93-4f28-af87-19fc36ad61bd`. Without this the Foundry
+`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on
+`Microsoft.CognitiveServices/accounts/OpenAI/deployments/chat/completions/action`
+and every metric comes back `null` in the cloud eval report. AgentOps now
+lifts that error into `results.json` and the orchestrator's "0 usable
+metric scores" warning so the cause is visible in CI logs, but the
+workflow still fails the gate. Grant this role **before** the first run.
+Azure **Reader** is not enough for either step.
+10. If either RBAC assignment is missing, do not run the workflow yet.
+Show the exact GitHub OIDC client ID / service principal, desired role,
+target scope (project for Foundry User, AI Services account for Cognitive
+Services OpenAI User), then ask the user to approve the role assignment or
 get an Azure/Foundry admin to grant it. After assignment, read it back or ask
 the user to confirm before dispatching the workflow.
-When the user approves and you know the Foundry scope, use the role id to
-avoid rename drift:
+When the user approves and you know the scopes, use the role ids to avoid
+rename drift:
 - `az ad sp show --id <AZURE_CLIENT_ID> --query id -o tsv`
 - `az role assignment list --assignee <sp-object-id> --scope <foundry-scope> --include-inherited`
 - `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 53ca6127-db72-4b80-b1b0-d745d6d5456d --scope <foundry-scope>`
+- `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 5e0bd9bd-7b93-4f28-af87-19fc36ad61bd --scope <ai-services-account-scope>`
+The AI Services account scope looks like
+`/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<ai-account-name>`
+and can be derived from
+`az cognitiveservices account list --resource-group <foundry-project-rg> --query "[?kind=='AIServices'].id" -o tsv`.
 11. Ask before creating or updating GitHub repos, GitHub environments,
 variables/secrets, Entra app registrations/service principals, federated
 credentials, managed identities, or Azure RBAC assignments.
@@ -304,11 +322,21 @@ Then configure Workload Identity Federation on the Azure side
 environment** the workflows will run from. See
 `docs/ci-github-actions.md` for the exact `az` commands.

-Also grant the same app registration / service principal **Foundry User** on the
-Foundry project or Foundry resource before the first workflow run. The PR gate
-uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only proves
-ARM access and will still fail the eval step with
-`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+Also grant the same app registration / service principal **two** Azure
+RBAC roles before the first workflow run; both are required and the eval
+step fails silently (every metric returns `null`) if only one is in place:
+
+1. **Foundry User** on the Foundry project or Foundry resource. The PR gate
+uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only
+proves ARM access and will still fail the eval step with
+`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
+2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
+account that hosts the evaluator model deployment. Without this, Foundry
+`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on the
+OpenAI `chat/completions/action` data action and every metric returns
+`null` in the cloud eval report. AgentOps surfaces that error in
+`results.json` and the orchestrator's "0 usable metric scores" warning,
+but the workflow still fails the gate — fix the role before the run.

 Tell the user that CI evals emit `agentops.eval.*` telemetry and scheduled
 Doctor runs emit `agentops.agent.finding.*` telemetry when App Insights is
@@ -319,7 +347,11 @@ Monitor deep links.

 Already done in Step 2 - the `agentops-azure` service connection
 handles auth. Make sure the underlying service principal or managed
-identity has the **Foundry User** role on the Foundry project or resource.
+identity has **both** the **Foundry User** role on the Foundry project (or
+Foundry resource) **and** the **Cognitive Services OpenAI User** role on the
+underlying Azure AI Services account that hosts the evaluator model. Both
+are required; without the OpenAI User role the Foundry graders fail with a
+401 `PermissionDenied` and every cloud eval metric returns `null`.

 ## Step 4 - Use azd for deployment
