Skip to content

ARM API Reviewer agent: detect and flag denylist (negated character class) regex patterns in pattern constraints #43384

@ravimeda

Description

@ravimeda

Background

PR #43353 revealed that the ARM API Reviewer agent has no rule covering denylist regex patterns in pattern constraints. The PR changes the Azure Policy resource name pattern from ^[^<>*%&:\\?.+/]*[^<>*%&:\\?.+/ ]+$ to ^[^<>%&:\\?/]*[^<>%&:\\?/ ]+$, which replaces one denylist with a slightly more permissive denylist.

A denylist pattern uses negated character classes ([^...]) to specify characters that are not allowed, rather than specifying characters that are allowed. Since JSON is UTF-8 and the regex engine operates on Unicode, a denylist will match any Unicode character not explicitly listed in the exclusion set. This means strings containing emoji, control characters, null bytes, newlines, and non-Latin scripts will all pass validation, even though the service almost certainly rejects them.

A scan of the azure-rest-api-specs repo found 26+ services using denylist patterns.

Concrete example of the problem:

Input Denylist ^[^<>%&:\\?/]+$ Allowlist ^[A-Za-z0-9_.-]+$
my-policy_v1 Matches Matches
my🎉policy (emoji) Matches Does not match
policy\x00name (null byte) Matches Does not match
policy\tname (tab) Matches Does not match
політика (Cyrillic) Matches Does not match

The existing instruction files recommend allowlist-style patterns by example, but they never state that denylist patterns are prohibited. The agent has no rule ID for this, no detection guidance, and no severity policy that distinguishes new properties from existing ones.

Severity policy

The severity should differ based on whether the property or parameter existed in the previous API version:

  • New property/parameter (not present in the previous API version) with a denylist pattern: Blocking. There is no backward-compatibility concern, so the spec must use an allowlist from the start.
  • Existing property/parameter (carried forward from a previous version) with a denylist pattern: Warning. Switching from a denylist to an allowlist could be a regression risk if existing resources were created with Unicode characters that the denylist permitted. The service team should be advised to migrate to an allowlist, but the PR should not be blocked on this change alone.

Current coverage (gap analysis)

The following instruction and skill files were audited. None contain an explicit prohibition of denylist patterns:

Partial coverage (recommends allowlist by example, but does not prohibit denylist):

No coverage:

  • No rule ID exists for this (e.g., OAPI-PATTERN-ALLOWLIST or equivalent).
  • No reference file exists under .github/skills/azure-api-review/references/ for pattern validation.
  • Review checklists verify that a pattern exists but do not verify that it uses allowlist style.
  • Scope is limited to path parameters. Denylist patterns on body properties, query parameters, and TypeSpec NamePattern values are not addressed.
  • No detection guidance for the agent (how to identify [^...] as the primary matching construct).
  • No new-vs-existing severity differentiation.

References

Requested work

  1. Create a reference file at .github/skills/azure-api-review/references/pattern-validation.md covering:
    • Rule ID: OAPI-PATTERN-ALLOWLIST (or equivalent).
    • Definition of a denylist pattern: a pattern constraint where the primary character-matching construct is a negated character class ([^...]). A negative lookahead ((?!...)) used alongside a positive character class is acceptable (e.g., (?![.-])[A-Za-z0-9_.-]+).
    • Why denylist patterns are problematic: Unicode/UTF-8 allows unintended characters through.
    • Scope: all pattern constraints, including path parameters, query parameters, body string properties, TypeSpec @pattern decorators, and NamePattern in TypeSpec ARM resource definitions.
    • Severity matrix: Blocking for new properties/parameters; Warning for existing ones.
    • Fix guidance: replace the denylist with an allowlist derived from the service's actual server-side validation logic. If the exact allowlist is unknown, the service team must determine it from their validation code.
    • Regression risk note for existing properties: switching to an allowlist may reject values that existing resources were created with.
    • Examples: good patterns vs. bad patterns with explanations.
  2. Update openapi-review.instructions.md Section 4 (after line 155):
    • Add a rule stating that pattern constraints MUST use allowlist (positive character class) syntax and MUST NOT use denylist (negated character class) syntax. Reference the new pattern-validation.md file.
    • State the severity: Blocking for new properties/parameters, Warning for existing ones.
    • Add a checklist item to the Review Checklist Summary (around line 638).
  3. Update typespec-review.instructions.md Section 2.2 (after line 161):
    • Add equivalent guidance for @pattern decorator values.
    • Add a checklist item to the TypeSpec Review Checklist Summary.
  4. Update armapi-review.instructions.md Section 21.4 CNA-004:
    • Strengthen the existing guidance to explicitly reject denylist patterns in the name field.
    • Add a checklist item to the ARM Review Checklist Summary.
  5. Update .github/skills/azure-api-review/SKILL.md:
    • Add the new reference to the reference table.
  6. Update linter-rule-coverage.md:
    • Note that no automated linter rule currently exists for this check. Consider filing a follow-up issue for a linter rule in @azure-tools/typespec-azure-core or the OpenAPI validator packages.
  7. Add eval fixtures under .github/skills/evals/arm-api-reviewer/fixtures/:
    • An OpenAPI JSON fixture with a seeded denylist pattern on a path parameter (should flag as blocking) and a positive control with a correct allowlist pattern (should not flag).
    • A TypeSpec fixture with a seeded denylist @pattern or NamePattern (should flag) and a positive control (should not flag).
    • For each fixture, include both a "new property" variant (blocking) and an "existing property" variant (warning) to validate the severity differentiation.
  8. Add stimuli to the eval YAML configuration for the new fixtures, following the pattern established in existing eval entries.
  9. Update bookkeeping in .vally.yaml, README.md, fixtures/README.md, and run-evals.ps1 to reflect the new fixture and stimulus counts.
  10. Run the full eval suite locally and confirm that all new stimuli pass before opening the PR.

Acceptance criteria

  • A new reference file pattern-validation.md exists under .github/skills/azure-api-review/references/ with the full rule definition, severity matrix, detection guidance, and examples.
  • The instruction files (openapi-review.instructions.md, typespec-review.instructions.md, armapi-review.instructions.md) each contain an explicit rule prohibiting denylist patterns, with a cross-reference to the new reference file.
  • Each file's review checklist summary includes a checklist item for pattern style validation.
  • The agent, when reviewing a PR that introduces a new path parameter or property with a denylist pattern (e.g., [^<>%&]), produces a Blocking finding citing OAPI-PATTERN-ALLOWLIST.
  • The agent, when reviewing a PR that carries forward an existing denylist pattern from a prior API version, produces a Warning finding (not Blocking).
  • The agent does not flag allowlist patterns that include a negative lookahead (e.g., (?![.-])[A-Za-z0-9_.-]+).
  • The eval suite contains at least one stimulus per fixture, and all new stimuli pass locally.
  • The PR does not modify any service specification/** files, language emitters, or eng/ content. Scope stays within agent instructions, skill references, eval fixtures, and eval configuration.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No fields configured for Task.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions