Skip to content

feat: add Jira ticket-context support (provider-agnostic)#2

Draft
dellch wants to merge 9 commits into
mainfrom
feat/jira-ticket-context
Draft

feat: add Jira ticket-context support (provider-agnostic)#2
dellch wants to merge 9 commits into
mainfrom
feat/jira-ticket-context

Conversation

@dellch

@dellch dellch commented Jun 2, 2026

Copy link
Copy Markdown
Owner

Draft on my fork for self-review before raising upstream (relates to The-PR-Agent#2162). Not yet a PR against the upstream repo.

Summary

Adds Jira ticket-context support to the review tool's ticket-analysis flow (require_ticket_analysis_review). Previously only GitHub issues and Azure Boards work items were fetched; teams tracking work in Jira got no ticket context.

The lookup is provider-agnostic — it only uses get_user_description() and get_pr_branch() (plus the PR title), which every git provider implements — so Jira context works on GitHub, GitLab, Bitbucket, Azure DevOps, etc. It is a no-op when [jira] is not configured.

What it does

  • find_jira_tickets(): extract Jira keys from PR title, description and branch name (case-insensitive, normalized to upper case so bugfix/abc-123-x is detected).
  • extract_jira_tickets() / _get_jira_client(): fetch each key via the atlassian-python-api Jira client (already a dependency; used by the Bitbucket providers). Supports Jira Cloud (email + token) and Server/DC (PAT).
  • add_jira_tickets(): provider-agnostic step in extract_tickets() that appends Jira tickets to tickets_content, de-duplicated by URL.
  • jira_requirements_field: maps an instance-specific custom field (e.g. customfield_10127) to the ticket "requirements" section (acceptance criteria).
  • MAX_TICKETS / MAX_TICKET_CHARACTERS module constants, reused across the file.- Unit tests, plus a docs update.

Verified end-to-end against a live Jira Cloud instance (v2 API returns the fields asplain strings — no ADF parsing needed; cap + skip-on-404 matches the existing GitHubextraction behavior).

Open questions (for the maintainer discussion)

I am leaving these for maintainer input rather than choosing the API shape in this PR:

  1. Config key naming. This PR follows the key names the docs already publish ([jira] jira_api_token, jira_api_email, jira_base_url). Because they sit under the [jira] table, settings resolve to JIRA.JIRA_API_TOKEN and the env override is jira__jira_api_token — the repeated jira_ is redundant. Options: keep as-is for compatibility with the currently documented keys, or drop the prefix (api_token, etc., like [bitbucket_server]) for clarity at the cost of changing the documented contract.

  2. A dedicated issue-provider config section. Rather than a [jira] section, would you prefer something like [related_issues] with issue_provider = "jira"? That gives a natural home for future, provider-neutral options (see below) instead of accreting keys under [jira]. This would change the configuration shape, so worth deciding before it ships.

  3. Scope of this PR vs. follow-ups. This PR is intentionally minimal (fetch + map + tests). Should any of the following be in this PR, or separate follow-ups?

    • jira_project_keysJira-specific: restrict key matching to configured project prefixes (e.g. DVT, ABC), reducing false positives like utf-8 / sha-1. Doesn't apply to GitHub (numeric #123 refs) or Azure DevOps (native linked work items, no text extraction), so it's named to make the binding explicit. (This differs from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162, where valid_project_keys filters JQL search results; here it would filter keys parsed from PR text before fetch.)
    • max_tickets_to_fetch and similar tunables (currently the fixed MAX_TICKETS = 3, matching the existing GitHub path).
    • The broader Issue / IssueProvider abstraction from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162 — left out here as it may be premature abstraction with only one consumer.
  4. False-positive handling. Key extraction is heuristic; non-existent keys 404 and are skipped, and candidates are capped at 3 — the same trade-off the existing GitHub #123 extraction already makes. Is the current fetch-and-skip behavior acceptable for the first PR, or should strict project-key filtering (option 3's jira_project_keys) be required before merge?

Out of scope

/similar_issue, embedding client, vector-DB integration, and the issue-provider abstraction layer (all part of The-PR-Agent#2162) are intentionally excluded.

Test plan

  • pytest tests/unittest/test_jira_ticket_extraction.py (25 tests)
  • No regressions in test_extract_issue_from_branch.py
  • Live validation against a Jira Cloud instance

🤖 Generated with Claude Code

@dellch dellch force-pushed the feat/jira-ticket-context branch from afa4a5c to f50891d Compare June 2, 2026 19:43
Comment thread pr_agent/tools/ticket_pr_compliance_check.py Outdated
Comment thread docs/docs/core-abilities/fetching_ticket_context.md
@dellch

dellch commented Jun 2, 2026

Copy link
Copy Markdown
Owner Author

PR description wording suggestions for the upstream version:

  • The Open Questions are clear. I would keep them, but change “flagging rather than deciding” to “I am leaving these for maintainer input rather than choosing the API shape in this PR.” It is a little more direct.
  • In question 1, “doc-compatibility” could read as only documentation compatibility. Consider “compatibility with the currently documented keys” instead.
  • In question 3, the valid_project_keys note is useful but dense. Suggested wording: “This differs from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162, where valid_project_keys filters JQL search results; here it would filter keys parsed from PR text before fetch.”
  • In question 4, consider naming the default trade-off explicitly: “Is the current fetch-and-skip behavior acceptable for the first PR, or should strict project-key filtering be required before merge?”

@dellch dellch force-pushed the feat/jira-ticket-context branch from f50891d to 1c0adaf Compare June 2, 2026 19:54
Comment thread pr_agent/settings/configuration.toml Outdated
Comment thread docs/docs/core-abilities/fetching_ticket_context.md
Comment thread docs/docs/core-abilities/fetching_ticket_context.md Outdated
@dellch dellch force-pushed the feat/jira-ticket-context branch 2 times, most recently from 27a688e to f25f0d2 Compare June 2, 2026 20:02

@dellch dellch left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at f25f0d2. Ran the new suite (25 passed) and test_extract_issue_from_branch.py (12 passed, no regressions).

The refactor is sound. I checked against main: the old GitHub path returned inside if tickets:, and it now falls through to a single return tickets_content at the end, so the GitHub/Azure output is preserved and the new Jira step runs for every provider as intended. The only behavioural change I found is GitHub-with-no-tickets now returns [] instead of None — both falsy at the call site, so no effect.

On the dependency: atlassian-python-api==3.41.4 is already pinned in requirements.txt (the Bitbucket providers import from it), so the new top-level from atlassian import Jira adds nothing new and is safe in this core module.

Three small inline comments below; none touch the open questions. The one worth acting on is the uncapped requirements field.

Open questions - wording feedback

The framing (flag, don't decide) is good and Q2/Q4 read clearly. Two suggestions:

  • Q3, the jira_project_keys bullet is the densest of the set. "...so it's named to make the binding explicit" is vague - "the binding" is never defined. Suggested rewrite: "...so the name jira_project_keys makes explicit that it only governs Jira key extraction, not the GitHub or Azure DevOps paths." Also consider splitting out the #2162 valid_project_keys parenthetical into its own sentence: the bullet currently packs three ideas (what it does, why it is Jira-only, how it differs from The-PR-Agent#2162) into one.
  • Q1 is accurate but long. Optional tighten: lead with the decision - "Keep the documented jira_* keys, or drop the redundant prefix (api_token, etc.)?" - and keep the JIRA.JIRA_API_TOKEN / jira__jira_api_token detail as the supporting sentence.

I deliberately left the open-question topics alone in the inline comments: config-key naming (Q1), the [jira] vs [related_issues] section (Q2), the MAX_TICKETS / jira_project_keys tunables (Q3), and false-positive matching (Q4).

Comment thread pr_agent/tools/ticket_pr_compliance_check.py

fields = issue.get("fields", {}) or {}
body = fields.get("description") or ""
if not isinstance(body, str):

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor / forward-looking. On Jira Cloud the REST v3 API returns description (and rich-text custom fields) as ADF JSON - a dict, not a string. _get_jira_client() doesn't pass api_version, so atlassian-python-api defaults to v2 and you get plain strings today, which is why this works (and matches what the PR description verified).

If anyone later switches to v3, this isinstance guard silently drops the body to "". A one-line code comment noting the v2 assumption would stop a future reader from reading the empty body as a bug.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b7e528a. Added a code comment above the description lookup noting the client defaults to REST v2 (plain strings) and that v3 would return ADF JSON dicts the isinstance guards would turn into empty strings — so a future reader sees the assumption rather than reading the empty body as a bug.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followed up in 9c22eea: rather than rely on the library default + a comment, _get_jira_client() now passes api_version="2" explicitly for both auth modes, with tests asserting it. Confirmed atlassian-python-api currently hardcodes api_version="2" when unset, so this just pins the contract so a future dependency bump can't silently switch us to v3 (ADF dicts) and empty the bodies.

Comment thread tests/unittest/test_jira_ticket_extraction.py Outdated
@dellch dellch force-pushed the feat/jira-ticket-context branch from f25f0d2 to b7e528a Compare June 2, 2026 20:24
Comment thread pr_agent/tools/ticket_pr_compliance_check.py Outdated
@dellch dellch force-pushed the feat/jira-ticket-context branch from 9c22eea to ebc7cff Compare June 2, 2026 20:35
The review tool's ticket-analysis flow (require_ticket_analysis_review)
only fetched GitHub issues and linked Azure Boards work items. Teams that
track work in Jira got no ticket context. This adds provider-agnostic Jira
ticket lookup.

The lookup uses get_user_description() and get_pr_branch() plus the PR
title, which every git provider implements, so it works on GitHub, GitLab,
Bitbucket, and Azure DevOps. It is a no-op when [jira] is not configured.

- find_jira_tickets(): extract Jira keys from the PR title, description and
  branch name. Matching is case-insensitive and keys are normalized to upper
  case, so lowercased branch names (e.g. bugfix/abc-123-x) are detected.
  First-seen order is preserved so the MAX_TICKETS cap is deterministic.
- _get_jira_client() / extract_jira_tickets(): fetch each key via the
  atlassian-python-api Jira client (already a dependency, used by the
  Bitbucket providers). Supports Jira Cloud (email + token) and Server/Data
  Center (PAT). The REST API version is pinned to v2 (JIRA_API_VERSION):
  v2 returns description and custom fields as plain strings, while v3 returns
  ADF JSON dicts that would need separate parsing.
- add_jira_tickets(): provider-agnostic step in extract_tickets() that
  appends Jira tickets to tickets_content, de-duplicated by URL.
- jira_requirements_field maps an instance-specific custom field (e.g.
  customfield_10127) to the ticket "requirements" section. Both the body and
  the requirements field are truncated to MAX_TICKET_CHARACTERS.
- Add a [jira] config section and secrets-template entry using the key names
  from the existing docs. Credentials are read from settings/env, not
  committed.
- Add unit tests for key extraction, ticket fetching (auth modes, capping,
  skip-on-error, truncation), client construction, and the provider-agnostic
  append. An autouse fixture restores the JIRA.* settings between tests.
- Document Jira support in the fetching-ticket-context docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dellch dellch force-pushed the feat/jira-ticket-context branch from ebc7cff to 541c220 Compare June 2, 2026 20:39
dellch added 2 commits June 2, 2026 13:59
extract_tickets() capped provider-native tickets to MAX_TICKETS, then
add_jira_tickets() appended up to MAX_TICKETS more, so a PR could carry up
to 2 * MAX_TICKETS ticket bodies into the review prompt.

Treat MAX_TICKETS as the combined per-PR cap: provider-native tickets
already in tickets_content count against it, and Jira tickets are appended
only until the total reaches MAX_TICKETS, keeping existing tickets first.
Skip the Jira lookup entirely (no client init) when the cap is already met.

Co-Authored-By: Claude
extract_jira_tickets() built the Jira client before checking whether the
text contained any Jira keys. Since add_jira_tickets() runs for every
provider, a configured-but-unreferenced Jira would construct a client (and
log an init failure if misconfigured) on every keyless PR.

Extract keys first and return early when there are none, so the client is
only built when there is something to fetch.

Co-Authored-By: Claude
@dellch dellch force-pushed the feat/jira-ticket-context branch from d33bc57 to cb9ee43 Compare June 2, 2026 21:06
dellch and others added 6 commits June 2, 2026 14:08
_get_jira_client() returned None silently whenever base_url + api_token
were not both set. If a user set some [jira] values but not the required
pair, the misconfiguration was invisible and Jira lookup just never ran.

Log a warning naming the missing keys when Jira is partially configured,
while staying silent when nothing is set (Jira simply not in use).

Co-Authored-By: Claude
Hoist the base-url normalization into a local so the two Jira(...) calls
fit comfortably within the 120-char line limit, and drop the duplicated
.rstrip("/").

Co-Authored-By: Claude
The Azure DevOps branch of extract_tickets() truncated the work-item body
to MAX_TICKET_CHARACTERS but passed acceptance_criteria into "requirements"
unbounded, so a large field could inject an oversized blob into the review
prompt. This mirrors the truncation the Jira path in this PR already applies.

Cap requirements to MAX_TICKET_CHARACTERS and guard against non-string
values, matching the body handling.

Co-Authored-By: Claude
Building the Jira client from a free-form jira_base_url let repo-controlled
configuration (e.g. pyproject.toml [tool.pr-agent]) redirect the authenticated
request, and the API token with it, to an arbitrary host. PR-Agent merges repo
config into the same settings object that supplies the base URL, so an untrusted
PR could point Jira at an attacker server and exfiltrate the token.

Replace jira_base_url with jira_site (the "<site>" in https://<site>.atlassian.net)
and build the base URL from it. jira_site is validated as a single DNS label, so it
cannot contain the characters (. / : @ # ? etc.) needed to escape *.atlassian.net.
The destination is therefore fixed by construction regardless of what untrusted
config supplies.

This scopes the feature to Jira Cloud. Jira Server / Data Center, which needs a
free-form host, is not supported for now; it can return once base-URL trust is
handled (e.g. a secrets-only source). Cloud authenticates with email + API token,
so the partial-config warning now requires site + email + token.

- Add JIRA_SITE_PATTERN + _jira_cloud_base_url(); validate and build the URL.
- Cloud-only auth in _get_jira_client() (drop the Server/DC PAT branch).
- Update [jira] config + secrets template: jira_site instead of jira_base_url.
- Rewrite the docs Jira section Cloud-only, noting Server/DC is not yet supported
  and why.
- Add TestJiraSiteInjection covering breakout attempts (query/fragment/path/port/
  userinfo/encoded-dot/whitespace), asserting they are rejected and no client is built.

Co-Authored-By: Claude
Reframe the "why Cloud only" note around the design (URL derived from a
validated site name, always an Atlassian host) rather than detailing a
configuration-injection scenario, which overstated this feature's role.

Co-Authored-By: Claude
Four inline module imports (import pr_agent.tools.ticket_pr_compliance_check as m,
from ... import MAX_TICKETS) moved to the top-level import block.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant