feat: add Jira ticket-context support (provider-agnostic) by dellch · Pull Request #2 · dellch/pr-agent

dellch · 2026-06-02T19:41:21Z

Draft on my fork for self-review before raising upstream (relates to The-PR-Agent#2162). Not yet a PR against the upstream repo.

Summary

Adds Jira ticket-context support to the review tool's ticket-analysis flow (require_ticket_analysis_review). Previously only GitHub issues and Azure Boards work items were fetched; teams tracking work in Jira got no ticket context.

The lookup is provider-agnostic — it only uses get_user_description() and get_pr_branch() (plus the PR title), which every git provider implements — so Jira context works on GitHub, GitLab, Bitbucket, Azure DevOps, etc. It is a no-op when [jira] is not configured.

What it does

find_jira_tickets(): extract Jira keys from PR title, description and branch name (case-insensitive, normalized to upper case so bugfix/abc-123-x is detected).
extract_jira_tickets() / _get_jira_client(): fetch each key via the atlassian-python-api Jira client (already a dependency; used by the Bitbucket providers). Supports Jira Cloud (email + token) and Server/DC (PAT).
add_jira_tickets(): provider-agnostic step in extract_tickets() that appends Jira tickets to tickets_content, de-duplicated by URL.
jira_requirements_field: maps an instance-specific custom field (e.g. customfield_10127) to the ticket "requirements" section (acceptance criteria).
MAX_TICKETS / MAX_TICKET_CHARACTERS module constants, reused across the file.- Unit tests, plus a docs update.

Verified end-to-end against a live Jira Cloud instance (v2 API returns the fields asplain strings — no ADF parsing needed; cap + skip-on-404 matches the existing GitHubextraction behavior).

Open questions (for the maintainer discussion)

I am leaving these for maintainer input rather than choosing the API shape in this PR:

Config key naming. This PR follows the key names the docs already publish ([jira] jira_api_token, jira_api_email, jira_base_url). Because they sit under the [jira] table, settings resolve to JIRA.JIRA_API_TOKEN and the env override is jira__jira_api_token — the repeated jira_ is redundant. Options: keep as-is for compatibility with the currently documented keys, or drop the prefix (api_token, etc., like [bitbucket_server]) for clarity at the cost of changing the documented contract.
A dedicated issue-provider config section. Rather than a [jira] section, would you prefer something like [related_issues] with issue_provider = "jira"? That gives a natural home for future, provider-neutral options (see below) instead of accreting keys under [jira]. This would change the configuration shape, so worth deciding before it ships.
Scope of this PR vs. follow-ups. This PR is intentionally minimal (fetch + map + tests). Should any of the following be in this PR, or separate follow-ups?
- jira_project_keys — Jira-specific: restrict key matching to configured project prefixes (e.g. DVT, ABC), reducing false positives like utf-8 / sha-1. Doesn't apply to GitHub (numeric #123 refs) or Azure DevOps (native linked work items, no text extraction), so it's named to make the binding explicit. (This differs from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162, where valid_project_keys filters JQL search results; here it would filter keys parsed from PR text before fetch.)
- max_tickets_to_fetch and similar tunables (currently the fixed MAX_TICKETS = 3, matching the existing GitHub path).
- The broader Issue / IssueProvider abstraction from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162 — left out here as it may be premature abstraction with only one consumer.
False-positive handling. Key extraction is heuristic; non-existent keys 404 and are skipped, and candidates are capped at 3 — the same trade-off the existing GitHub #123 extraction already makes. Is the current fetch-and-skip behavior acceptable for the first PR, or should strict project-key filtering (option 3's jira_project_keys) be required before merge?

Out of scope

/similar_issue, embedding client, vector-DB integration, and the issue-provider abstraction layer (all part of The-PR-Agent#2162) are intentionally excluded.

Test plan

pytest tests/unittest/test_jira_ticket_extraction.py (25 tests)
No regressions in test_extract_issue_from_branch.py
Live validation against a Jira Cloud instance

🤖 Generated with Claude Code

dellch · 2026-06-02T19:48:06Z

PR description wording suggestions for the upstream version:

The Open Questions are clear. I would keep them, but change “flagging rather than deciding” to “I am leaving these for maintainer input rather than choosing the API shape in this PR.” It is a little more direct.
In question 1, “doc-compatibility” could read as only documentation compatibility. Consider “compatibility with the currently documented keys” instead.
In question 3, the valid_project_keys note is useful but dense. Suggested wording: “This differs from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162, where valid_project_keys filters JQL search results; here it would filter keys parsed from PR text before fetch.”
In question 4, consider naming the default trade-off explicitly: “Is the current fetch-and-skip behavior acceptable for the first PR, or should strict project-key filtering be required before merge?”

dellch

Reviewed at f25f0d2. Ran the new suite (25 passed) and test_extract_issue_from_branch.py (12 passed, no regressions).

The refactor is sound. I checked against main: the old GitHub path returned inside if tickets:, and it now falls through to a single return tickets_content at the end, so the GitHub/Azure output is preserved and the new Jira step runs for every provider as intended. The only behavioural change I found is GitHub-with-no-tickets now returns [] instead of None — both falsy at the call site, so no effect.

On the dependency: atlassian-python-api==3.41.4 is already pinned in requirements.txt (the Bitbucket providers import from it), so the new top-level from atlassian import Jira adds nothing new and is safe in this core module.

Three small inline comments below; none touch the open questions. The one worth acting on is the uncapped requirements field.

Open questions - wording feedback

The framing (flag, don't decide) is good and Q2/Q4 read clearly. Two suggestions:

Q3, the jira_project_keys bullet is the densest of the set. "...so it's named to make the binding explicit" is vague - "the binding" is never defined. Suggested rewrite: "...so the name jira_project_keys makes explicit that it only governs Jira key extraction, not the GitHub or Azure DevOps paths." Also consider splitting out the #2162 valid_project_keys parenthetical into its own sentence: the bullet currently packs three ideas (what it does, why it is Jira-only, how it differs from The-PR-Agent#2162) into one.
Q1 is accurate but long. Optional tighten: lead with the decision - "Keep the documented jira_* keys, or drop the redundant prefix (api_token, etc.)?" - and keep the JIRA.JIRA_API_TOKEN / jira__jira_api_token detail as the supporting sentence.

I deliberately left the open-question topics alone in the inline comments: config-key naming (Q1), the [jira] vs [related_issues] section (Q2), the MAX_TICKETS / jira_project_keys tunables (Q3), and false-positive matching (Q4).

dellch · 2026-06-02T20:20:47Z

+
+        fields = issue.get("fields", {}) or {}
+        body = fields.get("description") or ""
+        if not isinstance(body, str):


Minor / forward-looking. On Jira Cloud the REST v3 API returns description (and rich-text custom fields) as ADF JSON - a dict, not a string. _get_jira_client() doesn't pass api_version, so atlassian-python-api defaults to v2 and you get plain strings today, which is why this works (and matches what the PR description verified).

If anyone later switches to v3, this isinstance guard silently drops the body to "". A one-line code comment noting the v2 assumption would stop a future reader from reading the empty body as a bug.

Fixed in b7e528a. Added a code comment above the description lookup noting the client defaults to REST v2 (plain strings) and that v3 would return ADF JSON dicts the isinstance guards would turn into empty strings — so a future reader sees the assumption rather than reading the empty body as a bug.

Followed up in 9c22eea: rather than rely on the library default + a comment, _get_jira_client() now passes api_version="2" explicitly for both auth modes, with tests asserting it. Confirmed atlassian-python-api currently hardcodes api_version="2" when unset, so this just pins the contract so a future dependency bump can't silently switch us to v3 (ADF dicts) and empty the bodies.

The review tool's ticket-analysis flow (require_ticket_analysis_review) only fetched GitHub issues and linked Azure Boards work items. Teams that track work in Jira got no ticket context. This adds provider-agnostic Jira ticket lookup. The lookup uses get_user_description() and get_pr_branch() plus the PR title, which every git provider implements, so it works on GitHub, GitLab, Bitbucket, and Azure DevOps. It is a no-op when [jira] is not configured. - find_jira_tickets(): extract Jira keys from the PR title, description and branch name. Matching is case-insensitive and keys are normalized to upper case, so lowercased branch names (e.g. bugfix/abc-123-x) are detected. First-seen order is preserved so the MAX_TICKETS cap is deterministic. - _get_jira_client() / extract_jira_tickets(): fetch each key via the atlassian-python-api Jira client (already a dependency, used by the Bitbucket providers). Supports Jira Cloud (email + token) and Server/Data Center (PAT). The REST API version is pinned to v2 (JIRA_API_VERSION): v2 returns description and custom fields as plain strings, while v3 returns ADF JSON dicts that would need separate parsing. - add_jira_tickets(): provider-agnostic step in extract_tickets() that appends Jira tickets to tickets_content, de-duplicated by URL. - jira_requirements_field maps an instance-specific custom field (e.g. customfield_10127) to the ticket "requirements" section. Both the body and the requirements field are truncated to MAX_TICKET_CHARACTERS. - Add a [jira] config section and secrets-template entry using the key names from the existing docs. Credentials are read from settings/env, not committed. - Add unit tests for key extraction, ticket fetching (auth modes, capping, skip-on-error, truncation), client construction, and the provider-agnostic append. An autouse fixture restores the JIRA.* settings between tests. - Document Jira support in the fetching-ticket-context docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

extract_tickets() capped provider-native tickets to MAX_TICKETS, then add_jira_tickets() appended up to MAX_TICKETS more, so a PR could carry up to 2 * MAX_TICKETS ticket bodies into the review prompt. Treat MAX_TICKETS as the combined per-PR cap: provider-native tickets already in tickets_content count against it, and Jira tickets are appended only until the total reaches MAX_TICKETS, keeping existing tickets first. Skip the Jira lookup entirely (no client init) when the cap is already met. Co-Authored-By: Claude

extract_jira_tickets() built the Jira client before checking whether the text contained any Jira keys. Since add_jira_tickets() runs for every provider, a configured-but-unreferenced Jira would construct a client (and log an init failure if misconfigured) on every keyless PR. Extract keys first and return early when there are none, so the client is only built when there is something to fetch. Co-Authored-By: Claude

_get_jira_client() returned None silently whenever base_url + api_token were not both set. If a user set some [jira] values but not the required pair, the misconfiguration was invisible and Jira lookup just never ran. Log a warning naming the missing keys when Jira is partially configured, while staying silent when nothing is set (Jira simply not in use). Co-Authored-By: Claude

Hoist the base-url normalization into a local so the two Jira(...) calls fit comfortably within the 120-char line limit, and drop the duplicated .rstrip("/"). Co-Authored-By: Claude

The Azure DevOps branch of extract_tickets() truncated the work-item body to MAX_TICKET_CHARACTERS but passed acceptance_criteria into "requirements" unbounded, so a large field could inject an oversized blob into the review prompt. This mirrors the truncation the Jira path in this PR already applies. Cap requirements to MAX_TICKET_CHARACTERS and guard against non-string values, matching the body handling. Co-Authored-By: Claude

Building the Jira client from a free-form jira_base_url let repo-controlled configuration (e.g. pyproject.toml [tool.pr-agent]) redirect the authenticated request, and the API token with it, to an arbitrary host. PR-Agent merges repo config into the same settings object that supplies the base URL, so an untrusted PR could point Jira at an attacker server and exfiltrate the token. Replace jira_base_url with jira_site (the "<site>" in https://<site>.atlassian.net) and build the base URL from it. jira_site is validated as a single DNS label, so it cannot contain the characters (. / : @ # ? etc.) needed to escape *.atlassian.net. The destination is therefore fixed by construction regardless of what untrusted config supplies. This scopes the feature to Jira Cloud. Jira Server / Data Center, which needs a free-form host, is not supported for now; it can return once base-URL trust is handled (e.g. a secrets-only source). Cloud authenticates with email + API token, so the partial-config warning now requires site + email + token. - Add JIRA_SITE_PATTERN + _jira_cloud_base_url(); validate and build the URL. - Cloud-only auth in _get_jira_client() (drop the Server/DC PAT branch). - Update [jira] config + secrets template: jira_site instead of jira_base_url. - Rewrite the docs Jira section Cloud-only, noting Server/DC is not yet supported and why. - Add TestJiraSiteInjection covering breakout attempts (query/fragment/path/port/ userinfo/encoded-dot/whitespace), asserting they are rejected and no client is built. Co-Authored-By: Claude

Reframe the "why Cloud only" note around the design (URL derived from a validated site name, always an Atlassian host) rather than detailing a configuration-injection scenario, which overstated this feature's role. Co-Authored-By: Claude

Four inline module imports (import pr_agent.tools.ticket_pr_compliance_check as m, from ... import MAX_TICKETS) moved to the top-level import block. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

dellch force-pushed the feat/jira-ticket-context branch from afa4a5c to f50891d Compare June 2, 2026 19:43