feat: add Jira ticket-context support (provider-agnostic)#2
Conversation
afa4a5c to
f50891d
Compare
|
PR description wording suggestions for the upstream version:
|
f50891d to
1c0adaf
Compare
27a688e to
f25f0d2
Compare
dellch
left a comment
There was a problem hiding this comment.
Reviewed at f25f0d2. Ran the new suite (25 passed) and test_extract_issue_from_branch.py (12 passed, no regressions).
The refactor is sound. I checked against main: the old GitHub path returned inside if tickets:, and it now falls through to a single return tickets_content at the end, so the GitHub/Azure output is preserved and the new Jira step runs for every provider as intended. The only behavioural change I found is GitHub-with-no-tickets now returns [] instead of None — both falsy at the call site, so no effect.
On the dependency: atlassian-python-api==3.41.4 is already pinned in requirements.txt (the Bitbucket providers import from it), so the new top-level from atlassian import Jira adds nothing new and is safe in this core module.
Three small inline comments below; none touch the open questions. The one worth acting on is the uncapped requirements field.
Open questions - wording feedback
The framing (flag, don't decide) is good and Q2/Q4 read clearly. Two suggestions:
- Q3, the
jira_project_keysbullet is the densest of the set. "...so it's named to make the binding explicit" is vague - "the binding" is never defined. Suggested rewrite: "...so the namejira_project_keysmakes explicit that it only governs Jira key extraction, not the GitHub or Azure DevOps paths." Also consider splitting out the#2162 valid_project_keysparenthetical into its own sentence: the bullet currently packs three ideas (what it does, why it is Jira-only, how it differs from The-PR-Agent#2162) into one. - Q1 is accurate but long. Optional tighten: lead with the decision - "Keep the documented
jira_*keys, or drop the redundant prefix (api_token, etc.)?" - and keep theJIRA.JIRA_API_TOKEN/jira__jira_api_tokendetail as the supporting sentence.
I deliberately left the open-question topics alone in the inline comments: config-key naming (Q1), the [jira] vs [related_issues] section (Q2), the MAX_TICKETS / jira_project_keys tunables (Q3), and false-positive matching (Q4).
|
|
||
| fields = issue.get("fields", {}) or {} | ||
| body = fields.get("description") or "" | ||
| if not isinstance(body, str): |
There was a problem hiding this comment.
Minor / forward-looking. On Jira Cloud the REST v3 API returns description (and rich-text custom fields) as ADF JSON - a dict, not a string. _get_jira_client() doesn't pass api_version, so atlassian-python-api defaults to v2 and you get plain strings today, which is why this works (and matches what the PR description verified).
If anyone later switches to v3, this isinstance guard silently drops the body to "". A one-line code comment noting the v2 assumption would stop a future reader from reading the empty body as a bug.
There was a problem hiding this comment.
Fixed in b7e528a. Added a code comment above the description lookup noting the client defaults to REST v2 (plain strings) and that v3 would return ADF JSON dicts the isinstance guards would turn into empty strings — so a future reader sees the assumption rather than reading the empty body as a bug.
There was a problem hiding this comment.
Followed up in 9c22eea: rather than rely on the library default + a comment, _get_jira_client() now passes api_version="2" explicitly for both auth modes, with tests asserting it. Confirmed atlassian-python-api currently hardcodes api_version="2" when unset, so this just pins the contract so a future dependency bump can't silently switch us to v3 (ADF dicts) and empty the bodies.
f25f0d2 to
b7e528a
Compare
9c22eea to
ebc7cff
Compare
The review tool's ticket-analysis flow (require_ticket_analysis_review) only fetched GitHub issues and linked Azure Boards work items. Teams that track work in Jira got no ticket context. This adds provider-agnostic Jira ticket lookup. The lookup uses get_user_description() and get_pr_branch() plus the PR title, which every git provider implements, so it works on GitHub, GitLab, Bitbucket, and Azure DevOps. It is a no-op when [jira] is not configured. - find_jira_tickets(): extract Jira keys from the PR title, description and branch name. Matching is case-insensitive and keys are normalized to upper case, so lowercased branch names (e.g. bugfix/abc-123-x) are detected. First-seen order is preserved so the MAX_TICKETS cap is deterministic. - _get_jira_client() / extract_jira_tickets(): fetch each key via the atlassian-python-api Jira client (already a dependency, used by the Bitbucket providers). Supports Jira Cloud (email + token) and Server/Data Center (PAT). The REST API version is pinned to v2 (JIRA_API_VERSION): v2 returns description and custom fields as plain strings, while v3 returns ADF JSON dicts that would need separate parsing. - add_jira_tickets(): provider-agnostic step in extract_tickets() that appends Jira tickets to tickets_content, de-duplicated by URL. - jira_requirements_field maps an instance-specific custom field (e.g. customfield_10127) to the ticket "requirements" section. Both the body and the requirements field are truncated to MAX_TICKET_CHARACTERS. - Add a [jira] config section and secrets-template entry using the key names from the existing docs. Credentials are read from settings/env, not committed. - Add unit tests for key extraction, ticket fetching (auth modes, capping, skip-on-error, truncation), client construction, and the provider-agnostic append. An autouse fixture restores the JIRA.* settings between tests. - Document Jira support in the fetching-ticket-context docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ebc7cff to
541c220
Compare
extract_tickets() capped provider-native tickets to MAX_TICKETS, then add_jira_tickets() appended up to MAX_TICKETS more, so a PR could carry up to 2 * MAX_TICKETS ticket bodies into the review prompt. Treat MAX_TICKETS as the combined per-PR cap: provider-native tickets already in tickets_content count against it, and Jira tickets are appended only until the total reaches MAX_TICKETS, keeping existing tickets first. Skip the Jira lookup entirely (no client init) when the cap is already met. Co-Authored-By: Claude
extract_jira_tickets() built the Jira client before checking whether the text contained any Jira keys. Since add_jira_tickets() runs for every provider, a configured-but-unreferenced Jira would construct a client (and log an init failure if misconfigured) on every keyless PR. Extract keys first and return early when there are none, so the client is only built when there is something to fetch. Co-Authored-By: Claude
d33bc57 to
cb9ee43
Compare
_get_jira_client() returned None silently whenever base_url + api_token were not both set. If a user set some [jira] values but not the required pair, the misconfiguration was invisible and Jira lookup just never ran. Log a warning naming the missing keys when Jira is partially configured, while staying silent when nothing is set (Jira simply not in use). Co-Authored-By: Claude
Hoist the base-url normalization into a local so the two Jira(...) calls
fit comfortably within the 120-char line limit, and drop the duplicated
.rstrip("/").
Co-Authored-By: Claude
The Azure DevOps branch of extract_tickets() truncated the work-item body to MAX_TICKET_CHARACTERS but passed acceptance_criteria into "requirements" unbounded, so a large field could inject an oversized blob into the review prompt. This mirrors the truncation the Jira path in this PR already applies. Cap requirements to MAX_TICKET_CHARACTERS and guard against non-string values, matching the body handling. Co-Authored-By: Claude
Building the Jira client from a free-form jira_base_url let repo-controlled configuration (e.g. pyproject.toml [tool.pr-agent]) redirect the authenticated request, and the API token with it, to an arbitrary host. PR-Agent merges repo config into the same settings object that supplies the base URL, so an untrusted PR could point Jira at an attacker server and exfiltrate the token. Replace jira_base_url with jira_site (the "<site>" in https://<site>.atlassian.net) and build the base URL from it. jira_site is validated as a single DNS label, so it cannot contain the characters (. / : @ # ? etc.) needed to escape *.atlassian.net. The destination is therefore fixed by construction regardless of what untrusted config supplies. This scopes the feature to Jira Cloud. Jira Server / Data Center, which needs a free-form host, is not supported for now; it can return once base-URL trust is handled (e.g. a secrets-only source). Cloud authenticates with email + API token, so the partial-config warning now requires site + email + token. - Add JIRA_SITE_PATTERN + _jira_cloud_base_url(); validate and build the URL. - Cloud-only auth in _get_jira_client() (drop the Server/DC PAT branch). - Update [jira] config + secrets template: jira_site instead of jira_base_url. - Rewrite the docs Jira section Cloud-only, noting Server/DC is not yet supported and why. - Add TestJiraSiteInjection covering breakout attempts (query/fragment/path/port/ userinfo/encoded-dot/whitespace), asserting they are rejected and no client is built. Co-Authored-By: Claude
Reframe the "why Cloud only" note around the design (URL derived from a validated site name, always an Atlassian host) rather than detailing a configuration-injection scenario, which overstated this feature's role. Co-Authored-By: Claude
Four inline module imports (import pr_agent.tools.ticket_pr_compliance_check as m, from ... import MAX_TICKETS) moved to the top-level import block. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Adds Jira ticket-context support to the review tool's ticket-analysis flow (
require_ticket_analysis_review). Previously only GitHub issues and Azure Boards work items were fetched; teams tracking work in Jira got no ticket context.The lookup is provider-agnostic — it only uses
get_user_description()andget_pr_branch()(plus the PR title), which every git provider implements — so Jira context works on GitHub, GitLab, Bitbucket, Azure DevOps, etc. It is a no-op when[jira]is not configured.What it does
find_jira_tickets(): extract Jira keys from PR title, description and branch name (case-insensitive, normalized to upper case sobugfix/abc-123-xis detected).extract_jira_tickets()/_get_jira_client(): fetch each key via theatlassian-python-apiJira client (already a dependency; used by the Bitbucket providers). Supports Jira Cloud (email + token) and Server/DC (PAT).add_jira_tickets(): provider-agnostic step inextract_tickets()that appends Jira tickets totickets_content, de-duplicated by URL.jira_requirements_field: maps an instance-specific custom field (e.g.customfield_10127) to the ticket "requirements" section (acceptance criteria).MAX_TICKETS/MAX_TICKET_CHARACTERSmodule constants, reused across the file.- Unit tests, plus a docs update.Verified end-to-end against a live Jira Cloud instance (v2 API returns the fields asplain strings — no ADF parsing needed; cap + skip-on-404 matches the existing GitHubextraction behavior).
Open questions (for the maintainer discussion)
I am leaving these for maintainer input rather than choosing the API shape in this PR:
Config key naming. This PR follows the key names the docs already publish (
[jira] jira_api_token,jira_api_email,jira_base_url). Because they sit under the[jira]table, settings resolve toJIRA.JIRA_API_TOKENand the env override isjira__jira_api_token— the repeatedjira_is redundant. Options: keep as-is for compatibility with the currently documented keys, or drop the prefix (api_token, etc., like[bitbucket_server]) for clarity at the cost of changing the documented contract.A dedicated issue-provider config section. Rather than a
[jira]section, would you prefer something like[related_issues]withissue_provider = "jira"? That gives a natural home for future, provider-neutral options (see below) instead of accreting keys under[jira]. This would change the configuration shape, so worth deciding before it ships.Scope of this PR vs. follow-ups. This PR is intentionally minimal (fetch + map + tests). Should any of the following be in this PR, or separate follow-ups?
jira_project_keys— Jira-specific: restrict key matching to configured project prefixes (e.g.DVT,ABC), reducing false positives likeutf-8/sha-1. Doesn't apply to GitHub (numeric#123refs) or Azure DevOps (native linked work items, no text extraction), so it's named to make the binding explicit. (This differs from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162, wherevalid_project_keysfilters JQL search results; here it would filter keys parsed from PR text before fetch.)max_tickets_to_fetchand similar tunables (currently the fixedMAX_TICKETS = 3, matching the existing GitHub path).Issue/IssueProviderabstraction from Jira issue provider support (minimal) The-PR-Agent/pr-agent#2162 — left out here as it may be premature abstraction with only one consumer.False-positive handling. Key extraction is heuristic; non-existent keys 404 and are skipped, and candidates are capped at 3 — the same trade-off the existing GitHub
#123extraction already makes. Is the current fetch-and-skip behavior acceptable for the first PR, or should strict project-key filtering (option 3'sjira_project_keys) be required before merge?Out of scope
/similar_issue, embedding client, vector-DB integration, and the issue-provider abstraction layer (all part of The-PR-Agent#2162) are intentionally excluded.Test plan
pytest tests/unittest/test_jira_ticket_extraction.py(25 tests)test_extract_issue_from_branch.py🤖 Generated with Claude Code