feat: add CLI for resume analysis via levelup command #62
Walkthrough
This PR introduces a new Typer-based CLI tool that analyzes PDF resumes using the Gemini API.
Changes
Resume Analysis CLI
Sequence Diagram
sequenceDiagram
participant User as User/CLI
participant Validate as Validation
participant PDFExt as PDF Extractor
participant Gemini as Gemini API
participant JSONParse as JSON Parser
participant Output as Output Handler
User->>Validate: analyze(resume, language, role, output)
Validate->>Validate: check file exists
Validate->>Validate: check language supported
Validate->>Validate: check API key set
Validate->>PDFExt: extract_pdf_text(resume)
PDFExt->>User: plain text content
Validate->>Gemini: call gemini-2.0-flash-lite
Gemini->>User: JSON response
User->>JSONParse: extract_json_from_response()
JSONParse->>User: parsed JSON object
User->>Output: write to file or stdout
Output->>User: success or error exit
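The three validation steps at the top of the diagram could look roughly like the sketch below. It is only an illustration: the function name, the set of supported languages, and the environment variable name are hypothetical, since the PR page does not show them, and the real CLI is Typer-based while this sketch uses only the standard library.

```python
import os
from pathlib import Path

# Hypothetical values: the PR page does not list the supported languages
# or name the environment variable that holds the API key.
SUPPORTED_LANGUAGES = {"en", "es"}
API_KEY_ENV_VAR = "GEMINI_API_KEY"


def validate_inputs(resume: Path, language: str, env=None) -> list[str]:
    """Return a list of validation errors; an empty list means all checks passed."""
    env = os.environ if env is None else env
    errors = []
    # Check 1: the resume file exists.
    if not resume.is_file():
        errors.append(f"resume not found: {resume}")
    # Check 2: the requested language is supported.
    if language not in SUPPORTED_LANGUAGES:
        errors.append(f"unsupported language: {language}")
    # Check 3: the API key is set and non-empty.
    if not env.get(API_KEY_ENV_VAR):
        errors.append(f"{API_KEY_ENV_VAR} is not set")
    return errors
```

Collecting every error before exiting, rather than stopping at the first, lets the user fix all input problems in one pass. Whether the actual command does this is not visible here.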
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@levelup/cli.py`:
- Around line 47-48: The current greedy regex that sets block = match.group(0)
if match else None can over-match; replace it with an iterative JSON boundary
detection: find the first '{' in raw (return None if none), then iterate from
that start index to each subsequent '}' building candidate substrings, try
json.loads(candidate) for each, and when a candidate successfully parses to a
dict return that result; if none parse return None. Update the logic that
assigns block/result to use this approach (references: variable raw, block and
the parsing branch that currently uses re.search) so trailing braces in raw no
longer corrupt the extracted JSON.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a0fe4e7b-58f1-4322-8582-9a6a68b3e3f5
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
levelup/cli.py
pyproject.toml
        match = re.search(r"\{[\s\S]*\}", raw)
        block = match.group(0) if match else None
Greedy regex may over-match if trailing text contains braces.
The pattern \{[\s\S]*\} matches from the first { to the last } in the string. If the LLM response includes trailing text with braces (e.g., {"score": 90} Note: see {docs}), this captures invalid content and causes parsing to fail or return corrupted data.
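The over-match can be reproduced with the example string from the comment. This is a small standalone sketch, not code from the PR; the variable names are illustrative.

```python
import json
import re

# Same pattern as in levelup/cli.py: first '{' through the last '}'.
GREEDY = re.compile(r"\{[\s\S]*\}")

raw = '{"score": 90} Note: see {docs}'
captured = GREEDY.search(raw).group(0)

# The greedy quantifier extends the match to the final '}', so the
# captured text includes the trailing note, not just the JSON object.
print(captured)

try:
    json.loads(captured)
    parsed_ok = True
except json.JSONDecodeError:
    # json.loads rejects the text after the first complete object.
    parsed_ok = False
print(parsed_ok)
```

Here the captured span is the whole input string, and `json.loads` rejects it because of the text that follows the first complete object.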
Consider using a balanced-brace approach or iteratively trying to parse progressively smaller substrings:
Proposed fix: iterative JSON boundary detection
 def _extract_json(raw: str) -> dict | None:
     fence = re.search(r"```(?:json)?\s*({[\s\S]*?})\s*```", raw, re.IGNORECASE)
     if fence:
         block = fence.group(1)
     else:
-        match = re.search(r"\{[\s\S]*\}", raw)
-        block = match.group(0) if match else None
+        # Find first '{' and try parsing from there to each subsequent '}'
+        start = raw.find("{")
+        if start == -1:
+            return None
+        block = None
+        for i, ch in enumerate(raw[start:], start):
+            if ch == "}":
+                candidate = raw[start : i + 1]
+                try:
+                    result = json.loads(candidate)
+                    if isinstance(result, dict):
+                        return result
+                except json.JSONDecodeError:
+                    continue
+        return None
     if not block:
         return None
     try:
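As an alternative to retrying `json.loads` on ever-longer substrings, the standard library's `json.JSONDecoder.raw_decode` parses a single JSON value starting at a given index and ignores whatever follows it. The sketch below is not part of the PR or of the proposed fix, and the function name `extract_first_json_object` is hypothetical. Unlike the proposed fix, it also retries from later `{` positions when the first one does not start valid JSON.

```python
import json


def extract_first_json_object(raw: str):
    """Return the first JSON object embedded in raw, or None if there is none."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start`
            # and returns (value, end_index); trailing text is ignored.
            obj, _end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # This '{' did not begin a valid object; try the next one.
        start = raw.find("{", start + 1)
    return None
```

For the example from the review comment, `extract_first_json_object('{"score": 90} Note: see {docs}')` returns `{"score": 90}`. Because the decoder tracks string boundaries itself, a `}` inside a JSON string value does not end the object early.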
Closes #2
Summary by CodeRabbit
Release Notes
levelup command-line tool for AI-powered resume analysis