Description
Problem
Currently, the git diff output is sent directly to the OpenAI API without validation (index.js:103-108). This creates security, cost, and usability risks:
- Security Risk: Could inadvertently send sensitive data (API keys, passwords, tokens) to OpenAI
- Cost Risk: Large diffs could exceed token limits, causing API errors or unexpected costs
- User Experience: No warnings when potentially sensitive content is detected
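To relate diff size to token limits, here is a minimal sketch of a rough token estimate. It assumes the common rule of thumb of about 4 characters per token for English text; diffs and source code may tokenize differently, so treat the result as an approximation only. The names `estimateTokens`, `exceedsTokenBudget`, and `APPROX_CHARS_PER_TOKEN` are hypothetical and not part of index.js.

```javascript
// Assumption: ~4 characters per token (rule of thumb for English text).
// Code and diffs can tokenize less efficiently, so this may undercount.
const APPROX_CHARS_PER_TOKEN = 4

// Hypothetical helper: rough token estimate from character count.
function estimateTokens(text) {
  return Math.ceil(text.length / APPROX_CHARS_PER_TOKEN)
}

// Hypothetical helper: maxTokens is whatever budget the caller chooses
// for the diff portion of the prompt.
function exceedsTokenBudget(diff, maxTokens) {
  return estimateTokens(diff) > maxTokens
}
```

Under this assumption, a 10,000-character diff is estimated at about 2,500 tokens. An exact count would need a real tokenizer for the target model.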
Current Code
const { stdout } = await exec(
  "git diff --cached -- . ':(exclude)*lock.json' ':(exclude)*lock.yaml'"
)
const summary = stdout.trim()
if (summary.length === 0) {
  return null
}
return summary // ⚠️ No validation before sending to OpenAI
Proposed Solution
Add validation with size limits and sensitive data detection:
const MAX_DIFF_SIZE = 10000 // characters
const SENSITIVE_PATTERNS = [
  /api[_-]?key/i,
  /password/i,
  /secret/i,
  /token/i,
  /-----BEGIN [A-Z]+ PRIVATE KEY-----/
]
function validateDiffSafety(diff) {
  // Check size
  if (diff.length > MAX_DIFF_SIZE) {
    throw new Error(
      `Git diff too large (${diff.length} chars). Limit: ${MAX_DIFF_SIZE}\n` +
      'Consider committing in smaller chunks.'
    )
  }
  // Check for sensitive data
  const detectedPatterns = []
  for (const pattern of SENSITIVE_PATTERNS) {
    if (pattern.test(diff)) {
      detectedPatterns.push(pattern.toString())
    }
  }
  if (detectedPatterns.length > 0) {
    console.warn('⚠️ Warning: Potential sensitive data detected in diff:')
    detectedPatterns.forEach(p => console.warn(` - Pattern: ${p}`))
    console.warn('\nThis content will be sent to OpenAI API.')
    // Could add user confirmation prompt here
  }
}
// In getGitSummary():
const summary = stdout.trim()
if (summary.length === 0) {
  return null
}
validateDiffSafety(summary)
return summary
Benefits
- ✅ Reduces the risk of accidentally exposing sensitive data
- ✅ Bounds OpenAI API costs by capping diff size
- ✅ Improves user awareness
- ✅ Configurable limits and patterns
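The `// Could add user confirmation prompt here` comment in the proposed `validateDiffSafety` could be filled in along these lines. This is only a sketch, assuming Node.js 17 or later (for `node:readline/promises`) and an interactive terminal; `isAffirmative` and `confirmSend` are hypothetical names, not existing functions in the project.

```javascript
const readline = require('node:readline/promises')

// Hypothetical helper: only an explicit "y" or "yes" counts as consent.
// Anything else, including an empty answer, is treated as "no".
function isAffirmative(answer) {
  return /^(y|yes)$/i.test(answer.trim())
}

// Hypothetical helper: asks the user whether to continue after a warning.
// Resolves to true if the user confirms, false otherwise.
async function confirmSend(detectedPatterns, { input = process.stdin, output = process.stdout } = {}) {
  const rl = readline.createInterface({ input, output })
  try {
    const answer = await rl.question(
      `Diff matched ${detectedPatterns.length} sensitive pattern(s). Send to OpenAI anyway? [y/N] `
    )
    return isAffirmative(answer)
  } finally {
    // Always release stdin so the process can exit.
    rl.close()
  }
}
```

A caller would make `validateDiffSafety` async and abort when `await confirmSend(detectedPatterns)` returns `false`. Defaulting to "no" keeps an accidental Enter keypress from sending the diff.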
Acceptance Criteria
- Add size validation for git diff output
- Add sensitive data pattern detection
- Display warnings to user when patterns detected
- Add configuration options for size limits
- Add tests for validation logic
- Update documentation
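As one possible shape for the configuration and test criteria above, the sketch below reads the size limit from an environment variable and checks the behavior with `node:assert`. The variable name `AI_COMMIT_MAX_DIFF_SIZE` and the helpers `resolveMaxDiffSize` and `assertDiffWithinLimit` are assumptions made for illustration; the project may prefer a config file or CLI flag instead.

```javascript
const { strictEqual, throws, doesNotThrow } = require('node:assert')

const DEFAULT_MAX_DIFF_SIZE = 10000 // characters, same default as the proposal

// Hypothetical: AI_COMMIT_MAX_DIFF_SIZE is an assumed variable name.
// Falls back to the default when the variable is unset or blank.
function resolveMaxDiffSize(env = process.env) {
  const raw = env.AI_COMMIT_MAX_DIFF_SIZE
  if (raw === undefined || raw.trim() === '') {
    return DEFAULT_MAX_DIFF_SIZE
  }
  const parsed = Number(raw)
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`AI_COMMIT_MAX_DIFF_SIZE must be a positive integer, got "${raw}"`)
  }
  return parsed
}

// Hypothetical variant of the size check that takes the limit as a
// parameter, so tests do not depend on process.env.
function assertDiffWithinLimit(diff, maxSize) {
  if (diff.length > maxSize) {
    throw new Error(`Git diff too large (${diff.length} chars). Limit: ${maxSize}`)
  }
}

// Example checks covering the default, an override, and both failure paths.
strictEqual(resolveMaxDiffSize({}), 10000)
strictEqual(resolveMaxDiffSize({ AI_COMMIT_MAX_DIFF_SIZE: '500' }), 500)
throws(() => resolveMaxDiffSize({ AI_COMMIT_MAX_DIFF_SIZE: 'abc' }), /positive integer/)
doesNotThrow(() => assertDiffWithinLimit('x'.repeat(500), 500))
throws(() => assertDiffWithinLimit('x'.repeat(501), 500), /too large/)
```

Passing the environment and the limit as parameters keeps both helpers pure, which makes them straightforward to cover in whatever test runner the project adopts.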
Priority
High - Security and cost implications
Related
Quality analysis report: claudedocs/quality-analysis-report.md section 3.3