VagishKumar3671/Autonomous-graph-orchestrated-SQL-injection-assessment-framework

SQLi Agent

Autonomous graph-orchestrated SQL injection assessment framework — integrating authenticated discovery, multi-tool validation, confidence scoring, and structured reporting.

Python 3.12+ · LangGraph · sqlmap · License: MIT

Philosophy: LLM is the writer. Security tools are the judge.

Tool → Validation → Confidence → Policy Gate → LLM → Report


Overview

Traditional SQLi assessments require manually chaining authentication, crawling, parameter discovery, and exploitation tools — with high false-positive rates and no reproducibility guarantees.

Astra automates the full pipeline:

  • Authenticated crawl across SPA and REST targets
  • Multi-tool discovery via Playwright, Katana, and Arjun
  • sqlmap-assisted scanning with independent validation fallback
  • Hard policy gate — LLM cannot override security decisions
  • Evidence-mandatory reporting — no payload evidence, no finding
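The last two guarantees can be illustrated with a minimal sketch. It assumes findings are plain dictionaries with `status`, `confidence`, and `evidence` keys; those key names and both helper functions are hypothetical and are not taken from the project's code.

```python
from typing import Any

# Fields decided by the validation engine and policy gate (assumed key names).
PROTECTED_FIELDS = frozenset({"status", "confidence"})


def merge_enrichment(finding: dict[str, Any], llm_output: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `finding` with LLM text added, skipping protected fields."""
    merged = dict(finding)
    for key, value in llm_output.items():
        if key in PROTECTED_FIELDS:
            continue  # the LLM is the writer, not the judge
        merged[key] = value
    return merged


def reportable(findings: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep findings that carry payload evidence and are not false positives."""
    return [
        f for f in findings
        if f.get("evidence") and f.get("status") != "FALSE_POSITIVE"
    ]
```

Dropping protected keys at merge time, rather than trusting the prompt, means a malformed or adversarial LLM response cannot change a verdict.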

Scope: SQL Injection only (Error-based, Boolean Blind, Time Blind, UNION, Stacked Queries).


Architecture

graph TD
    CLI["CLI\npython main.py scan --url ... --username ... --password ..."]
    AUTH["Auth Node\nPlaywright form login · REST API fallback\nExtracts session cookies + JWT token"]
    CRAWL["Crawl Node\nPlaywright · Katana · Arjun · API probe\nBuilds parameterized endpoint inventory"]
    EXTRACT["Extract Node\nPattern-matched parameter selection\nPrioritizes: q, id, search, filter, user..."]
    SCAN["Scan Node\nParallel sqlmap execution\nAll candidates queued regardless of sqlmap result"]
    VALIDATE["Validation Node\nError test → Boolean test → Time test → Response diff\nCritical DB errors → immediate CONFIRMED"]
    GATE["Hard Policy Gate\nclassify_finding: error_valid → CONFIRMED\npassed < 2 and no critical error → FALSE_POSITIVE"]
    DEEPSCAN["Deep Scan Node\nRetries POSSIBLE/LIKELY with aggressive payloads\nBoosts confidence on boundary cases"]
    LLM["LLM Writer (Ollama)\nJSON-only output · writer not judge\nEnriches CONFIRMED/LIKELY only"]
    REPORT["Report Node\nMarkdown + JSON\nConfidence breakdown table · Evidence block · Quorum status"]

    CLI --> AUTH --> CRAWL --> EXTRACT --> SCAN --> VALIDATE --> GATE
    GATE -->|"unconfirmed"| DEEPSCAN --> LLM --> REPORT
    GATE -->|"all confirmed"| LLM
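The conditional edge out of the policy gate could be driven by a small routing function. The sketch below is an assumption about how that might look: the `Finding` and `PipelineState` fields, the node names `deep_scan` and `llm`, and the choice to treat only POSSIBLE/LIKELY findings as "unconfirmed" are all hypothetical.

```python
from typing import TypedDict


class Finding(TypedDict):
    parameter: str
    status: str       # CONFIRMED, LIKELY, POSSIBLE, or FALSE_POSITIVE
    confidence: int


class PipelineState(TypedDict):
    findings: list[Finding]


def route_after_gate(state: PipelineState) -> str:
    """Pick the next node name once the policy gate has classified every finding."""
    needs_retry = any(
        f["status"] in ("POSSIBLE", "LIKELY") for f in state["findings"]
    )
    return "deep_scan" if needs_retry else "llm"
```

In LangGraph, a function of this shape can be passed to `StateGraph.add_conditional_edges` so that the returned string selects the next node.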

Execution Workflow

| Stage       | Node                    | Responsibility |
|-------------|-------------------------|----------------|
| Auth        | nodes/auth.py           | Playwright form login + REST API fallback. Extracts cookies & JWT. Aborts pipeline on failure. |
| Crawl       | nodes/crawl.py          | 4-strategy discovery: Playwright page crawl · Katana spider · Arjun parameter brute-force · hardcoded API probes. |
| Extract     | nodes/extract.py        | Pattern-filters parameters (q, id, search, …). Falls back to all params if none match. |
| Scan        | nodes/scan.py           | Parallel sqlmap. All candidates go to validation regardless of sqlmap outcome. sqlmap is signal, not gatekeeper. |
| Validate    | nodes/validate.py       | 4 independent tests. Critical DB errors (SQLITE_ERROR, ORA-, etc.) → immediate CONFIRMED at ≥85% confidence. |
| Policy Gate | validate.py             | classify_finding(): deterministic. Error valid → CONFIRMED. Zero validations → FALSE_POSITIVE. LLM excluded. |
| Deep Scan   | nodes/deep_scan.py      | Aggressive boolean/error payloads on POSSIBLE/LIKELY. Adds confidence bonus. |
| Enrich      | tools/llm_enrichment.py | Ollama (llama3). JSON-only structured output. CONFIRMED/LIKELY only. Never overwrites status/confidence. |
| Report      | nodes/report.py         | Markdown + JSON. Itemized confidence breakdown. Evidence block. FALSE_POSITIVEs suppressed. |
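The Extract stage's select-or-fall-back behavior can be sketched as follows. This is an illustration only: the function name is hypothetical, the priority list contains just the names this README mentions (the real list is longer), and exact case-insensitive name matching is an assumption about what "pattern-filters" means.

```python
# Only the parameter names mentioned in this README; the full list is not shown.
PRIORITY_PARAMS = frozenset({"q", "id", "search", "filter", "user"})


def select_candidates(params: list[str]) -> list[str]:
    """Return prioritized parameters, or every parameter if none match."""
    matched = [p for p in params if p.lower() in PRIORITY_PARAMS]
    # Fall back to the full list so an unusual endpoint is still scanned.
    return matched or list(params)
```

For example, `select_candidates(["page", "q", "sort"])` keeps only `q`, while `select_candidates(["page", "sort"])` returns both names unchanged.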

Tech Stack

| Layer               | Technology                | Rationale |
|---------------------|---------------------------|-----------|
| Orchestration       | LangGraph StateGraph      | Typed state, conditional edges, deterministic node routing |
| Browser automation  | Playwright (async)        | SPA-capable authenticated crawling; handles Angular/React login flows |
| Web spider          | Katana (projectdiscovery) | Fast passive crawl complement; JS-aware link extraction |
| Parameter discovery | Arjun                     | Hidden parameter brute-force; finds non-obvious injection points |
| SQLi scanner        | sqlmap                    | Industry-standard; provides the initial signal, not the sole arbiter |
| LLM backend         | Ollama (llama3)           | Local inference; writer role only; zero vendor dependency |
| CLI                 | Click                     | Composable command interface; configurable risk/level/depth |
| State contract      | Python TypedDict          | Strict pipeline state typing; prevents silent data loss between nodes |

Validation Engine

Each candidate endpoint runs 4 independent tests. The hard policy gate decides status — not the LLM.

Error test     →  Critical DB error pattern match (SQLITE_ERROR, ORA-, PostgreSQL, …)
Boolean test   →  True/False condition differential + baseline disruption detection
Time test      →  Sleep-based with 3-attempt reproducibility confirmation
Response diff  →  Status code / content-type / length delta analysis
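As one example of these tests, the time test's 3-attempt reproducibility rule might look like the sketch below. Everything here is an assumption made for illustration: the function name, the injected `measure` callable (which would send a request and return elapsed seconds), and the default sleep and tolerance values.

```python
from typing import Callable


def time_test(
    measure: Callable[[str], float],
    baseline_payload: str,
    sleep_payload: str,
    sleep_seconds: float = 5.0,
    attempts: int = 3,
    tolerance: float = 0.5,
) -> bool:
    """Pass only if every attempt shows a delay close to the injected sleep."""
    for _ in range(attempts):
        baseline = measure(baseline_payload)
        delayed = measure(sleep_payload)
        if delayed - baseline < sleep_seconds - tolerance:
            return False  # a single non-reproducible attempt fails the test
    return True
```

Requiring every attempt to succeed trades some sensitivity for fewer false positives caused by network jitter or a slow backend.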

Confidence scoring:

| Signal             | Weight                          |
|--------------------|---------------------------------|
| sqlmap detection   | +40                             |
| Critical DB error  | +40 (floors confidence at 85%)  |
| Boolean validation | up to +25                       |
| Time validation    | up to +20                       |
| Response diff      | up to +15                       |
| Reproducibility    | +5                              |
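A minimal sketch of how these weights could combine is shown below. The function name and signature are hypothetical, and the cap at 100 is an assumption, since the README states only the weights and the 85% floor.

```python
def compute_confidence(
    sqlmap_detected: bool,
    critical_db_error: bool,
    boolean_score: int = 0,
    time_score: int = 0,
    diff_score: int = 0,
    reproducible: bool = False,
) -> tuple[int, dict[str, int]]:
    """Return (confidence, itemized breakdown) from the individual signals."""
    breakdown: dict[str, int] = {}
    if sqlmap_detected:
        breakdown["sqlmap detection"] = 40
    if critical_db_error:
        breakdown["critical DB error"] = 40
    # Partial signals are clamped to their documented maxima.
    partial = (
        ("boolean validation", boolean_score, 25),
        ("time validation", time_score, 20),
        ("response diff", diff_score, 15),
    )
    for name, score, cap in partial:
        clamped = max(0, min(score, cap))
        if clamped:
            breakdown[name] = clamped
    if reproducible:
        breakdown["reproducibility"] = 5
    total = min(sum(breakdown.values()), 100)  # assumed upper bound
    if critical_db_error:
        total = max(total, 85)  # a critical DB error floors confidence at 85
    return total, breakdown
```

With a critical DB error, full boolean and response-diff scores, and reproducibility, the sum is 40 + 25 + 15 + 5 = 85, which matches the breakdown in the sample report.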

Policy gate rules (deterministic):

if error_valid:                          → CONFIRMED
if passed == 0:                          → FALSE_POSITIVE
if confidence >= 80 and passed >= 2:     → CONFIRMED
if confidence >= 50:                     → LIKELY
else:                                    → POSSIBLE
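Written as Python, the rules above could look like this. The name `classify_finding` comes from this README, but the signature and the string return values are assumptions.

```python
def classify_finding(error_valid: bool, passed: int, confidence: int) -> str:
    """Apply the policy gate rules in order; the first matching rule wins."""
    if error_valid:
        return "CONFIRMED"
    if passed == 0:
        return "FALSE_POSITIVE"
    if confidence >= 80 and passed >= 2:
        return "CONFIRMED"
    if confidence >= 50:
        return "LIKELY"
    return "POSSIBLE"
```

Rule order matters: a high-confidence finding with only one passed validation falls through to LIKELY rather than CONFIRMED.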

Repository Structure

astra-sqli-agent/
├── main.py                  # CLI entry point (click)
├── graph/
│   ├── workflow.py          # LangGraph StateGraph, node wiring, conditional routing
│   └── state.py             # PipelineState TypedDict — shared contract across all nodes
├── nodes/
│   ├── auth.py              # Playwright form + REST API authentication
│   ├── crawl.py             # 4-strategy crawl orchestration
│   ├── extract.py           # Parameter candidate selection
│   ├── scan.py              # Parallel sqlmap execution
│   ├── validate.py          # 4-test validation engine + hard policy gate
│   ├── deep_scan.py         # Aggressive retry for unconfirmed findings
│   └── report.py            # Markdown + JSON report generation
├── tools/
│   ├── sqlmap_runner.py     # sqlmap subprocess wrapper
│   ├── sqlmap_parser.py     # sqlmap stdout structured parser
│   ├── playwright_runner.py # Async Playwright session manager
│   ├── katana_runner.py     # Katana subprocess wrapper
│   ├── arjun_runner.py      # Arjun parameter discovery wrapper
│   ├── llm_enrichment.py    # Ollama JSON-structured enrichment
│   └── checker.py           # Startup tool availability check
├── reports/                 # Generated Markdown reports
└── artifacts/               # Generated JSON finding exports

Quick Start

Prerequisites: Python 3.12+, sqlmap, Playwright browsers

git clone https://github.com/your-org/astra-sqli-agent.git
cd astra-sqli-agent
pip install -r requirements.txt
playwright install chromium

Run:

python main.py scan \
  --url http://target \
  --username user@example.com \
  --password yourpassword \
  --depth 2 \
  --max-pages 30 \
  --sqlmap-timeout 90 \
  --risk 2 \
  --level 3

Optional tools (increase discovery coverage):

brew install katana        # Fast web spider
pip install arjun          # Parameter discovery

Sample Output

Terminal:

  [✓] Sqlmap       /usr/bin/sqlmap
  [✓] Playwright   installed
  [✓] Katana       /opt/homebrew/bin/katana
  [✓] Arjun        /usr/local/bin/arjun
  [✓] Ollama       running

[VALIDATE] CRITICAL DB ERROR on 'q' payload='test'':
           SQLITE_ERROR: near "'%'": syntax error
[VALIDATE] 'q' -> CONFIRMED (confidence=85% quorum=3/4 critical=True)

Done in 176s  |  Findings: 1 confirmed, 0 likely, 0 possible

Generated report finding:

## Finding 1 — 🔴 CONFIRMED

- Endpoint:   /rest/products/search?q=...
- Parameter:  q
- Type:       Error-based SQL Injection
- Confidence: 85%  |  Severity: Critical
- Quorum:     3/4 validations passed

| Component          | Score |
|--------------------|-------|
| critical DB error  | +40   |
| boolean validation | +25   |
| response diff      | +15   |
| repeatability      | +5    |
| Final Confidence   | 85%   |

Evidence: payload=`test'` → SQLITE_ERROR: near "'%'": syntax error
Recommendation: Use parameterized queries. Disable verbose DB error messages.

CLI Reference

| Flag             | Default  | Description                           |
|------------------|----------|---------------------------------------|
| --url            | required | Target base URL                       |
| --username       | required | Login credential                      |
| --password       | required | Login credential                      |
| --depth          | 3        | Playwright crawl depth                |
| --max-pages      | 100      | Max pages per crawl                   |
| --sqlmap-timeout | 120      | Per-endpoint sqlmap timeout (seconds) |
| --risk           | 2        | sqlmap risk level (1–3)               |
| --level          | 3        | sqlmap test level (1–5)               |
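To show how `--risk`, `--level`, and `--sqlmap-timeout` might reach sqlmap, here is a hedged sketch of a subprocess wrapper. The function names and the exact option set are assumptions rather than the contents of `tools/sqlmap_runner.py`; the sqlmap options used (`-u`, `-p`, `--batch`, `--risk`, `--level`, `--cookie`) are standard sqlmap flags.

```python
import subprocess
from typing import Optional


def build_sqlmap_command(
    url: str,
    param: str,
    risk: int = 2,
    level: int = 3,
    cookie: Optional[str] = None,
) -> list[str]:
    """Build a non-interactive sqlmap command line for one parameter."""
    if not 1 <= risk <= 3:
        raise ValueError("risk must be between 1 and 3")
    if not 1 <= level <= 5:
        raise ValueError("level must be between 1 and 5")
    cmd = ["sqlmap", "-u", url, "-p", param, "--batch",
           f"--risk={risk}", f"--level={level}"]
    if cookie:
        cmd.append(f"--cookie={cookie}")  # reuse the authenticated session
    return cmd


def run_sqlmap(cmd: list[str], timeout: int = 120) -> Optional[str]:
    """Run sqlmap and return its stdout, or None if the timeout expires."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return result.stdout
```

Passing the command as a list, without `shell=True`, keeps target URLs and cookies from being interpreted by a shell.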

Scope

| Supported                   | Out of Scope        |
|-----------------------------|---------------------|
| Error-based SQLi            | XSS                 |
| Boolean Blind SQLi          | SSRF / IDOR         |
| Time Blind SQLi             | JWT attacks         |
| UNION-based SQLi            | Command injection   |
| Stacked Queries (detection) | File upload attacks |
