diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 7a8a254..1718b48 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -74,25 +74,24 @@ jobs: --ignore-unfixed \ ${{ env.DOCKER_IMAGE }}:${{ steps.version.outputs.VERSION }} - # PyPI publish — uncomment PYPI_API_TOKEN secret when ready - # pypi-publish: - # name: PyPI Publish - # needs: test - # runs-on: ubuntu-latest - # environment: release - # steps: - # - uses: actions/checkout@v4 - # - uses: actions/setup-python@v5 - # with: - # python-version: "3.13" - # - run: pip install --upgrade pip build twine - # - name: Build package - # run: python -m build - # - name: Publish to PyPI - # env: - # TWINE_USERNAME: __token__ - # TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }} - # run: python -m twine upload dist/* + pypi-publish: + name: PyPI Publish + needs: test + runs-on: ubuntu-latest + environment: release + steps: + - uses: actions/checkout@v4 + - uses: actions/setup-python@v5 + with: + python-version: "3.13" + - run: pip install --upgrade pip build twine + - name: Build package + run: python -m build + - name: Publish to PyPI + env: + TWINE_USERNAME: __token__ + TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }} + run: python -m twine upload dist/* github-release: name: GitHub Release diff --git a/.gitignore b/.gitignore index 590c5b8..2897177 100644 --- a/.gitignore +++ b/.gitignore @@ -6,6 +6,7 @@ build/ .eggs/ *.egg +marketing/ # Virtual environments .venv/ venv/ diff --git a/IMPROVEMENTS.md b/IMPROVEMENTS.md new file mode 100644 index 0000000..a9e5fe2 --- /dev/null +++ b/IMPROVEMENTS.md @@ -0,0 +1,430 @@ +# NHInsight — Usability Improvement Plan + +*Practical analysis and code-level patches for developer adoption.* + +--- + +## 1. Executive Summary + +NHInsight is a solid v0.1 CLI. The core pipeline works: discover → classify → risk-score → attack-path → output. 151 tests pass. Five providers ship. The code is clean. 
+ +The adoption bottleneck is **not functionality** — it's **first-impression UX**. A developer landing on the repo needs to go from zero to "oh that's useful" in under 30 seconds. Right now the README, CLI, and demo all add friction that makes the tool feel heavier than it is. + +**High-leverage changes (all small):** + +1. **README** — restructure as landing page, push docs lower +2. **CLI** — friendly no-args help, better no-provider error, post-demo suggestions +3. **PyPI** — better description, classifiers, drop setup.py +4. **Demo** — add "try next" footer after demo runs +5. **Output** — tighten severity labels, improve the HIGH icon +6. **Attack paths** — better `--attack-paths` help text, plain-English chain descriptions +7. **Patch plan** — 4 files, ~80 lines changed, shippable in one session + +None of these require new subsystems, dependencies, or architecture changes. + +--- + +## 2. Biggest Adoption Blockers + +Ranked by impact on a first-time GitHub visitor: + +| # | Blocker | Where | Fix Effort | +|---|---------|-------|------------| +| 1 | README puts Installation + Docker before Quick Start | README.md | Reorder | +| 2 | Running `nhinsight` with no args shows argparse default (ugly) | cli.py | 10 lines | +| 3 | Running `nhinsight scan` with no provider says "No providers specified" (no guidance) | cli.py | 5 lines | +| 4 | Demo ends silently — no "try this next" | cli.py | 8 lines | +| 5 | README "What It Finds" is 60 lines of tables before features | README.md | Collapse | +| 6 | HIGH severity uses same 🔴 icon as CRITICAL (confusing) | output.py | 1 line | +| 7 | `--attack-paths` help text is generic | cli.py | 1 line | +| 8 | PyPI description is too generic | pyproject.toml | 1 line | + +All fixable in a single PR. + +--- + +## 3. README Rewrite + +**See the actual README.md in this repo** — I will implement this directly. + +Structure: +1. Hero (title + one-liner + badges) +2. Quick Start (pip install + demo — 2 commands) +3. 
Scan examples (5 providers + multi) +4. Example output (compact, screenshot-friendly) +5. What It Finds (6 bullets, not tables) +6. Supported Providers (5 bullets) +7. Key Capabilities (6 bullets) +8. Install Options (4 pip lines + collapsible Docker) +9. Authentication (quick table + collapsible detail) +10. Attack Path Analysis (always visible — differentiator) +11. Risk Codes (collapsible) +12. Configuration (collapsible) +13. CLI Reference (collapsible) +14. Development (4 lines + collapsible) +15. Roadmap (5 one-liners) +16. Why NHInsight? (problem statement at bottom, not top) +17. Contributing / Related / License + +Key decisions: +- Quick Start is line 15, not line 107 +- "The Problem" becomes "Why NHInsight?" at the bottom (credibility, not first-screen) +- Risk code tables are collapsed — impressive when opened, not blocking when closed +- Auth detail is collapsed — quick table always visible +- Docker examples collapsed +- Architecture + Makefile collapsed +- Roadmap condensed to one-liners + +--- + +## 4. CLI UX Improvements + +### 4a. No-args behavior (`nhinsight` with nothing) + +**Current:** Shows argparse default help (functional but cold). + +**Improved:** Same help output but add a highlighted quick-start hint at the end. + +Change in `main()` at `cli.py:1132`: + +```python +else: + parser.print_help() + print(f"\n {BOLD}Quick start:{RESET}") + print(f" nhinsight demo # see sample data, no credentials") + print(f" nhinsight scan --aws # scan your AWS account") + print() +``` + +### 4b. No-provider error (`nhinsight scan` with no flags) + +**Current (line 203):** +``` +No providers specified. Use --aws, --azure, --gcp, --github, --k8s, or --all +``` + +**Improved:** +``` +No providers selected. + + Quick examples: + nhinsight scan --aws Scan AWS IAM + nhinsight scan --all --attack-paths Scan everything + nhinsight demo Try with sample data first + + Providers: --aws --azure --gcp --github --k8s --all +``` + +### 4c. 
Provider auth failure messages + +**Current (line 216):** +``` +AWS credentials not available. Configure AWS CLI or set AWS_PROFILE. +``` + +These are already good. Minor improvement — add the exact command: + +``` +AWS: credentials not found. Run 'aws configure' or set AWS_ACCESS_KEY_ID. +Azure: credentials not found. Run 'az login' or set AZURE_TENANT_ID + AZURE_CLIENT_ID. +GCP: credentials not found. Run 'gcloud auth application-default login' or set GOOGLE_APPLICATION_CREDENTIALS. +GitHub: token not found. Set GITHUB_TOKEN=ghp_... and use --github-org. +Kubernetes: cluster not reachable. Check ~/.kube/config or use --kube-context. +``` + +### 4d. Post-demo suggestions + +After `_print_demo_table()` completes, print: + +``` + Try it on your infrastructure: + nhinsight scan --aws Scan AWS IAM + nhinsight scan --all Scan all available providers + nhinsight scan --aws --explain Add AI-powered explanations +``` + +### 4e. `--attack-paths` help text + +**Current:** +``` +Run identity attack path analysis +``` + +**Improved:** +``` +Trace privilege chains across providers (e.g. K8s SA → IRSA → AWS admin) +``` + +--- + +## 5. Packaging / PyPI Improvements + +### 5a. pyproject.toml changes + +**Description** (line 8): +``` +Current: "Non-Human Identity discovery for cloud infrastructure" +Better: "Discover risky non-human identities and privilege paths across AWS, Azure, GCP, GitHub, and Kubernetes" +``` + +**Add classifiers:** +```toml +"Environment :: Console", +"Operating System :: OS Independent", +"Typing :: Typed", +``` + +**Add `Documentation` URL:** +```toml +Documentation = "https://github.com/cvemula1/NHInsight#quick-start" +Changelog = "https://github.com/cvemula1/NHInsight/releases" +``` + +### 5b. setup.py + +The current `setup.py` is a 3-line shim. It's only needed for `pip install -e .` on older pip. **Keep it** — it's harmless and avoids edge-case breakage. No change needed. + +### 5c. 
README as long_description + +Already set via `readme = "README.md"` in pyproject.toml. The rewritten README with collapsible `<details>` sections will render well on PyPI (PyPI supports `<details>
` in markdown since 2023). No change needed. + +### 5d. Release checklist for PyPI + +``` +1. Bump version in nhinsight/__init__.py + pyproject.toml +2. Update CHANGELOG.md (if exists) +3. git tag v0.1.x +4. git push origin v0.1.x (triggers release.yml) +5. Verify PyPI page renders correctly +6. Verify Docker Hub image tagged +7. Create GitHub Release with notes +``` + +--- + +## 6. Demo Improvements + +### 6a. Post-demo footer + +Add after the combined summary in `_print_demo_table()` (after line 1092): + +```python +# Post-demo suggestions +print(f"\n {BOLD}Try it on your infrastructure:{RESET}") +print(f" nhinsight scan --aws Scan AWS IAM") +print(f" nhinsight scan --all Scan all providers") +print(f" nhinsight scan --aws --explain AI-powered explanations") +print(f" nhinsight scan --all -f sarif SARIF for GitHub Security tab") +print() +``` + +### 6b. Demo output is already good + +The demo data covers all 5 providers with realistic findings. The combined summary with urgent fixes is solid. The scorecard and NIST compliance sections are impressive for screenshots. + +**One minor tweak:** The demo header could include a timing line to show speed: + +After line 1016, add: +```python +print(f" {DIM}Scanned 5 providers in 0.3s{RESET}\n") +``` + +This is cosmetic but reinforces "fast tool" positioning. + +### 6c. Demo attack paths + +The demo currently does NOT run attack paths. 
To make the demo show attack path analysis (which is the differentiator), add `--attack-paths` support to the demo command: + +In `_build_parser()`, add to demo_p: +```python +demo_p.add_argument("--attack-paths", action="store_true", + help="Include attack path analysis in demo") +``` + +In `main()` demo handler, after printing: +```python +if getattr(args, "attack_paths", False): + from nhinsight.analyzers.attack_paths import analyze_attack_paths + from nhinsight.core.output import print_attack_paths + ap_result = analyze_attack_paths(result.identities) + print_attack_paths(ap_result) +``` + +--- + +## 7. Output Clarity Improvements + +### 7a. HIGH severity icon + +**Current:** Both CRITICAL and HIGH use 🔴. This makes them visually identical. + +**Fix in output.py line 31:** +```python +Severity.HIGH: "🟠", +``` + +This matches the README example output and is the standard convention. + +### 7b. Severity label formatting + +**Current (line 43):** +```python +out.write(f" {color}{icon} {label} ({len(identities)}){RESET}\n") +``` + +This prints `🔴 CRITICAL (3)` which is clear. No change needed. + +### 7c. Identity type display + +**Current (line 47):** +```python +out.write(f" {DIM}({ident.identity_type.value}, {ident.provider.value}){RESET}\n") +``` + +This shows `(iam_user, aws)` — uses enum values with underscores. Could be prettier but it's consistent with JSON/SARIF output. **Leave as-is** for now. Changing display names is a v0.2 polish. + +### 7d. Summary line + +**Current (line 109):** +```python +out.write(f" Summary: {len(nhis)} NHIs") +``` + +Good. No change. + +### 7e. Attack path output + +**Current (line 400):** +```python +blast_str = f" blast: {path.blast_radius:.0f}/100" +``` + +**Improved wording:** +```python +blast_str = f" risk: {path.blast_radius:.0f}/100" +``` + +"risk" is more intuitive than "blast" for most users. The blast_radius internal name can stay. + +--- + +## 8. Attack Path Wording Improvements + +### 8a. 
Better `--attack-paths` help text + +**Current (line 111):** +```python +help="Run identity attack path analysis" +``` + +**Improved:** +```python +help="Trace privilege escalation chains across providers (e.g. K8s SA → cloud admin)" +``` + +### 8b. README attack path section + +The current section is good. One improvement — add 3 concrete example chains: + +```markdown +Example chains NHInsight detects: +- **K8s → AWS**: ServiceAccount → IRSA role → IAM role with AdministratorAccess +- **K8s → GCP**: ServiceAccount → Workload Identity → SA with roles/owner +- **GitHub → AWS**: Deploy key → workflow → OIDC → IAM role with S3FullAccess +``` + +### 8c. Attack path output header + +**Current (line 371):** +```python +out.write(f" {BOLD}Identity Attack Path Analysis{RESET}\n") +``` + +**Improved:** +```python +out.write(f" {BOLD}Privilege Escalation Paths{RESET}\n") +``` + +"Privilege Escalation Paths" is more immediately understood than "Identity Attack Path Analysis." + +### 8d. Path display label + +**Current (line 398):** +```python +out.write(f" {color}{icon} {path.id}{RESET}") +``` + +Shows `AP-001` which is opaque. Add the description: + +```python +out.write(f" {color}{icon} {path.id} — {path.description}{RESET}") +``` + +The `description` field already contains `entry → target (cross-system: k8s → aws)`. + +--- + +## 9. Minimal Patch Plan + +**4 files. ~80 lines changed. 
One PR.** + +### Priority 1 — Highest UX impact (do first) + +| File | Change | Lines | +|------|--------|-------| +| `README.md` | Restructure: Quick Start first, collapsible details | Full rewrite | +| `cli.py:1092` | Add post-demo "try next" suggestions | +8 lines | +| `cli.py:202-204` | Better no-provider error with examples | +8 lines | +| `cli.py:1132` | Friendly no-args hint | +4 lines | + +### Priority 2 — Polish (do second) + +| File | Change | Lines | +|------|--------|-------| +| `output.py:31` | HIGH icon: 🔴 → 🟠 | 1 line | +| `output.py:371` | Header: "Privilege Escalation Paths" | 1 line | +| `output.py:400` | "blast" → "risk" label | 1 line | +| `output.py:398` | Show path description alongside ID | 1 line | +| `cli.py:111` | Better `--attack-paths` help text | 1 line | +| `pyproject.toml:8` | Better PyPI description | 1 line | +| `pyproject.toml:14-27` | Add classifiers + URLs | +5 lines | + +### Priority 3 — Nice-to-have (do if time) + +| File | Change | Lines | +|------|--------|-------| +| `cli.py:126` | Add `--attack-paths` flag to demo command | +3 lines | +| `cli.py:1113` | Handle demo `--attack-paths` | +5 lines | +| `cli.py:216,227,240,252,264` | Improve auth error messages | 5 lines | + +### What can wait until v0.2 + +- Identity type display names (e.g. `iam_user` → `IAM User`) +- Provider badges in terminal output +- Interactive mode / TUI +- Separate docs/ folder with detailed guides +- Progress spinner during scans +- `nhinsight init` command for first-time setup + +--- + +## 10. Optional Later Improvements + +These are good ideas that don't belong in this patch: + +1. **`nhinsight init`** — interactive first-run wizard that checks which providers are configured +2. **`nhinsight explain AP-001`** — explain a specific attack path in plain English using LLM +3. **Progress indicator** — show a spinner or progress bar during long scans +4. **`--quiet` flag** — suppress everything except summary line (for CI/CD) +5. 
**`--fail-on critical`** — exit non-zero if critical findings exist (for CI gates) +6. **GitHub Actions template** — `.github/workflows/nhinsight.yml` users can copy +7. **Separate docs site** — move auth/config/risk-codes to a mkdocs or docusaurus site +8. **Terminal width detection** — adapt output formatting to terminal width +9. **Color detection** — disable ANSI when piping to file (currently always colored) +10. **Completion scripts** — bash/zsh/fish completions for CLI flags + +None of these are urgent. The 9-point patch plan above is the right next step. + +--- + +*Generated for NHInsight v0.1.0 — practical patches, no platform thinking.* diff --git a/README.md b/README.md index 36d5261..3a68a94 100644 --- a/README.md +++ b/README.md @@ -2,145 +2,39 @@ # 🔍 NHInsight -**Find and fix risky non-human identities across cloud, Kubernetes, and GitHub** - -*Open-source CLI for NHI discovery, risk analysis, and attack path detection* +**Discover risky non-human identities and privilege paths across AWS, Azure, GCP, GitHub, and Kubernetes.** [![CI](https://github.com/cvemula1/NHInsight/actions/workflows/ci.yml/badge.svg)](https://github.com/cvemula1/NHInsight/actions/workflows/ci.yml) -[![Python](https://img.shields.io/pypi/pyversions/nhinsight?logo=python&logoColor=white)](https://pypi.org/project/nhinsight/) [![PyPI](https://img.shields.io/pypi/v/nhinsight?color=blue&logo=pypi&logoColor=white)](https://pypi.org/project/nhinsight/) [![Docker](https://img.shields.io/docker/v/chvemula/nhinsight?label=docker&logo=docker&logoColor=white&sort=semver)](https://hub.docker.com/r/chvemula/nhinsight) [![License](https://img.shields.io/github/license/cvemula1/NHInsight?color=green)](LICENSE) [![GitHub stars](https://img.shields.io/github/stars/cvemula1/NHInsight?style=social)](https://github.com/cvemula1/NHInsight) -[![AWS](https://img.shields.io/badge/AWS-IAM-FF9900?logo=amazonaws&logoColor=white)](#aws) 
-[![Azure](https://img.shields.io/badge/Azure-Entra_ID-0078D4?logo=microsoftazure&logoColor=white)](#azure) -[![GCP](https://img.shields.io/badge/GCP-IAM-4285F4?logo=googlecloud&logoColor=white)](#gcp) -[![GitHub](https://img.shields.io/badge/GitHub-Org-181717?logo=github&logoColor=white)](#github) -[![Kubernetes](https://img.shields.io/badge/Kubernetes-RBAC-326CE5?logo=kubernetes&logoColor=white)](#kubernetes) - ---- - -> 🎨 **We need a logo!** If you're a designer or have ideas, open an issue with the tag `logo` — we'd love your input. See [#1 Logo Discussion](https://github.com/cvemula1/NHInsight/issues/1). - -## The Problem - -Non-human identities outnumber humans **45:1** in most orgs. They're the service accounts with admin privs created 3 years ago by someone who left, the access keys nobody rotated, the deploy keys nobody tracks. Most major cloud breaches in recent years traced back to compromised non-human identities. - -Enterprise NHI tools charge **$50K+/year**. NHInsight does it for free. - -## Installation +## Quick Start ```bash -# Core (AWS only) pip install nhinsight - -# With specific providers -pip install nhinsight[azure] # + Azure AD / Entra ID -pip install nhinsight[gcp] # + GCP IAM -pip install nhinsight[github] # + GitHub -pip install nhinsight[kubernetes] # + Kubernetes -pip install nhinsight[gcp,kubernetes] # mix and match - -# Everything (all 5 providers + AI explanations) -pip install nhinsight[all] - -# From source (development) -git clone https://github.com/cvemula1/NHInsight.git -cd NHInsight -pip install -e ".[all,dev]" -``` - -> **Note:** AWS (`boto3`) is included by default. All other providers are optional — install only what you need, or use `[all]` to get everything. - -### Docker - -```bash -# Build -docker build -t nhinsight . 
- -# Run demo -docker run --rm nhinsight demo - -# Scan AWS (pass credentials via env vars) -docker run --rm \ - -e AWS_ACCESS_KEY_ID \ - -e AWS_SECRET_ACCESS_KEY \ - -e AWS_DEFAULT_REGION \ - nhinsight scan --aws - -# Scan GCP (mount ADC credentials) -docker run --rm \ - -e GCP_PROJECT=my-project \ - -v ~/.config/gcloud:/root/.config/gcloud:ro \ - nhinsight scan --gcp - -# Scan Azure -docker run --rm \ - -e AZURE_TENANT_ID \ - -e AZURE_CLIENT_ID \ - -e AZURE_CLIENT_SECRET \ - -e AZURE_SUBSCRIPTION_ID \ - nhinsight scan --azure - -# Scan Kubernetes (mount kubeconfig) -docker run --rm \ - -v ~/.kube/config:/root/.kube/config:ro \ - nhinsight scan --k8s - -# Scan GitHub -docker run --rm \ - -e GITHUB_TOKEN \ - nhinsight scan --github --github-org acme-corp - -# Multi-provider + JSON output -docker run --rm \ - -e AWS_ACCESS_KEY_ID \ - -e AWS_SECRET_ACCESS_KEY \ - -e GCP_PROJECT=my-project \ - -v ~/.config/gcloud:/root/.config/gcloud:ro \ - nhinsight scan --aws --gcp --attack-paths -f json +nhinsight demo ``` -## Quick Start +Scan a real environment: ```bash -# See a demo with sample data (no credentials needed) -nhinsight demo - -# Scan your AWS account nhinsight scan --aws - -# Scan multiple providers at once -nhinsight scan --aws --gcp --k8s --attack-paths - -# Scan everything available nhinsight scan --all --attack-paths - -# AI-powered explanations -export OPENAI_API_KEY=sk-... 
-nhinsight scan --aws --explain - -# Output as JSON or SARIF (for GitHub Security tab) -nhinsight scan --aws --format json -nhinsight scan --all --format sarif -o results.sarif ``` -## Demo Output +Or use Docker: +```bash +docker run --rm chvemula/nhinsight demo ``` - NHInsight — Non-Human Identity Report (demo) - ┌──────────────────────────────────────────────────────────┐ - │ AWS IAM — Account: 123456789012 │ - │ Azure AD — Tenant: acme-corp.onmicrosoft.com │ - │ GCP IAM — Project: my-project │ - │ GitHub — Org: acme-corp │ - │ Kubernetes — Cluster: prod-cluster │ - └──────────────────────────────────────────────────────────┘ +## Example Output +``` 🔴 CRITICAL — deploy-bot (iam_user, aws) │ Has AdministratorAccess policy attached @@ -150,115 +44,98 @@ nhinsight scan --all --format sarif -o results.sarif 🔴 CRITICAL — aks-cluster-sp (azure_sp, azure) │ SP has Contributor at subscription scope - 🔴 HIGH — terraform-deployer/key:abc123de (gcp_sa_key, gcp) + 🟠 HIGH — terraform-deployer/key:abc123de (gcp_sa_key, gcp) │ SA key is 400 days old (max 365) - ──────────────────────────────────────────────────────────── - Summary: 25+ NHIs across 5 providers + Summary: 25+ risky non-human identities across 5 providers ``` -## Providers - -| Provider | Status | What It Scans | -|----------|--------|---------------| -| **AWS IAM** | ✅ | Users, roles, access keys, policies, MFA, console access, trust relationships | -| **Azure AD / Entra ID** | ✅ | Service principals, managed identities, app secrets/certs, RBAC role assignments | -| **GCP IAM** | ✅ | Service accounts, SA keys (user-managed), project IAM bindings | -| **GitHub** | ✅ | Apps, deploy keys, org webhooks, repo webhooks, permissions | -| **Kubernetes** | ✅ | ServiceAccounts, RBAC, Secrets, Deployments, IRSA/WI annotations | - ## What It Finds -**34 risk codes** across 6 categories: - -### AWS +- Overprivileged service accounts and roles (admin, owner, contributor) +- Stale or unrotated credentials (access keys, SA keys, 
app secrets) +- Wildcard trust relationships and open role assumptions +- Dangerous Kubernetes service account bindings (cluster-admin, legacy tokens) +- Risky GitHub deploy keys, app permissions, and admin-scoped tokens +- Cross-cloud attack paths from entry points to privileged resources -| Risk | Code | Severity | -|------|------|----------| -| Admin/PowerUser policy attached | `AWS_ADMIN_ACCESS` | Critical | -| Role trust allows any principal (`*`) | `AWS_WILDCARD_TRUST` | Critical | -| Access key never rotated (>365 days) | `AWS_KEY_NOT_ROTATED` | High | -| Console access without MFA | `AWS_NO_MFA` | High | -| Inactive key not deleted | `AWS_KEY_INACTIVE` | Medium | +**34 risk checks** across 5 providers. [See all risk codes](#risk-codes). -### Azure +## Supported Providers -| Risk | Code | Severity | -|------|------|----------| -| SP/MI with Owner/Contributor at subscription scope | `AZURE_SP_DANGEROUS_ROLE` | Critical | -| Disabled SP still has RBAC bindings | `AZURE_SP_DISABLED_WITH_ROLES` | Medium | -| App credential expired | `AZURE_CRED_EXPIRED` | High | -| App credential expiring within 30 days | `AZURE_CRED_EXPIRING_SOON` | Medium | -| Secret not rotated (>365 days) | `AZURE_SECRET_NOT_ROTATED` | High | +- **AWS** — IAM users, roles, access keys, policies, MFA, trust relationships +- **Azure** — Service principals, managed identities, app secrets/certs, RBAC +- **GCP** — Service accounts, SA keys, project IAM bindings +- **GitHub** — Apps, deploy keys, webhooks, permissions +- **Kubernetes** — ServiceAccounts, RBAC, Secrets, IRSA/Workload Identity -### GCP +## Key Capabilities -| Risk | Code | Severity | -|------|------|----------| -| SA with roles/owner or roles/editor | `GCP_SA_DANGEROUS_ROLE` | Critical | -| SA with compute.admin, storage.admin, etc. 
| `GCP_SA_DANGEROUS_ROLE` | High | -| Disabled SA still has IAM bindings | `GCP_SA_DISABLED_WITH_ROLES` | Medium | -| GCP-managed SA with dangerous roles | `GCP_MANAGED_SA_OVERPRIVILEGED` | High | -| SA key not rotated (>365 days) | `GCP_KEY_NOT_ROTATED` | High | -| SA key expired | `GCP_KEY_EXPIRED` | High | -| SA key expiring within 30 days | `GCP_KEY_EXPIRING_SOON` | Medium | - -### Kubernetes - -| Risk | Code | Severity | -|------|------|----------| -| SA bound to cluster-admin | `K8S_CLUSTER_ADMIN` | Critical | -| Legacy long-lived SA token secret | `K8S_LEGACY_SA_TOKEN` | High | -| Automount token on privileged SA | `K8S_AUTOMOUNT_PRIVILEGED` | High | -| Default SA in use / Orphaned SA / No WI | `K8S_*` | Medium | +- **Attack path analysis** — cross-cloud identity chains with blast radius scoring +- **NIST SP 800-53 scoring** — compliance mapping with letter grades +- **IGA governance scores** — ownership, rotation, least-privilege hygiene +- **AI explanations** — optional OpenAI-powered risk summaries (`--explain`) +- **SARIF output** — plug into GitHub Security tab or CI/CD (`-f sarif`) +- **Zero agents** — read-only API calls, nothing installed in your infra -### GitHub +## Install Options -| Risk | Code | Severity | -|------|------|----------| -| Token with admin scope | `GH_ADMIN_SCOPE` | High | -| App with dangerous write perms | `GH_APP_DANGEROUS_PERMS` | High | -| Deploy key with write access | `GH_DEPLOY_KEY_WRITE` | Medium | +```bash +pip install nhinsight # Core (AWS included by default) +pip install nhinsight[all] # All 5 providers + AI explanations +pip install nhinsight[azure] # Just Azure +pip install nhinsight[gcp,k8s] # Mix and match +``` -### Universal + +<details>
+<summary>Docker examples</summary> -| Risk | Code | Severity | -|------|------|----------| -| Identity unused for 90+ days | `STALE_IDENTITY` | Medium | -| No owner or creator identified | `NO_OWNER` | Low | +```bash +# Scan AWS +docker run --rm -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \ + chvemula/nhinsight scan --aws -## Features +# Scan Azure +docker run --rm \ + -e AZURE_TENANT_ID -e AZURE_CLIENT_ID \ + -e AZURE_CLIENT_SECRET -e AZURE_SUBSCRIPTION_ID \ + chvemula/nhinsight scan --azure -- **5 providers** — AWS, Azure, GCP, GitHub, Kubernetes -- **34 risk checks** — overprivileged, stale, unrotated, ownerless, misconfigured -- **Identity graph** — maps relationships between identities across providers -- **Attack path analysis** — traces entry points to privileged resources, including cross-system chains -- **NIST SP 800-53 scoring** — maps findings to NIST controls, letter grades -- **IGA governance scores** — ownership, rotation, least-privilege, lifecycle hygiene -- **Human vs machine classification** — rule-based, no ML required -- **AI explanations** — optional OpenAI-powered plain-English risk summaries -- **SARIF output** — plug into GitHub Security tab or any SAST tool -- **Zero agents** — API reads only, installs nothing in your infra -- **Runs locally** — no cloud dependency, no telemetry, no phone-home +# Scan GCP +docker run --rm -e GCP_PROJECT=my-project \ + -v ~/.config/gcloud:/root/.config/gcloud:ro \ + chvemula/nhinsight scan --gcp -## Attack Path Analysis +# Scan Kubernetes +docker run --rm -v ~/.kube/config:/root/.kube/config:ro \ + chvemula/nhinsight scan --k8s -Discover chains of identities and permissions that lead to privileged resources: +# Scan GitHub +docker run --rm -e GITHUB_TOKEN \ + chvemula/nhinsight scan --github --github-org acme-corp -```bash -nhinsight scan --aws --k8s --gcp --attack-paths +# Multi-provider + JSON +docker run --rm -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \ + -e GCP_PROJECT=my-project -v ~/.config/gcloud:/root/.config/gcloud:ro \ + chvemula/nhinsight scan --aws --gcp --attack-paths -f json ``` -NHInsight builds an identity graph from scan results and traces paths from entry points (keys, tokens, SAs) to privileged targets (admin roles, owner bindings, cluster-admin): - -- **Cross-system paths** — K8s SA → IRSA → AWS admin role, K8s SA → GKE WI → GCP owner -- **Blast radius scoring** — 0–100 composite score based on privilege level, cross-system reach, path length -- **Severity** — Critical / High / Medium / Low based on blast radius -- **Fix guidance** — per-edge remediation recommendations +</details>
+ ## Authentication -NHInsight only needs **read-only** access. It never modifies anything. Each provider uses its standard SDK credential chain — no custom auth, no agents. +NHInsight uses **read-only** access via each provider's standard SDK credentials. No agents, no custom auth. + +| Provider | Quick Auth | +|----------|-----------| +| **AWS** | `aws configure` or env vars or instance role | +| **Azure** | `az login` or service principal env vars | +| **GCP** | `gcloud auth application-default login` or SA key | +| **GitHub** | `export GITHUB_TOKEN=ghp_...` | +| **Kubernetes** | Uses `~/.kube/config` current context | + +<details>
+<summary>Detailed auth setup per provider</summary> ### AWS @@ -362,23 +239,126 @@ nhinsight scan --k8s nhinsight scan --k8s --kube-context prod --kube-namespace payments ``` -### Multi-Provider +</details>
+ +## Attack Path Analysis -Combine any providers in a single scan: +NHInsight builds an identity graph and traces paths from entry points (keys, tokens, SAs) to privileged targets (admin roles, owner bindings, cluster-admin): ```bash -# Scan AWS + GCP + K8s with attack path analysis -nhinsight scan --aws --gcp --k8s --attack-paths +nhinsight scan --aws --k8s --gcp --attack-paths +``` -# Scan everything available -nhinsight scan --all --attack-paths +Example chains NHInsight detects: + +- **K8s → AWS** — ServiceAccount → IRSA role → IAM role with AdministratorAccess +- **K8s → GCP** — ServiceAccount → Workload Identity → SA with roles/owner +- **GitHub → AWS** — Deploy key → workflow → OIDC → IAM role with S3FullAccess + +Each path includes: +- **Blast radius scoring** — 0–100 composite based on privilege level and cross-system reach +- **Fix guidance** — per-edge remediation recommendations -# Output to SARIF for GitHub Security tab -nhinsight scan --all -f sarif -o results.sarif +### Mermaid Diagrams + +Generate copy-pasteable Mermaid diagrams for PRs, docs, and reviews: + +```bash +# Mermaid output alongside terminal results +nhinsight scan --aws --k8s --mermaid + +# Demo with Mermaid diagrams +nhinsight demo --mermaid + +# Render from saved JSON (for CI pipelines) +nhinsight scan --all --attack-paths -f json -o findings.json +nhinsight graph --input findings.json +nhinsight graph --input findings.json --split # one diagram per path ``` +Example output (paste into any Mermaid-compatible renderer — GitHub, Notion, VS Code): + +```mermaid +flowchart LR + subgraph Kubernetes + sa["prod/deploy-sa"] + end + subgraph AWS + role{{"eks-admin-role"}} + end + sa -->|"IRSA → eks-admin-role"| role + style sa fill:#326CE5,stroke:#1a3a6e,color:#fff + style role fill:#FF9900,stroke:#232F3E,color:#232F3E +``` + +## Risk Codes + +<details>
+<summary>All 34 risk codes by provider</summary> + +### AWS + +| Risk | Code | Severity | +|------|------|----------| +| Admin/PowerUser policy attached | `AWS_ADMIN_ACCESS` | Critical | +| Role trust allows any principal (`*`) | `AWS_WILDCARD_TRUST` | Critical | +| Access key never rotated (>365 days) | `AWS_KEY_NOT_ROTATED` | High | +| Console access without MFA | `AWS_NO_MFA` | High | +| Inactive key not deleted | `AWS_KEY_INACTIVE` | Medium | + +### Azure + +| Risk | Code | Severity | +|------|------|----------| +| SP/MI with Owner/Contributor at subscription scope | `AZURE_SP_DANGEROUS_ROLE` | Critical | +| Disabled SP still has RBAC bindings | `AZURE_SP_DISABLED_WITH_ROLES` | Medium | +| App credential expired | `AZURE_CRED_EXPIRED` | High | +| App credential expiring within 30 days | `AZURE_CRED_EXPIRING_SOON` | Medium | +| Secret not rotated (>365 days) | `AZURE_SECRET_NOT_ROTATED` | High | + +### GCP + +| Risk | Code | Severity | +|------|------|----------| +| SA with roles/owner or roles/editor | `GCP_SA_DANGEROUS_ROLE` | Critical | +| SA with compute.admin, storage.admin, etc. | `GCP_SA_DANGEROUS_ROLE` | High | +| Disabled SA still has IAM bindings | `GCP_SA_DISABLED_WITH_ROLES` | Medium | +| GCP-managed SA with dangerous roles | `GCP_MANAGED_SA_OVERPRIVILEGED` | High | +| SA key not rotated (>365 days) | `GCP_KEY_NOT_ROTATED` | High | +| SA key expired | `GCP_KEY_EXPIRED` | High | +| SA key expiring within 30 days | `GCP_KEY_EXPIRING_SOON` | Medium | + +### Kubernetes + +| Risk | Code | Severity | +|------|------|----------| +| SA bound to cluster-admin | `K8S_CLUSTER_ADMIN` | Critical | +| Legacy long-lived SA token secret | `K8S_LEGACY_SA_TOKEN` | High | +| Automount token on privileged SA | `K8S_AUTOMOUNT_PRIVILEGED` | High | +| Default SA in use / Orphaned SA / No WI | `K8S_*` | Medium | + +### GitHub + +| Risk | Code | Severity | +|------|------|----------| +| Token with admin scope | `GH_ADMIN_SCOPE` | High | +| App with dangerous write perms | `GH_APP_DANGEROUS_PERMS` | High | +| Deploy key with write access | `GH_DEPLOY_KEY_WRITE` | Medium | + +### Universal + +| Risk | Code | Severity | +|------|------|----------| +| Identity unused for 90+ days | `STALE_IDENTITY` | Medium | +| No owner or creator identified | `NO_OWNER` | Low | + +</details>
+ ## Configuration +
+Environment variables and CLI flags + All settings can be set via environment variables, CLI flags, or both (CLI flags take precedence): | Setting | Env Var | CLI Flag | Default | @@ -399,8 +379,13 @@ All settings can be set via environment variables, CLI flags, or both (CLI flags See [.env.example](.env.example) for a ready-to-copy template. +
+ ## CLI Reference +
+Full CLI flags + ``` nhinsight scan [OPTIONS] Discover and analyze NHIs --aws Scan AWS IAM @@ -429,26 +414,20 @@ nhinsight demo Show demo scan with sample data nhinsight version Show version ``` +
+ ## Development ```bash git clone https://github.com/cvemula1/NHInsight.git cd NHInsight - -# Install with all providers + dev tools -make dev -# or: pip install -e ".[all,dev]" - -# Run tests (151 tests, <1 second) -make test - -# Lint -make lint - -# Run demo (no credentials needed) -make demo +pip install -e ".[all,dev]" +make test # 151 tests, <1 second ``` +
+Makefile targets and architecture + ### Makefile targets | Target | What It Does | @@ -465,7 +444,7 @@ make demo | `make docker-demo` | Run demo in Docker | | `make clean` | Remove build artifacts | -## Architecture +### Architecture ``` nhinsight/ @@ -491,55 +470,19 @@ nhinsight/ └── llm.py # Optional LLM explanations (OpenAI) ``` +
+ ## Roadmap -### v0.1 — Core (shipped) - -- [x] AWS IAM provider -- [x] Azure AD / Entra ID provider -- [x] GCP IAM provider -- [x] GitHub provider -- [x] Kubernetes provider -- [x] Risk analysis (34 checks across 5 providers) -- [x] Human vs machine classification -- [x] NIST SP 800-53 compliance scoring -- [x] IGA governance scoring -- [x] Identity graph + attack path analysis -- [x] LLM explanation layer (OpenAI) -- [x] SARIF output for CI/CD -- [x] Docker support - -### v0.2 — Policy & Intelligence - -- [ ] OPA/Rego policy engine — define custom rules for your org -- [ ] ML-based classification (scikit-learn) — auto-classify human vs machine -- [ ] Anomaly detection (Isolation Forest) — flag unusual identity behavior -- [ ] IAM right-sizing recommendations (LLM + CloudTrail/audit logs) - -### v0.3 — Integrations & Alerting - -- [ ] Slack notifications — send findings to a channel on scan completion -- [ ] Microsoft Teams alerts — webhook-based alerts for critical findings -- [ ] Jira / ServiceNow ticket creation — auto-create tickets for high/critical risks -- [ ] PagerDuty integration — trigger incidents for critical attack paths -- [ ] Webhook support — generic HTTP webhook for custom integrations - -### v0.4 — SOC & Continuous Monitoring - -- [ ] SIEM export (Splunk, Elastic, Sentinel) — ship findings to your SIEM -- [ ] Scheduled scans — cron-based continuous NHI monitoring -- [ ] Drift detection — alert when new NHIs appear or risk scores change -- [ ] Dashboard API — REST API for building custom dashboards -- [ ] GitHub Actions + GitLab CI templates — scan on every PR/merge - -### v0.5 — Auto-Remediation & AI Agent - -- [ ] Auto-rotate credentials — rotate AWS keys, GCP SA keys, Azure secrets with zero downtime -- [ ] Least-privilege policy generation — analyze CloudTrail/audit logs, propose right-sized IAM policies -- [ ] AI remediation agent — agent proposes fix plan, human approves, agent executes and verifies -- [ ] Stale identity cleanup — 
auto-disable unused identities after configurable grace period -- [ ] PR-based remediation — open pull requests with Terraform/IaC changes for IAM fixes -- [ ] Rollback safety — automatic rollback if a remediation breaks health checks +- [x] **v0.1** — 5 providers, 34 risk checks, attack paths, NIST scoring, SARIF, AI explanations, Docker +- [ ] **v0.2** — OPA/Rego policies, ML classification, anomaly detection, IAM right-sizing +- [ ] **v0.3** — Slack, Teams, Jira, PagerDuty, webhook integrations +- [ ] **v0.4** — SIEM export, scheduled scans, drift detection, dashboard API +- [ ] **v0.5** — Auto-remediation, least-privilege generation, AI agent, PR-based fixes + +## Why NHInsight? + +Non-human identities outnumber humans **45:1** in most orgs. Enterprise NHI tools charge **$50K+/year**. NHInsight does it for free — open source, runs locally, no telemetry. ## Contributing diff --git a/nhinsight/cli.py b/nhinsight/cli.py index 13d61c2..c32ecd1 100644 --- a/nhinsight/cli.py +++ b/nhinsight/cli.py @@ -108,7 +108,9 @@ def _build_parser() -> argparse.ArgumentParser: # Analysis options analysis_group = scan_p.add_argument_group("analysis") analysis_group.add_argument("--attack-paths", action="store_true", - help="Run identity attack path analysis") + help="Trace privilege escalation chains across providers (e.g. 
K8s SA → cloud admin)") + analysis_group.add_argument("--mermaid", action="store_true", + help="Output attack paths as Mermaid diagrams (implies --attack-paths)") analysis_group.add_argument("--stale-days", type=int, default=90, metavar="N", help="Days without use before flagging as stale (default: 90)") analysis_group.add_argument("--explain", action="store_true", @@ -133,6 +135,10 @@ def _build_parser() -> argparse.ArgumentParser: choices=["table", "json", "sarif", "markdown", "md"], default="table", help="Output format (default: table)") demo_p.add_argument("--output", "-o", metavar="FILE", help="Write output to file") + demo_p.add_argument("--attack-paths", action="store_true", + help="Include attack path analysis in demo output") + demo_p.add_argument("--mermaid", action="store_true", + help="Output attack paths as Mermaid diagrams (implies --attack-paths)") # ── report command ───────────────────────────────────────────── report_p = sub.add_parser( @@ -147,6 +153,20 @@ def _build_parser() -> argparse.ArgumentParser: default="markdown", help="Report format (default: markdown)") report_p.add_argument("--output", "-o", metavar="FILE", help="Write report to file") + # ── graph command ────────────────────────────────────────────── + graph_p = sub.add_parser( + "graph", + help="Render attack path graphs from saved JSON output", + description="Read NHInsight JSON output and render attack path diagrams. 
" + "Useful for generating Mermaid diagrams from previously saved scans.", + ) + graph_p.add_argument("--input", "-i", metavar="FILE", required=True, + help="Path to NHInsight JSON output file") + graph_p.add_argument("--output", "-o", metavar="FILE", + help="Write Mermaid output to file (default: stdout)") + graph_p.add_argument("--split", action="store_true", + help="Render each attack path as a separate diagram") + # ── version command ──────────────────────────────────────────── sub.add_parser("version", help="Show version") @@ -200,7 +220,13 @@ def _run_scan(args: argparse.Namespace) -> None: providers.append("k8s") if not providers: - print("No providers specified. Use --aws, --azure, --gcp, --github, --k8s, or --all") + print("\n No providers selected.\n") + print(" \033[1mQuick examples:\033[0m") + print(" nhinsight scan --aws Scan AWS IAM") + print(" nhinsight scan --all --attack-paths Scan everything") + print(" nhinsight demo Try with sample data first") + print() + print(" Providers: --aws --azure --gcp --github --k8s --all\n") sys.exit(1) # Collect identities from each provider @@ -295,13 +321,20 @@ def _run_scan(args: argparse.Namespace) -> None: print_result(result, fmt=args.format, out=out) # Attack path analysis (if requested) - if getattr(args, "attack_paths", False) and all_identities: + # --mermaid implies --attack-paths + wants_attack = getattr(args, "attack_paths", False) or getattr(args, "mermaid", False) + if wants_attack and all_identities: from nhinsight.analyzers.attack_paths import analyze_attack_paths from nhinsight.core.output import print_attack_paths ap_result = analyze_attack_paths(all_identities) print_attack_paths(ap_result, out=out) + if getattr(args, "mermaid", False): + from nhinsight.core.mermaid import render_attack_paths, render_summary_table + render_summary_table(ap_result, out=out) + render_attack_paths(ap_result, out=out) + if args.output: out.close() print(f"Results written to {args.output}") @@ -1091,6 +1124,96 @@ def 
_print_demo_table(result: ScanResult) -> None: print(f" {i}. {line}") print(f" {BOLD}{'─' * 60}{RESET}\n") + # Post-demo suggestions + print(f" {BOLD}Try it on your infrastructure:{RESET}") + print(" nhinsight scan --aws Scan AWS IAM") + print(" nhinsight scan --all Scan all providers") + print(" nhinsight scan --aws --explain AI-powered explanations") + print(" nhinsight scan --all -f sarif SARIF for GitHub Security tab") + print() + + +def _run_graph(args: argparse.Namespace) -> None: + """Load saved JSON output and render Mermaid attack path diagrams.""" + import json + + from nhinsight.analyzers.attack_paths import ( + AttackPath, + AttackPathResult, + AttackPathStep, + ) + from nhinsight.core.mermaid import ( + render_attack_paths, + render_attack_paths_individual, + render_summary_table, + ) + from nhinsight.core.models import Severity + + input_path = args.input + try: + with open(input_path) as f: + data = json.load(f) + except FileNotFoundError: + print(f"Error: file not found: {input_path}") + sys.exit(1) + except json.JSONDecodeError as e: + print(f"Error: invalid JSON in {input_path}: {e}") + sys.exit(1) + + # Reconstruct AttackPathResult from JSON + # Support both full scan output (with "attack_paths" key) and direct AP output + ap_data = data if "paths" in data else data.get("attack_paths", data) + + if "paths" not in ap_data: + print("Error: no attack path data found in JSON. 
" + "Run scan with --attack-paths -f json first.") + sys.exit(1) + + sev_map = {s.value: s for s in Severity} + + paths = [] + for p in ap_data["paths"]: + steps = [ + AttackPathStep( + node_id=s["node_id"], + node_label=s["node_label"], + node_type=s["node_type"], + provider=s["provider"], + edge_type=s.get("edge_type"), + edge_label=s.get("edge_label", ""), + ) + for s in p.get("steps", []) + ] + paths.append(AttackPath( + id=p.get("id", "AP-???"), + steps=steps, + severity=sev_map.get(p.get("severity", "medium"), Severity.MEDIUM), + blast_radius=float(p.get("blast_radius", 0)), + cross_system=p.get("cross_system", False), + description=p.get("description", ""), + recommendation=p.get("recommendation", ""), + )) + + ap_result = AttackPathResult( + paths=paths, + graph_stats=ap_data.get("graph", {}), + ) + + out = sys.stdout + if args.output: + out = open(args.output, "w") + + render_summary_table(ap_result, out=out) + + if getattr(args, "split", False): + render_attack_paths_individual(ap_result, out=out) + else: + render_attack_paths(ap_result, out=out) + + if args.output: + out.close() + print(f"Mermaid output written to {args.output}") + def _output_result(result: ScanResult, fmt: str, output_path: str | None) -> None: """Output a ScanResult in the requested format, optionally to a file.""" @@ -1118,6 +1241,25 @@ def main(): _print_demo_table(result) else: _output_result(result, fmt, output_path) + # Attack path analysis for demo (--attack-paths or --mermaid) + wants_attack = getattr(args, "attack_paths", False) or getattr(args, "mermaid", False) + if wants_attack: + from nhinsight.analyzers.attack_paths import analyze_attack_paths + ap_result = analyze_attack_paths(result.identities) + out = sys.stdout + if output_path: + out = open(output_path, "a") + if not getattr(args, "mermaid", False): + from nhinsight.core.output import print_attack_paths + print_attack_paths(ap_result, out=out) + if getattr(args, "mermaid", False): + from nhinsight.core.mermaid import 
render_attack_paths, render_summary_table + render_summary_table(ap_result, out=out) + render_attack_paths(ap_result, out=out) + if output_path: + out.close() + elif args.command == "graph": + _run_graph(args) elif args.command == "report": if getattr(args, "demo", False): result = _build_demo_data() @@ -1131,6 +1273,10 @@ def main(): print(f"nhinsight {__version__}") else: parser.print_help() + print("\n \033[1mQuick start:\033[0m") + print(" nhinsight demo # sample data, no credentials needed") + print(" nhinsight scan --aws # scan your AWS account") + print() if __name__ == "__main__": diff --git a/nhinsight/core/mermaid.py b/nhinsight/core/mermaid.py new file mode 100644 index 0000000..aee1982 --- /dev/null +++ b/nhinsight/core/mermaid.py @@ -0,0 +1,211 @@ +# MIT License — Copyright (c) 2026 cvemula1 +# Mermaid diagram renderer for NHInsight attack path analysis + +from __future__ import annotations + +import re +import sys +from typing import TextIO + +from nhinsight.core.models import Severity + +# Provider → Mermaid CSS class color +PROVIDER_STYLES = { + "aws": "fill:#FF9900,stroke:#232F3E,color:#232F3E", + "azure": "fill:#0078D4,stroke:#002050,color:#fff", + "gcp": "fill:#4285F4,stroke:#174EA6,color:#fff", + "github": "fill:#6e40c9,stroke:#3b1f6e,color:#fff", + "kubernetes": "fill:#326CE5,stroke:#1a3a6e,color:#fff", +} + +SEVERITY_STYLES = { + Severity.CRITICAL: "fill:#d32f2f,stroke:#b71c1c,color:#fff", + Severity.HIGH: "fill:#e65100,stroke:#bf360c,color:#fff", + Severity.MEDIUM: "fill:#f9a825,stroke:#f57f17,color:#000", + Severity.LOW: "fill:#1565c0,stroke:#0d47a1,color:#fff", + Severity.INFO: "fill:#2e7d32,stroke:#1b5e20,color:#fff", +} + +SEVERITY_LABELS = { + Severity.CRITICAL: "🔴 CRITICAL", + Severity.HIGH: "🟠 HIGH", + Severity.MEDIUM: "🟡 MEDIUM", + Severity.LOW: "🔵 LOW", + Severity.INFO: "🟢 INFO", +} + + +def _sanitize_id(raw_id: str) -> str: + """Convert a node ID into a Mermaid-safe identifier.""" + return re.sub(r"[^a-zA-Z0-9_]", "_", raw_id) + + 
+def _sanitize_label(label: str) -> str: + """Escape characters that break Mermaid labels.""" + return label.replace('"', "'").replace("<", "‹").replace(">", "›") + + +def render_attack_paths(ap_result, out: TextIO = sys.stdout) -> None: + """Render attack path results as a Mermaid flowchart. + + Produces a single ``flowchart LR`` diagram with all discovered paths. + Nodes are colored by provider; edges carry relationship labels. + """ + paths = ap_result.paths + if not paths: + out.write("```mermaid\nflowchart LR\n NO_PATHS[\"✅ No privilege escalation paths found\"]\n```\n") + return + + lines: list[str] = [] + lines.append("```mermaid") + lines.append("flowchart LR") + + # Collect unique nodes and edges across all paths + seen_nodes: dict[str, tuple[str, str, str]] = {} # sanitized_id → (label, provider, node_type) + seen_edges: list[tuple[str, str, str]] = [] # (src, dst, label) + edge_set: set[str] = set() + path_node_sets: list[tuple[str, list[str]]] = [] # (path_id, [sanitized_ids]) + + for path in paths: + path_nodes = [] + for i, step in enumerate(path.steps): + sid = _sanitize_id(step.node_id) + if sid not in seen_nodes: + seen_nodes[sid] = (step.node_label, step.provider, step.node_type) + path_nodes.append(sid) + + if i > 0: + prev_sid = _sanitize_id(path.steps[i - 1].node_id) + edge_label = step.edge_label or "" + edge_key = f"{prev_sid}→{sid}" + if edge_key not in edge_set: + edge_set.add(edge_key) + seen_edges.append((prev_sid, sid, edge_label)) + + path_node_sets.append((path.id, path_nodes)) + + # Group nodes by provider for subgraphs + by_provider: dict[str, list[str]] = {} + for sid, (label, provider, node_type) in seen_nodes.items(): + by_provider.setdefault(provider, []).append(sid) + + # Emit subgraphs per provider + for provider, node_ids in sorted(by_provider.items()): + provider_label = { + "aws": "AWS", "azure": "Azure", "gcp": "GCP", + "github": "GitHub", "kubernetes": "Kubernetes", + }.get(provider, provider.upper()) + + lines.append(f" 
subgraph {provider_label}") + for sid in node_ids: + label, _, node_type = seen_nodes[sid] + safe_label = _sanitize_label(label) + # Use different shapes: rounded for identities, hexagon for privileged roles + if "rbac" in node_type or "iam_role" in node_type or "permissions" in node_type: + lines.append(f' {sid}{{{{{safe_label}}}}}') + else: + lines.append(f' {sid}["{safe_label}"]') + lines.append(" end") + + # Emit edges + for src, dst, label in seen_edges: + safe_label = _sanitize_label(label) + if safe_label: + lines.append(f' {src} -->|"{safe_label}"| {dst}') + else: + lines.append(f" {src} --> {dst}") + + # Emit style classes per provider + for provider, node_ids in by_provider.items(): + style = PROVIDER_STYLES.get(provider, "fill:#999,stroke:#666,color:#fff") + for sid in node_ids: + lines.append(f" style {sid} {style}") + + lines.append("```") + + out.write("\n".join(lines)) + out.write("\n") + + +def render_attack_paths_individual(ap_result, out: TextIO = sys.stdout) -> None: + """Render each attack path as its own small Mermaid diagram. + + Useful for PR comments where you want one diagram per finding. + """ + paths = ap_result.paths + if not paths: + out.write("```mermaid\nflowchart LR\n OK[\"✅ No privilege escalation paths found\"]\n```\n") + return + + for path in paths: + sev_label = SEVERITY_LABELS.get(path.severity, path.severity.value.upper()) + out.write(f"\n**{path.id}** — {sev_label} — risk {path.blast_radius:.0f}/100") + if path.cross_system: + out.write(" ⚡ cross-system") + out.write("\n\n") + + lines: list[str] = [] + lines.append("```mermaid") + lines.append("flowchart LR") + + for i, step in enumerate(path.steps): + sid = _sanitize_id(step.node_id) + safe_label = _sanitize_label(step.node_label) + prov = step.provider + + # Shape by node type + if "rbac" in step.node_type or "iam_role" in step.node_type or "permissions" in step.node_type: + lines.append(f' {sid}{{{{{safe_label}
<br/>{prov}}}}}') + else: + lines.append(f' {sid}["{safe_label}<br/>
{prov}"]') + + # Style + style = PROVIDER_STYLES.get(prov, "fill:#999,stroke:#666,color:#fff") + lines.append(f" style {sid} {style}") + + # Edge to next + if i > 0: + prev_sid = _sanitize_id(path.steps[i - 1].node_id) + edge_label = _sanitize_label(step.edge_label or "") + if edge_label: + lines.append(f' {prev_sid} -->|"{edge_label}"| {sid}') + else: + lines.append(f" {prev_sid} --> {sid}") + + lines.append("```") + out.write("\n".join(lines)) + out.write("\n") + + if path.recommendation: + out.write(f"\n> 💡 {path.recommendation[:200]}\n") + + +def render_summary_table(ap_result, out: TextIO = sys.stdout) -> None: + """Render a markdown summary table of attack paths.""" + paths = ap_result.paths + stats = ap_result.graph_stats + + out.write("## Privilege Escalation Paths\n\n") + out.write(f"Graph: {stats.get('nodes', 0)} nodes, ") + out.write(f"{stats.get('edges', 0)} edges, ") + out.write(f"{stats.get('entry_points', 0)} entry points, ") + out.write(f"{stats.get('privileged_nodes', 0)} privileged\n\n") + + if not paths: + out.write("✅ No privilege escalation paths found.\n") + return + + out.write("| Path | Severity | Risk | Entry → Target | Providers | Fix |\n") + out.write("|------|----------|------|----------------|-----------|-----|\n") + + for path in paths: + sev = path.severity.value.upper() + risk = f"{path.blast_radius:.0f}/100" + entry = path.steps[0].node_label if path.steps else "?" + target = path.steps[-1].node_label if path.steps else "?" 
+ provs = " → ".join(path.providers_involved) + rec = (path.recommendation or "")[:80] + cross = " ⚡" if path.cross_system else "" + out.write(f"| {path.id} | {sev} | {risk} | {entry} → {target}{cross} | {provs} | {rec} |\n") + + out.write("\n") diff --git a/nhinsight/core/output.py b/nhinsight/core/output.py index ae7ae9c..ec18cad 100644 --- a/nhinsight/core/output.py +++ b/nhinsight/core/output.py @@ -28,7 +28,7 @@ SEVERITY_ICONS = { Severity.CRITICAL: "🔴", - Severity.HIGH: "🔴", + Severity.HIGH: "🟠", Severity.MEDIUM: "🟡", Severity.LOW: "🔵", Severity.INFO: "🟢", @@ -368,7 +368,7 @@ def print_attack_paths(ap_result, out: TextIO = sys.stdout) -> None: stats = ap_result.graph_stats out.write(f"\n {'═' * 56}\n") - out.write(f" {BOLD}Identity Attack Path Analysis{RESET}\n") + out.write(f" {BOLD}Privilege Escalation Paths{RESET}\n") out.write(f" {'─' * 56}\n\n") out.write(f" Graph: {stats.get('nodes', 0)} nodes, ") @@ -395,9 +395,9 @@ def print_attack_paths(ap_result, out: TextIO = sys.stdout) -> None: color = SEVERITY_COLORS.get(sev, RESET) icon = SEVERITY_ICONS.get(sev, "⚪") - out.write(f" {color}{icon} {path.id}{RESET}") + out.write(f" {color}{icon} {path.id} — {path.description}{RESET}") out.write(f" {BOLD}{sev.value.upper()}{RESET}") - blast_str = f" blast: {path.blast_radius:.0f}/100" + blast_str = f" risk: {path.blast_radius:.0f}/100" out.write(f" {DIM}{blast_str}{RESET}") if path.cross_system: out.write(f" {CYAN}⚡ cross-system{RESET}") diff --git a/pyproject.toml b/pyproject.toml index e0a2b17..ba7fd71 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -5,7 +5,7 @@ build-backend = "setuptools.build_meta" [project] name = "nhinsight" version = "0.1.0" -description = "Non-Human Identity discovery for cloud infrastructure" +description = "Discover risky non-human identities and privilege paths across AWS, Azure, GCP, GitHub, and Kubernetes" readme = "README.md" license = {text = "MIT"} requires-python = ">=3.9" @@ -13,9 +13,11 @@ authors = [{name = "cvemula1", 
email = "cvemula1@users.noreply.github.com"}] keywords = ["security", "identity", "nhi", "aws", "gcp", "azure", "kubernetes", "github", "devsecops"] classifiers = [ "Development Status :: 3 - Alpha", + "Environment :: Console", "Intended Audience :: Developers", "Intended Audience :: System Administrators", "License :: OSI Approved :: MIT License", + "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", @@ -24,6 +26,7 @@ classifiers = [ "Programming Language :: Python :: 3.13", "Topic :: Security", "Topic :: System :: Systems Administration", + "Typing :: Typed", ] dependencies = [ "boto3>=1.28", @@ -48,7 +51,9 @@ nhinsight = "nhinsight.cli:main" [project.urls] Homepage = "https://github.com/cvemula1/NHInsight" +Documentation = "https://github.com/cvemula1/NHInsight#quick-start" Repository = "https://github.com/cvemula1/NHInsight" +Changelog = "https://github.com/cvemula1/NHInsight/releases" Issues = "https://github.com/cvemula1/NHInsight/issues" [tool.setuptools.packages.find] diff --git a/tests/test_cli.py b/tests/test_cli.py index 86223ca..6280cd9 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -40,4 +40,4 @@ def test_scan_no_provider(): capture_output=True, text=True, ) assert result.returncode == 1 - assert "No providers specified" in result.stdout + assert "No providers selected" in result.stdout diff --git a/tests/test_mermaid.py b/tests/test_mermaid.py new file mode 100644 index 0000000..b33ccb5 --- /dev/null +++ b/tests/test_mermaid.py @@ -0,0 +1,331 @@ +# MIT License — Copyright (c) 2026 cvemula1 +# Tests for Mermaid diagram renderer + +import io +import json +import subprocess +import sys +import tempfile + +from nhinsight.analyzers.attack_paths import ( + AttackPath, + AttackPathResult, + AttackPathStep, + analyze_attack_paths, +) +from nhinsight.core.mermaid import ( + render_attack_paths, + render_attack_paths_individual, + 
render_summary_table, +) +from nhinsight.core.models import ( + Identity, + IdentityType, + Provider, + Severity, +) + +# ── Helpers ──────────────────────────────────────────────────────────── + + +def _iam_user(name, policies=None, arn=""): + return Identity( + id=f"aws:iam:user:123:{name}", + name=name, + provider=Provider.AWS, + identity_type=IdentityType.IAM_USER, + arn=arn or f"arn:aws:iam::123:user/{name}", + policies=policies or [], + ) + + +def _iam_role(name, policies=None, trusted=None): + return Identity( + id=f"aws:iam:role:123:{name}", + name=name, + provider=Provider.AWS, + identity_type=IdentityType.IAM_ROLE, + arn=f"arn:aws:iam::123:role/{name}", + policies=policies or [], + raw={"trusted_principals": trusted or [], "path": "/"}, + ) + + +def _access_key(user, key_id="AKIA1234"): + return Identity( + id=f"aws:iam:key:123:{key_id}", + name=f"{user}/{key_id}", + provider=Provider.AWS, + identity_type=IdentityType.ACCESS_KEY, + raw={"parent_user": user, "key_id": key_id, "status": "Active"}, + ) + + +def _k8s_sa(ns, name, irsa_arn="", policies=None): + return Identity( + id=f"k8s:sa:ctx:{ns}:{name}", + name=f"{ns}/{name}", + provider=Provider.KUBERNETES, + identity_type=IdentityType.SERVICE_ACCOUNT, + policies=policies or [], + raw={ + "namespace": ns, + "sa_name": name, + "irsa_role_arn": irsa_arn, + "workload_identity_azure": "", + "deployments": [], + "secret_count": 0, + "automount_token": True, + }, + ) + + +def _build_simple_ap_result(): + """Build a minimal AttackPathResult for testing.""" + steps = [ + AttackPathStep( + node_id="k8s:sa:ctx:prod:deploy-sa", + node_label="prod/deploy-sa", + node_type="service_account", + provider="kubernetes", + ), + AttackPathStep( + node_id="aws:iam:role:123:eks-admin", + node_label="eks-admin", + node_type="iam_role", + provider="aws", + edge_type="irsa_maps_to", + edge_label="IRSA → eks-admin", + ), + ] + path = AttackPath( + id="AP-001", + steps=steps, + severity=Severity.CRITICAL, + blast_radius=85.0, + 
cross_system=True, + description="prod/deploy-sa → eks-admin (cross-system: kubernetes → aws)", + recommendation="Scope the IRSA role to least-privilege.", + ) + return AttackPathResult( + paths=[path], + graph_stats={"nodes": 5, "edges": 4, "entry_points": 2, "privileged_nodes": 1}, + ) + + +# ── render_attack_paths tests ────────────────────────────────────────── + + +def test_render_attack_paths_basic(): + """Basic Mermaid rendering produces valid flowchart.""" + ap_result = _build_simple_ap_result() + out = io.StringIO() + render_attack_paths(ap_result, out=out) + output = out.getvalue() + + assert "```mermaid" in output + assert "flowchart LR" in output + assert "```" in output + assert "deploy_sa" in output or "deploy-sa" in output # sanitized ID + assert "eks_admin" in output or "eks-admin" in output + + +def test_render_attack_paths_empty(): + """Empty result produces 'no paths found' diagram.""" + ap_result = AttackPathResult(paths=[], graph_stats={}) + out = io.StringIO() + render_attack_paths(ap_result, out=out) + output = out.getvalue() + + assert "```mermaid" in output + assert "No privilege escalation paths found" in output + + +def test_render_attack_paths_has_subgraphs(): + """Cross-system paths produce provider subgraphs.""" + ap_result = _build_simple_ap_result() + out = io.StringIO() + render_attack_paths(ap_result, out=out) + output = out.getvalue() + + assert "subgraph AWS" in output + assert "subgraph Kubernetes" in output + + +def test_render_attack_paths_has_styles(): + """Nodes get provider-colored styles.""" + ap_result = _build_simple_ap_result() + out = io.StringIO() + render_attack_paths(ap_result, out=out) + output = out.getvalue() + + assert "fill:#FF9900" in output # AWS orange + assert "fill:#326CE5" in output # K8s blue + + +def test_render_attack_paths_edge_labels(): + """Edges carry relationship labels.""" + ap_result = _build_simple_ap_result() + out = io.StringIO() + render_attack_paths(ap_result, out=out) + output = 
out.getvalue() + + assert "IRSA" in output + + +# ── render_attack_paths_individual tests ─────────────────────────────── + + +def test_render_individual_basic(): + """Individual rendering produces separate diagrams per path.""" + ap_result = _build_simple_ap_result() + out = io.StringIO() + render_attack_paths_individual(ap_result, out=out) + output = out.getvalue() + + assert "AP-001" in output + assert "CRITICAL" in output + assert "```mermaid" in output + assert "💡" in output # recommendation + + +def test_render_individual_empty(): + """Empty result produces 'no paths' diagram.""" + ap_result = AttackPathResult(paths=[], graph_stats={}) + out = io.StringIO() + render_attack_paths_individual(ap_result, out=out) + output = out.getvalue() + + assert "No privilege escalation paths found" in output + + +# ── render_summary_table tests ───────────────────────────────────────── + + +def test_summary_table_basic(): + """Summary table includes headers and path data.""" + ap_result = _build_simple_ap_result() + out = io.StringIO() + render_summary_table(ap_result, out=out) + output = out.getvalue() + + assert "## Privilege Escalation Paths" in output + assert "| Path |" in output + assert "AP-001" in output + assert "CRITICAL" in output + assert "85/100" in output + assert "⚡" in output # cross-system + + +def test_summary_table_empty(): + """Empty result shows no-paths message.""" + ap_result = AttackPathResult(paths=[], graph_stats={"nodes": 0, "edges": 0}) + out = io.StringIO() + render_summary_table(ap_result, out=out) + output = out.getvalue() + + assert "No privilege escalation paths found" in output + + +# ── Integration with real attack path analysis ───────────────────────── + + +def test_mermaid_from_real_analysis(): + """Render Mermaid from a real analyze_attack_paths result.""" + role = _iam_role("admin", policies=["AdministratorAccess"]) + sa = _k8s_sa("prod", "app", irsa_arn=role.arn) + + ap_result = analyze_attack_paths([role, sa]) + + out = io.StringIO() 
+ render_attack_paths(ap_result, out=out) + output = out.getvalue() + + assert "```mermaid" in output + assert "flowchart LR" in output + + +def test_mermaid_from_multi_provider(): + """Mermaid renders correctly with multiple providers.""" + role = _iam_role("eks-admin", policies=["AdministratorAccess"]) + sa = _k8s_sa("prod", "deploy", irsa_arn=role.arn) + user = _iam_user("deploy-bot", arn="arn:aws:iam::123:user/deploy-bot") + key = _access_key("deploy-bot", "AKIA5678") + + ap_result = analyze_attack_paths([role, sa, user, key]) + + out = io.StringIO() + render_attack_paths(ap_result, out=out) + output = out.getvalue() + + assert "```mermaid" in output + # Should have at least one path + if ap_result.paths: + assert "subgraph" in output + + +# ── CLI integration tests ────────────────────────────────────────────── + + +def test_demo_mermaid_flag(): + """nhinsight demo --mermaid produces Mermaid output.""" + result = subprocess.run( + [sys.executable, "-m", "nhinsight.cli", "demo", "--mermaid"], + capture_output=True, text=True, + ) + assert result.returncode == 0 + assert "```mermaid" in result.stdout or "Privilege Escalation Paths" in result.stdout + + +def test_demo_attack_paths_flag(): + """nhinsight demo --attack-paths produces attack path output.""" + result = subprocess.run( + [sys.executable, "-m", "nhinsight.cli", "demo", "--attack-paths"], + capture_output=True, text=True, + ) + assert result.returncode == 0 + assert "Privilege Escalation Paths" in result.stdout + + +def test_graph_command_with_json(): + """nhinsight graph --input file.json renders Mermaid from saved JSON.""" + ap_result = _build_simple_ap_result() + data = {"attack_paths": ap_result.to_dict()} + + with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f: + json.dump(data, f) + f.flush() + json_path = f.name + + result = subprocess.run( + [sys.executable, "-m", "nhinsight.cli", "graph", "--input", json_path], + capture_output=True, text=True, + ) + assert 
result.returncode == 0 + assert "```mermaid" in result.stdout + assert "AP-001" in result.stdout + + +def test_graph_command_missing_file(): + """nhinsight graph --input nonexistent.json fails gracefully.""" + result = subprocess.run( + [sys.executable, "-m", "nhinsight.cli", "graph", "--input", "/tmp/nonexistent_nhinsight.json"], + capture_output=True, text=True, + ) + assert result.returncode == 1 + assert "file not found" in result.stdout + + +def test_graph_command_invalid_json(): + """nhinsight graph --input bad.json fails gracefully.""" + with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f: + f.write("not json") + f.flush() + bad_path = f.name + + result = subprocess.run( + [sys.executable, "-m", "nhinsight.cli", "graph", "--input", bad_path], + capture_output=True, text=True, + ) + assert result.returncode == 1 + assert "invalid JSON" in result.stdout
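
Reviewer note: the Mermaid renderer in this diff depends on two small sanitizing helpers and a simple edge-emission rule. The sketch below reproduces that logic on its own, without importing `nhinsight`, to show which node IDs and edge lines a two-step path would produce. The names `sanitize_id`, `sanitize_label`, and `render_chain` are hypothetical stand-ins written for this note; they are not part of the package, and the real renderer also emits subgraphs, shapes, and styles that are left out here.

```python
import re


def sanitize_id(raw_id: str) -> str:
    """Mirror of the diff's _sanitize_id: replace anything outside [a-zA-Z0-9_]."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", raw_id)


def sanitize_label(label: str) -> str:
    """Mirror of the diff's _sanitize_label: swap characters that break labels."""
    return label.replace('"', "'").replace("<", "‹").replace(">", "›")


def render_chain(steps):
    """Render one path as a bare flowchart (hypothetical helper, illustration only).

    `steps` is a list of (node_id, node_label, edge_label) tuples, where
    edge_label describes the edge arriving from the previous step.
    """
    lines = ["flowchart LR"]
    prev = None
    for node_id, node_label, edge_label in steps:
        sid = sanitize_id(node_id)
        lines.append(f'    {sid}["{sanitize_label(node_label)}"]')
        if prev is not None:
            edge = sanitize_label(edge_label)
            if edge:
                lines.append(f'    {prev} -->|"{edge}"| {sid}')
            else:
                lines.append(f"    {prev} --> {sid}")
        prev = sid
    return "\n".join(lines)


# Same node IDs and labels as the fixture in tests/test_mermaid.py.
chain = render_chain([
    ("k8s:sa:ctx:prod:deploy-sa", "prod/deploy-sa", ""),
    ("aws:iam:role:123:eks-admin", "eks-admin", "IRSA → eks-admin"),
])
print(chain)
```

Because colons and hyphens both collapse to underscores, two distinct raw IDs such as `a:b` and `a-b` would map to the same Mermaid node. That may be worth a test if provider IDs can differ only in punctuation.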