Claude/nxc http module f uj qh #1242

Closed: cybrly wants to merge 5 commits into Pennyw0rth:main from cybrly:claude/nxc-http-module-fUjQH

@cybrly cybrly commented May 16, 2026

No description provided.

claude added 5 commits May 16, 2026 15:49
Adds a new `http` protocol so that `nxc http <target>` probes the target the way
`nxc smb` shows banner/OS: it fetches `/`, extracts the page title with
BeautifulSoup, and runs httpx/Wappalyzer-style fingerprints over the
Server header, response headers, cookies, and body to identify the
stack (nginx, apache, IIS, tomcat, WordPress, Jenkins, GitLab, ...).
Supports HTTPS (auto-detected on common SSL ports), HTTP Basic/Digest
auth, custom path/UA/proxy, and persists results to its own SQLite DB.

Also adds an `http_services` module that walks ~65 common admin-panel
and sensitive-file paths (phpMyAdmin, Tomcat manager, Spring actuator,
Jenkins, GitLab, Grafana, Kibana, .git/HEAD, .env, ...) and reports
matches with status code and page title.
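
A minimal sketch of how fingerprints over the Server header, other response headers, cookies, and body might be combined. The `SIGNATURES` table, its patterns, and the `fingerprint` function are hypothetical illustrations, not the code from this PR.

```python
import re

# Hypothetical signature table: (technology, source, pattern).
# The patterns are illustrative assumptions, not the PR's real fingerprints.
SIGNATURES = [
    ("nginx", "header:server", re.compile(r"^nginx", re.I)),
    ("apache", "header:server", re.compile(r"^apache", re.I)),
    ("iis", "header:server", re.compile(r"^microsoft-iis", re.I)),
    ("php", "header:x-powered-by", re.compile(r"^php", re.I)),
    ("php", "cookie", re.compile(r"^PHPSESSID$")),
    ("wordpress", "body", re.compile(r"/wp-(?:content|includes)/")),
]


def fingerprint(headers, cookie_names, body):
    """Return the sorted technologies whose signature matches the response."""
    # HTTP header names are case-insensitive, so normalise them once.
    lowered = {name.lower(): value for name, value in headers.items()}
    found = set()
    for tech, source, pattern in SIGNATURES:
        if source.startswith("header:"):
            hit = pattern.search(lowered.get(source[len("header:"):], ""))
        elif source == "cookie":
            hit = any(pattern.search(name) for name in cookie_names)
        else:  # "body"
            hit = pattern.search(body)
        if hit:
            found.add(tech)
    return sorted(found)
```

For example, `fingerprint({"Server": "Apache/2.4.57 (Debian)"}, ["PHPSESSID"], '<link href="/wp-content/x.css">')` yields `["apache", "php", "wordpress"]` under these assumed signatures.
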
- plaintext_login no longer reports false-positive auth success on
  cookie/form-auth sites. It now requires the baseline path to issue
  an HTTP 401 with WWW-Authenticate; otherwise it refuses to claim
  validation. Redirects from a protected resource are treated as
  ambiguous instead of success.
- --ssl with the default port 80 now auto-switches to 443.
- Auto-pick HTTP Digest when the server advertises it via
  WWW-Authenticate, regardless of --auth-type default.
- Baseline 404 probe records status/size/content-hash of a random
  non-existent path so the module can detect SPA catch-all 200s
  (where every path returns index.html) and skip false positives.
- Tightened every fingerprint in the protocol and module so they
  match specific markers (titles, JSON keys, header values), not
  bare product names that show up in unrelated content.
- 401/403 responses in the module are reported as "auth-protected"
  hints rather than counted as confirmed service matches.
- Title extraction now strips literal tag-text that BeautifulSoup's
  html.parser leaves behind when a <title> contains nested elements,
  and filters non-printable characters.
- DB host lookup for logged-in relations now filters by port so the
  relation attaches to the right (host, port) row.
- Extra paths passed via -o EXTRA= are now reported unconditionally
  instead of hidden behind SHOW_ALL.
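
The plaintext_login rule described above could be sketched roughly as follows. `AuthResult` and `classify_login` are hypothetical names, and treating any other status (for example 403) as ambiguous is an assumption rather than something the commit message states.

```python
from enum import Enum


class AuthResult(Enum):
    SUCCESS = "success"
    BAD_CREDENTIALS = "bad credentials"
    AMBIGUOUS = "ambiguous"
    NOT_APPLICABLE = "no HTTP auth challenge on baseline"


def classify_login(baseline_status, baseline_www_authenticate, authed_status):
    """Decide what an authenticated response proves about the credentials."""
    # Without a 401 + WWW-Authenticate on the unauthenticated baseline,
    # HTTP auth is not what protects the path (e.g. cookie/form auth),
    # so a 200 with credentials proves nothing.
    if baseline_status != 401 or not baseline_www_authenticate:
        return AuthResult.NOT_APPLICABLE
    if authed_status == 401:
        return AuthResult.BAD_CREDENTIALS
    if 200 <= authed_status < 300:
        return AuthResult.SUCCESS
    # Redirects are ambiguous per the commit message; other statuses
    # are treated the same way here as an assumption.
    return AuthResult.AMBIGUOUS
```
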

Protocol fixes:
- urllib3.disable_warnings() now fires only when --no-verify is set,
  instead of globally silencing warnings at module import.
- IPv6 link-local zone-ids (fe80::1%eth0) are percent-encoded per
  RFC 6874 before being placed in URL host parts.
- SSL/TLS handshake errors are surfaced at fail level with a hint
  about --no-verify, instead of being lost at debug level.
- Response bodies are read via streaming with a configurable cap
  (--max-body-size, default 256 KiB) to prevent DoS on hostile
  servers.
- The post-redirect final URL is recorded as self.final_url and
  shown in print_host_info when it differs from the original.
- WWW-Authenticate header parsing now splits on whitespace AND comma
  so headers like "Negotiate, Basic realm=x" yield "Negotiate", not
  "Negotiate,".
- Scheme/SSL decision is cached once in _resolve_scheme() rather
  than recomputed on every URL build.
- Optional NTLM auth via requests-ntlm when --auth-type ntlm is set
  (with graceful fallback to Basic if the package is missing).
- build_url() promoted to public API; modules no longer need to
  reach into private methods.
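
To illustrate the URL-building points above (the port 80 to 443 switch under --ssl, a scheme resolved up front, and RFC 6874 zone-id encoding), here is a rough sketch. The function names, signatures, and the `COMMON_SSL_PORTS` tuple are assumptions; the PR's real `build_url()` and `_resolve_scheme()` may look quite different.

```python
from urllib.parse import quote

# Assumed set of "common SSL ports"; the PR does not list them here.
COMMON_SSL_PORTS = (443, 8443)


def resolve_scheme(port, ssl=False):
    """Return (scheme, port), bumping the default port 80 to 443 under SSL."""
    if ssl and port == 80:
        port = 443
    scheme = "https" if ssl or port in COMMON_SSL_PORTS else "http"
    return scheme, port


def format_host(host):
    """Bracket IPv6 literals and percent-encode the zone-id delimiter."""
    if ":" not in host:
        return host  # hostname or IPv4 address
    address, has_zone, zone = host.partition("%")
    if has_zone:
        # RFC 6874: the "%" before the zone-id is written as "%25" in a URL.
        return f"[{address}%25{quote(zone, safe='')}]"
    return f"[{address}]"


def build_url(host, port=80, path="/", ssl=False):
    scheme, port = resolve_scheme(port, ssl)
    default_port = 443 if scheme == "https" else 80
    netloc = format_host(host)
    if port != default_port:
        netloc += f":{port}"
    return f"{scheme}://{netloc}/{path.lstrip('/')}"
```

With these assumptions, `build_url("fe80::1%eth0", 8080)` gives `http://[fe80::1%25eth0]:8080/`.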

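The WWW-Authenticate handling (splitting on whitespace and commas, and preferring Digest when the server advertises it) might look something like the sketch below. `advertised_schemes` and `pick_auth_type` are hypothetical helpers. The parser is deliberately simplistic: it ignores escaped quotes inside quoted strings and would misread a bare token68 credential as a scheme.

```python
import re


def advertised_schemes(www_authenticate):
    """Extract auth scheme names from a WWW-Authenticate header value."""
    # Blank out quoted strings first so realm="My Site" cannot be split
    # into fragments that look like scheme names.
    stripped = re.sub(r'"[^"]*"', '""', www_authenticate)
    schemes = []
    for token in re.split(r"[\s,]+", stripped.strip()):
        # Tokens containing "=" are auth parameters (realm=..., nonce=...).
        if token and "=" not in token:
            schemes.append(token)
    return schemes


def pick_auth_type(www_authenticate, default="basic"):
    """Prefer Digest when advertised, otherwise keep the default."""
    lowered = [scheme.lower() for scheme in advertised_schemes(www_authenticate)]
    return "digest" if "digest" in lowered else default
```

Splitting on both separators is what turns `"Negotiate, Basic realm=x"` into `["Negotiate", "Basic"]` rather than leaving a trailing comma on the first scheme.
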
Module fixes:
- Entries that share the same path are grouped, so we make one
  request per unique URL and run every matching signature against
  the same response (~64 requests down to ~49).
- Probes run concurrently via ThreadPoolExecutor (CONCURRENCY
  option, default 5) instead of serially.
- 401/403 responses dedupe per (path, status): a site-wide auth
  wall no longer floods the log with ~50 "auth-protected" lines.
- Each entry now accepts constraints (content_type_starts_with,
  not_html, empty_body) that gate matches. e.g. .env requires the
  response not be text/html; Spring actuator endpoints require an
  application/* content-type; MinIO requires Content-Length: 0.
- Tightened Tomcat-Manager, Jenkins-Login, and Spring-Health
  signatures to specific markers that don't false-positive on
  unrelated pages.
- Baseline tolerance: small baselines (<1 KB) now require an exact
  hash match. The previous 5% size delta was too forgiving and
  would treat unrelated short pages as catch-all responses.
- Module rate-limit via DELAY option.
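
The baseline comparison could be sketched as below. Names are hypothetical, SHA-256 stands in for whatever hash the module uses, and keeping the 5% size tolerance for baselines of 1 KB or more is an assumption inferred from the bullet above, not a confirmed detail.

```python
import hashlib
from dataclasses import dataclass

SMALL_BODY = 1024        # baselines under 1 KB need an exact hash match
SIZE_TOLERANCE = 0.05    # assumed tolerance for larger baselines


@dataclass(frozen=True)
class Baseline:
    status: int
    size: int
    digest: str


def snapshot(status, body):
    """Record status, size, and content hash of a response body (bytes)."""
    return Baseline(status, len(body), hashlib.sha256(body).hexdigest())


def looks_like_catch_all(baseline, status, body):
    """True if a response resembles the random-path baseline response."""
    if status != baseline.status:
        return False
    candidate = snapshot(status, body)
    if candidate.digest == baseline.digest:
        return True
    if baseline.size < SMALL_BODY:
        return False  # short pages: similar size alone is not evidence
    return abs(candidate.size - baseline.size) <= baseline.size * SIZE_TOLERANCE
```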

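A rough sketch of grouping entries by path and gating matches with constraints. The entry dictionaries, markers, and helper names are hypothetical; only the constraint names (content_type_starts_with, not_html, empty_body) come from the commit message, and `empty_body` is modelled here as an empty response body rather than a Content-Length check.

```python
from collections import defaultdict

# Hypothetical entries; markers are illustrative, not the PR's signatures.
ENTRIES = [
    {"path": "/login", "label": "Jenkins-Login",
     "marker": "Sign in [Jenkins]", "constraints": {}},
    {"path": "/login", "label": "Grafana-Login",
     "marker": "<title>Grafana</title>", "constraints": {}},
    {"path": "/.env", "label": "dotenv",
     "marker": "=", "constraints": {"not_html": True}},
    {"path": "/actuator/health", "label": "Spring-Health",
     "marker": '"status":"UP"',
     "constraints": {"content_type_starts_with": "application/"}},
]


def group_by_path(entries):
    """Map each unique path to its entries so it is requested only once."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["path"]].append(entry)
    return dict(groups)


def constraints_ok(constraints, content_type, body):
    ctype = content_type.lower()
    prefix = constraints.get("content_type_starts_with")
    if prefix and not ctype.startswith(prefix):
        return False
    if constraints.get("not_html") and ctype.startswith("text/html"):
        return False
    if constraints.get("empty_body") and body:
        return False
    return True


def match_group(entries, content_type, body):
    """Run every signature for one path against a single response."""
    return [entry["label"] for entry in entries
            if constraints_ok(entry["constraints"], content_type, body)
            and entry["marker"] in body]
```

Here four entries collapse to three requests, and an HTML page served at `/.env` (as an SPA catch-all would do) is rejected by the `not_html` constraint even though it contains an `=` character.
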
Database/navigator:
- Adds a probes table that records confirmed service matches
  (host_id, path, label, status, title), with insert-or-skip
  dedup on (host, path, label).
- nxcdb http gets a `probes [host_id|filter]` command.
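
The insert-or-skip dedup on the probes table can be sketched with the standard `sqlite3` module. The schema below is an assumption built from the column list above (the real table and the `add_probe` signature may differ), and `ON CONFLICT ... DO NOTHING` requires SQLite 3.24 or newer.

```python
import sqlite3

# Assumed schema, derived from the columns named in the commit message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS probes (
    id     INTEGER PRIMARY KEY,
    hostid INTEGER NOT NULL,
    path   TEXT NOT NULL,
    label  TEXT NOT NULL,
    status INTEGER,
    title  TEXT,
    UNIQUE (hostid, path, label)
)
"""


def add_probe(conn, hostid, path, label, status, title):
    """Insert a confirmed match; a duplicate (hostid, path, label) is skipped."""
    # The database enforces uniqueness, so no application-level lock
    # or check-then-insert race is needed.
    conn.execute(
        "INSERT INTO probes (hostid, path, label, status, title) "
        "VALUES (?, ?, ?, ?, ?) "
        "ON CONFLICT (hostid, path, label) DO NOTHING",
        (hostid, path, label, status, title),
    )
    conn.commit()


def count_probes(conn):
    return conn.execute("SELECT COUNT(*) FROM probes").fetchone()[0]
```
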

Correctness:
- _content_hash uses hashlib.md5(..., usedforsecurity=False) so it
  runs on FIPS-enforcing Python builds. Falls back to bare md5 on
  Python <3.9 where the kwarg doesn't exist.
- add_probe atomicity: UNIQUE constraint on (hostid, path, label)
  + ON CONFLICT DO NOTHING. The previous app-level lock would have
  deadlocked because BaseDB.db_execute already holds self.lock.
- SERVER_FINGERPRINTS use \b word boundaries so "apache" no longer
  matches XApacheCompat/1.0 etc.
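
A small illustration of the word-boundary point. The two patterns and the `identify_server` helper are hypothetical stand-ins for the PR's SERVER_FINGERPRINTS.

```python
import re

# Illustrative stand-in for SERVER_FINGERPRINTS; not the PR's actual table.
SERVER_FINGERPRINTS = {
    "apache": re.compile(r"\bapache\b", re.IGNORECASE),
    "nginx": re.compile(r"\bnginx\b", re.IGNORECASE),
}


def identify_server(server_header):
    """Return fingerprint names whose pattern matches as a whole word."""
    return [name for name, pattern in SERVER_FINGERPRINTS.items()
            if pattern.search(server_header)]
```

`identify_server("XApacheCompat/1.0")` returns an empty list because "Apache" is surrounded by word characters there. Note that `\b` still matches in `Apache-Coyote/1.1`, since a hyphen is not a word character.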

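The FIPS-friendly hashing fallback might be written as follows. `content_hash` is a hypothetical name for the `_content_hash` helper mentioned above, and catching `TypeError` is one plausible way to detect the missing keyword on Python older than 3.9.

```python
import hashlib


def content_hash(data):
    """MD5 hex digest of bytes, used as a fingerprint, not for security."""
    try:
        # Python 3.9+: flag the digest as non-security so FIPS builds allow it.
        return hashlib.md5(data, usedforsecurity=False).hexdigest()
    except TypeError:
        # Python < 3.9: the keyword does not exist, fall back to plain md5.
        return hashlib.md5(data).hexdigest()
```
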
Performance:
- _extract_title takes a fast regex path; BeautifulSoup is the
  fallback only when the regex misses (~10x faster on typical
  pages, ~100x on large ones).
- Module caches the host_id once at on_login instead of calling
  get_hosts() per match.
- requests-ntlm import is cached in a module-level sentinel so we
  don't re-import per credential attempt.
- urllib3.disable_warnings() runs at most once per process via a
  module-level flag, not per session build.
- requests.Session has an HTTPAdapter mounted with
  pool_maxsize/pool_connections matched to expected concurrency so
  parallel probe workers can reuse pooled TCP connections.
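
A sketch of the regex-first title extraction with a parser fallback. To keep it dependency-free, the standard library's `HTMLParser` stands in for BeautifulSoup here, which is a substitution and not what the PR does. The regex and helper names are likewise assumptions.

```python
import html
import re
from html.parser import HTMLParser

# Fast path: a <title> whose content holds no nested markup.
TITLE_RE = re.compile(r"<title(?:\s[^>]*)?>([^<]*)</title\s*>", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def _normalize(text):
    # Collapse whitespace, then drop non-printable characters.
    text = " ".join(text.split())
    return "".join(ch for ch in text if ch.isprintable()) or None


def extract_title(markup):
    match = TITLE_RE.search(markup)
    if match:
        return _normalize(html.unescape(match.group(1)))
    # Slow path: only reached when the regex misses (e.g. nested tags).
    parser = _TitleParser()
    parser.feed(markup)
    parser.close()
    # Strip literal tag text that some parser versions leave in the data.
    return _normalize(TAG_RE.sub("", "".join(parser.parts)))
```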

Cleanup:
- Removed dead branch in _probe_path_group ("not any(entries)"
  could never fire because _group_by_path doesn't yield empty
  groups).
- add_probe failures are now debug-logged instead of silently
  swallowed by contextlib.suppress.
- _decode no longer calls response.apparent_encoding. Under
  stream=True followed by close(), it raises "content already consumed"
  because apparent_encoding triggers a read of response.content.
  _decode now falls back to "utf-8" when response.encoding (from
  Content-Type) is unset. Caught during real end-to-end testing; the
  mock-server tests never accessed apparent_encoding, so the bug stayed hidden.
- Module's class.name is now "http_services" to match the filename
  the CLI uses to register the module. Previously the name
  "common_services" appeared in module-listing output, but
  `-M common_services` failed.

Verified end-to-end against five in-process servers (nginx, WP,
Tomcat, HTTP Basic, SPA catch-all):
- T1 nginx: Server:nginx, title, tech identified
- T2 WP: tech=apache,php,wordpress identified
- T3 Tomcat: tech=apache,tomcat
- T4 401 baseline: shows auth:Basic
- T5 bad creds: rejected with "HTTP 401 - bad credentials"
- T6 good creds: success
- T7 form-auth bogus creds: refuses to claim success
- T8 SPA catch-all: zero false-positive matches via baseline
- T9 WP module: wp-login.php found, persisted to probes table
- T10 Tomcat module: /manager/html reported as auth-protected hint
- T13 --ssl bumps port 80 to 443
- T14 50-path module scan completes in 1.9s @ CONCURRENCY=10
@github-actions

It looks like the PR template may not have been filled out. The following sections appear to be missing:

  • Description
  • Type of change
  • Setup guide for the review
  • Checklist

Please edit your PR description to include them. The template helps reviewers understand and test your changes. Thanks!

@cybrly cybrly closed this May 16, 2026
@cybrly cybrly deleted the claude/nxc-http-module-fUjQH branch May 16, 2026 20:20
@Marshall-Hallenbeck Marshall-Hallenbeck added the `slop` label (description: "things that are AI slop") May 17, 2026