Merged
19 changes: 19 additions & 0 deletions .gitignore
@@ -11,3 +11,22 @@ build/
.claude/
.gstack/
.worktrees/

# Claude Code / agent artifacts (local only)
CLAUDE.md
AGENTS.md
TODOS.md
PYTHON_ISSUES.md
docs/superpowers/

# IDE / OS
.idea/
.vscode/
*.swp
*.swo
.DS_Store
Thumbs.db

# Captures (not committed)
*.pcap
*.pcapng
76 changes: 76 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,76 @@
# Changelog

All notable changes to ja4plus are documented here. The format is based
on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this
project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.6.0] - 2026-05

Major spec-compliance update against the May 2026 FoxIO JA4+ spec
(PRs #267, #270, #277, #281, #288), and a parity pass against the Go
reference implementation.

### Added

- **JA4D6** (`ja4plus.JA4D6Fingerprinter` / `generate_ja4d6`): DHCPv6
fingerprinting (10th JA4+ method). Format mirrors JA4D with DHCPv6
semantics — DUID size from option 1, IATA presence flag, Client FQDN
flag, all option types in presence order including nested options
inside IA_NA / IA_TA / IA_PD / IA Address / IA Prefix.
- **JA4D** is now a public package export
(`from ja4plus import JA4DFingerprinter, generate_ja4d`).
- **`Processor`** aggregator class (`ja4plus.Processor`) — runs every
JA4+ fingerprinter on each packet and returns a list of result dicts.
Provides `process_packet`, `reset`, `cleanup_connection`,
`get_shard_key` (sorted 5-tuple, direction-independent).
- **JA4 / JA4S raw exposure**: every result entry on these fingerprinters
now includes `raw` and `raw_original_order` keys, plus
`last_raw` / `last_raw_original_order` instance attributes for the most
recent successful parse. JSON CLI output emits these fields.
- **Multi-packet QUIC CRYPTO reassembly**: large ClientHellos that span
multiple Initial datagrams (sharing a DCID) are now reassembled. New
helpers `decrypt_quic_initial_crypto`, `parse_crypto_frames`,
`reassemble_crypto_fragments`, `client_hello_from_crypto_fragments` in
`ja4plus.utils.quic_utils`. The CRYPTO frame parser now skips ACK
frames (0x02/0x03) instead of bailing on them.
- **X.509 module helpers**: `compute_ja4x_from_pem(bytes)` and
`compute_ja4x_from_der(bytes)` mirroring Go's
`ComputeJA4XFromPEM` / `ComputeJA4XFromDER`.
- CLI `--types` accepts `ja4d` and `ja4d6`.
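
The multi-packet QUIC CRYPTO reassembly entry above can be illustrated with a minimal sketch. It assumes fragments arrive as `(offset, data)` pairs already extracted from decrypted Initial packets that share a DCID. The function name `reassemble_crypto_sketch` is hypothetical and is not the signature of `reassemble_crypto_fragments` in `ja4plus.utils.quic_utils`, which this diff does not show.

```python
from typing import Iterable, Tuple


def reassemble_crypto_sketch(fragments: Iterable[Tuple[int, bytes]]) -> bytes:
    """Join (offset, data) CRYPTO fragments into one contiguous byte stream.

    Hypothetical helper for illustration only; not the ja4plus API.
    """
    stream = bytearray()
    # Process fragments in offset order, regardless of arrival order.
    for offset, data in sorted(fragments, key=lambda frag: frag[0]):
        if offset > len(stream):
            # Gap: a fragment is still missing, so stop at the contiguous prefix.
            break
        # Skip any bytes that overlap what has already been assembled.
        stream.extend(data[len(stream) - offset:])
    return bytes(stream)
```

Stopping at the first gap means a caller can retry once more Initial datagrams arrive, rather than parsing a ClientHello with a hole in it.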

### Fixed

- **JA4 ALPN non-alphanumeric** (PR #277): when the first or last byte
of the first ALPN value is not ASCII alphanumeric, the JA4 ALPN
component is now the first and last characters of the lowercase hex
encoding of the full first ALPN value. Previously ja4plus dropped
non-ASCII bytes via `decode('ascii', errors='ignore')` and emitted
`"99"` when the first byte was non-ASCII. Raw ALPN bytes are now
preserved in `tls_info["alpn_raw"]`.
- **JA4H HTTP/2 + HTTP/3 version codes** (PR #288): `HTTP/2` now maps to
`"20"` and `HTTP/3` to `"30"` in the JA4H part-A version code (not
`"2"` / `"3"`). HTTP/1.0 / HTTP/1.1 unchanged.
- **JA4H cookie-VALUES sort by NAME only** (PR #288): the cookie-values
hash component now sorts pairs explicitly by cookie name; previously
it relied on tuple-sort tie-breaking.
- **JA4SSH deterministic mode tiebreak** (PR #281): when multiple packet
sizes tie for the highest frequency, the LOWEST value wins. Previously
the code used `Counter.most_common(1)[0][0]`, whose result on a tie
depends on insertion order.
- **JA4L UDP/QUIC server-first orderings**: the QUIC timing path no
longer requires the connection's lexicographic direction to be
`forward`. The first packet on the flow defines the client; subsequent
packets are routed by comparing endpoints to that anchor.
- **JA4D skip set** matches the spec exactly: `{0, 53, 50, 81}`. The
End marker (255) is handled by the parse loop and never recorded.
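
The JA4SSH tiebreak rule listed above (lowest value wins among the most frequent packet sizes) can be sketched as follows. The function name `deterministic_mode` and the choice to return `0` for an empty input are assumptions made for illustration; the actual ja4plus code is not part of this diff.

```python
from collections import Counter
from typing import Iterable


def deterministic_mode(sizes: Iterable[int]) -> int:
    """Return the most frequent size, breaking ties by picking the lowest.

    Hypothetical helper; returning 0 for no input is an assumption.
    """
    counts = Counter(sizes)
    if not counts:
        return 0
    top = max(counts.values())
    # Among all sizes that reach the top frequency, the smallest wins.
    return min(size for size, count in counts.items() if count == top)
```

Unlike `Counter.most_common(1)`, which returns tied elements in first-seen order, this gives the same answer for `[36, 52, 52, 36]` and `[52, 36, 36, 52]`.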

### Changed

- Bumped version to **0.6.0**.
- README updated to reflect 10 JA4+ methods and new APIs.

### Internal

- Per-DCID QUIC fragment buffer + reverse map for cleanup.
- New `ja4plus.utils.quic_utils._parse_alpn_with_bytes` returns both
decoded strings and raw bytes for ALPN.
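
The JA4 ALPN rule from the Fixed section operates on the raw bytes of the first ALPN value, which is why the raw bytes are now kept alongside the decoded strings. A minimal sketch of that rule follows. The name `ja4_alpn_component` is hypothetical, and the `"00"` result for a missing ALPN comes from the published JA4 spec rather than from this changelog.

```python
def _is_ascii_alnum(byte: int) -> bool:
    """True for ASCII 0-9, A-Z, a-z."""
    return (0x30 <= byte <= 0x39) or (0x41 <= byte <= 0x5A) or (0x61 <= byte <= 0x7A)


def ja4_alpn_component(first_alpn: bytes) -> str:
    """Two-character JA4 ALPN component for the first ALPN value.

    Hypothetical helper for illustration; not the ja4plus implementation.
    """
    if not first_alpn:
        # No ALPN present: "00" per the JA4 spec (not stated in the changelog).
        return "00"
    first, last = first_alpn[0], first_alpn[-1]
    if _is_ascii_alnum(first) and _is_ascii_alnum(last):
        return chr(first) + chr(last)
    # Otherwise use the first and last characters of the lowercase hex
    # encoding of the whole value (bytes.hex() is always lowercase).
    hex_value = first_alpn.hex()
    return hex_value[0] + hex_value[-1]
```

For example, `b"http/1.1"` yields `"h1"`, while `b"\xab\xcd"` hex-encodes to `abcd` and yields `"ad"`.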
40 changes: 39 additions & 1 deletion README.md
@@ -1,6 +1,6 @@
<p align="center"><img src="assets/logo.png" width="300"></p>

A Python library and CLI for JA4+ network fingerprinting. Implements all eight JA4+ methods for identifying and classifying network traffic based on TLS, TCP, HTTP, SSH, and X.509 characteristics. Supports QUIC, IPv4/IPv6, and multi-segment TCP reassembly.
A Python library and CLI for JA4+ network fingerprinting. Implements all ten JA4+ methods for identifying and classifying network traffic based on TLS, TCP, HTTP, SSH, X.509, and DHCP characteristics. Supports QUIC, IPv4/IPv6, and multi-segment TCP reassembly.

JA4+ is a set of network fingerprinting standards created by [FoxIO](https://foxio.io). This library is an independent Python implementation of the published specification. For the original spec, see the [FoxIO JA4+ repository](https://github.com/FoxIO-LLC/ja4).

@@ -21,6 +21,8 @@ JA4+ is a set of network fingerprinting standards created by [FoxIO](https://fox
| JA4L | TCP/QUIC | Light distance and latency estimation |
| JA4X | X.509 | Certificate structure fingerprint from OID sequences |
| JA4SSH | SSH | Session type classification from traffic patterns |
| JA4D | DHCPv4 | DHCP client/server fingerprint (FoxIO PR #267/#270) |
| JA4D6 | DHCPv6 | DHCPv6 client/server fingerprint (FoxIO PR #267/#270) |

QUIC Initial packets (RFC 9001/9369) are automatically decrypted to extract TLS ClientHellos. IPv4 and IPv6 are both supported across all fingerprinters.

@@ -102,6 +104,8 @@ from ja4plus import (
    JA4LFingerprinter,  # Latency
    JA4XFingerprinter,  # X.509 Certificate
    JA4SSHFingerprinter,  # SSH
    JA4DFingerprinter,  # DHCPv4
    JA4D6Fingerprinter,  # DHCPv6
)
```

@@ -123,6 +127,38 @@ from ja4plus import generate_ja4, generate_ja4s, generate_ja4h
fingerprint = generate_ja4(packet)
```

### Aggregating Processor

Run every fingerprinter on each packet and get a list of results:

```python
from ja4plus import Processor

p = Processor()
for packet in packets:
    for r in p.process_packet(packet):
        print(r["type"], r["fingerprint"], r.get("raw"))

# Use get_shard_key to bucket packets per connection
shard_key = p.get_shard_key(packet)

# Cleanup state for a finished connection
p.cleanup_connection(src_ip, src_port, dst_ip, dst_port, "tcp")
```
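
The changelog describes `get_shard_key` as a sorted, direction-independent 5-tuple. One way such a key could be built is sketched below. The function name `shard_key_sketch`, its argument list, and the shape of the returned tuple are all assumptions for illustration, since the real method takes a packet and its return format is not shown in this diff.

```python
from typing import Tuple

Endpoint = Tuple[str, int]


def shard_key_sketch(src_ip: str, src_port: int,
                     dst_ip: str, dst_port: int,
                     proto: str) -> Tuple[str, Endpoint, Endpoint]:
    """Build a connection key that is identical for both traffic directions.

    Hypothetical helper; not the signature of Processor.get_shard_key.
    """
    # Sorting the two endpoints removes the client/server ordering, so
    # A->B and B->A packets map to the same key.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (proto, low, high)
```

A key like this lets a multi-worker pipeline route both halves of a connection to the same worker, which stateful fingerprinters such as JA4L and JA4SSH need.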

JA4 and JA4S result dicts include the unhashed `raw` and
`raw_original_order` variants — useful for human-readable output and
fingerprint debugging.

### X.509 Helpers

```python
from ja4plus import compute_ja4x_from_pem, compute_ja4x_from_der

ja4x = compute_ja4x_from_pem(pem_bytes)
ja4x = compute_ja4x_from_der(der_bytes)
```

See [`docs/usage.md`](docs/usage.md) for detailed usage of each fingerprinter and [`docs/api_reference.md`](docs/api_reference.md) for the full API.

## Fingerprint Formats
@@ -137,6 +173,8 @@ See [`docs/usage.md`](docs/usage.md) for detailed usage of each fingerprinter an
| JA4L | `JA4L-{C\|S}={latency_us}_{ttl}` | `JA4L-S=2500_56` |
| JA4X | `{issuer}_{subject}_{extensions}` | `a37f49ba31e2_a37f49ba31e2_dd4f1a0ef8b2` |
| JA4SSH | `c{mode}s{mode}_c{pkts}s{pkts}_c{acks}s{acks}` | `c36s36_c51s80_c69s0` |
| JA4D | `{type}{size}{ip}{fqdn}_{options}_{request_list}` | `disco0000in_61-55_1-3-6-42` |
| JA4D6 | `{type}{size}{ip}{fqdn}_{options}_{request_list}` | `solct0014nn_1-6-8-25_23-24` |

## Spec Validation

45 changes: 44 additions & 1 deletion ja4plus/__init__.py
@@ -14,6 +14,9 @@
from ja4plus.fingerprinters.ja4ssh import JA4SSHFingerprinter
from ja4plus.fingerprinters.ja4t import JA4TFingerprinter
from ja4plus.fingerprinters.ja4ts import JA4TSFingerprinter
from ja4plus.fingerprinters.ja4d import JA4DFingerprinter
from ja4plus.fingerprinters.ja4d6 import JA4D6Fingerprinter
from ja4plus.processor import Processor

# Function-based API
from ja4plus.fingerprinters.ja4 import generate_ja4
@@ -24,7 +27,47 @@
from ja4plus.fingerprinters.ja4ssh import generate_ja4ssh
from ja4plus.fingerprinters.ja4t import generate_ja4t
from ja4plus.fingerprinters.ja4ts import generate_ja4ts
from ja4plus.fingerprinters.ja4d import generate_ja4d
from ja4plus.fingerprinters.ja4d6 import generate_ja4d6

__version__ = "0.4.1"
def compute_ja4x_from_der(cert_der_bytes):
    """Compute the JA4X fingerprint for a DER-encoded X.509 certificate.

    Args:
        cert_der_bytes: bytes containing a DER-encoded certificate.

    Returns:
        JA4X fingerprint string, or None if the certificate could not be parsed.
    """
    fp = JA4XFingerprinter()
    return fp.fingerprint_certificate(cert_der_bytes)


def compute_ja4x_from_pem(cert_pem_bytes):
    """Compute the JA4X fingerprint for a PEM-encoded X.509 certificate.

    Args:
        cert_pem_bytes: bytes containing a PEM-encoded certificate
            (one or more PEM blocks; only the first is used).

    Returns:
        JA4X fingerprint string, or None if the certificate could not be parsed.
    """
    from cryptography import x509
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives.serialization import Encoding

    if isinstance(cert_pem_bytes, str):
        cert_pem_bytes = cert_pem_bytes.encode("ascii")

    try:
        cert = x509.load_pem_x509_certificate(cert_pem_bytes, default_backend())
    except Exception:
        return None
    der = cert.public_bytes(Encoding.DER)
    return compute_ja4x_from_der(der)


__version__ = "0.6.0"
__author__ = "ja4plus contributors"
__license__ = "BSD-3-Clause"
35 changes: 30 additions & 5 deletions ja4plus/cli.py
@@ -23,8 +23,13 @@
from ja4plus.fingerprinters.ja4ts import JA4TSFingerprinter
from ja4plus.fingerprinters.ja4x import JA4XFingerprinter
from ja4plus.fingerprinters.ja4ssh import JA4SSHFingerprinter
from ja4plus.fingerprinters.ja4d import JA4DFingerprinter
from ja4plus.fingerprinters.ja4d6 import JA4D6Fingerprinter

VALID_TYPES = ["ja4", "ja4s", "ja4h", "ja4l", "ja4t", "ja4ts", "ja4x", "ja4ssh"]
VALID_TYPES = [
    "ja4", "ja4s", "ja4h", "ja4l", "ja4t", "ja4ts", "ja4x", "ja4ssh",
    "ja4d", "ja4d6",
]

ALL_FINGERPRINTERS = {
    "ja4": JA4Fingerprinter,
@@ -35,6 +40,8 @@
    "ja4ts": JA4TSFingerprinter,
    "ja4x": JA4XFingerprinter,
    "ja4ssh": JA4SSHFingerprinter,
    "ja4d": JA4DFingerprinter,
    "ja4d6": JA4D6Fingerprinter,
}


@@ -91,11 +98,21 @@ def _get_packet_source(packet):

def _output_results(results, fmt, writer=None, ja4db_client=None):
    """
    Output a list of (source, type, fingerprint) tuples in the requested format.
    Output a list of result tuples in the requested format.

    Each result is (source, fp_type, fingerprint, raw, raw_oo) where raw and
    raw_oo are optional (None for fingerprinters that don't expose them).
    writer is only used for csv format (a csv.writer instance).
    ja4db_client is optional JA4DBClient for fingerprint identification.
    """
    for source, fp_type, fingerprint in results:
    for entry in results:
        # Backward compat: accept 3-tuples too
        if len(entry) == 3:
            source, fp_type, fingerprint = entry
            raw, raw_oo = None, None
        else:
            source, fp_type, fingerprint, raw, raw_oo = entry

        identified = ""
        if ja4db_client:
            match = ja4db_client.lookup(fingerprint)
@@ -104,6 +121,10 @@ def _output_results(results, fmt, writer=None, ja4db_client=None):

        if fmt == "json":
            obj = {"source": source, "type": fp_type, "fingerprint": fingerprint}
            if raw is not None:
                obj["raw"] = raw
            if raw_oo is not None:
                obj["raw_original_order"] = raw_oo
            if ja4db_client:
                obj["identified_as"] = identified or None
            print(json.dumps(obj))
@@ -172,7 +193,9 @@ def cmd_analyze(args):
            try:
                result = fp.process_packet(packet)
                if result:
                    row_batch.append((source, fp_type, result))
                    raw = getattr(fp, 'last_raw', None)
                    raw_oo = getattr(fp, 'last_raw_original_order', None)
                    row_batch.append((source, fp_type, result, raw, raw_oo))
            except Exception:
                pass
        if row_batch:
@@ -226,7 +249,9 @@ def process_packet(packet):
            try:
                result = fp.process_packet(packet)
                if result:
                    row_batch.append((source, fp_type, result))
                    raw = getattr(fp, 'last_raw', None)
                    raw_oo = getattr(fp, 'last_raw_original_order', None)
                    row_batch.append((source, fp_type, result, raw, raw_oo))
            except Exception:
                pass
        if row_batch: