Overlong group_ref crashes the upload path on Postgres via the btree index size limit #14

@erskingardner

Impact

A token-authenticated upload that contains a single line with a long group_ref string raises a production 500 and loses the raw evidence instead of being quarantined as an invalid AuditFile.

AuditEvent.group_ref is an unbounded TextField, but it is part of a composite btree index on (group_ref, wall_time_ms). With the default 8 kB page size, PostgreSQL refuses to insert a btree index tuple larger than 2704 bytes (index row size N exceeds btree version 4 maximum 2704 for index ...). A group_ref whose index entry exceeds that limit makes the INSERT fail.

The parser does not bound the length of the stored group_ref value. normalize_event() stores the raw string verbatim (value_if_str(...)) and only appends a validation error when the value is not valid hex or is too long; it does not drop or truncate the value. Because the quarantine path still creates AuditEvent rows for invalid files, the oversized value is handed to bulk_create() and the database rejects it.

The resulting exception is not an IntegrityError, so the except IntegrityError handler in ingest_audit_log_bytes() does not catch it. (PostgreSQL raises this as program_limit_exceeded, SQLSTATE 54000, which psycopg maps to OperationalError rather than DataError or ProgrammingError.) The transaction rolls back and nothing is saved: the raw upload evidence is lost and the client gets a 500.
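
The handler gap described above can be illustrated with a minimal sketch. It uses the standard-library sqlite3 exception classes purely as stand-ins, on the assumption that they mirror the DB-API hierarchy Django's django.db exceptions follow. ingest_with_fallback, insert_rows, save_quarantined, and oversized_insert are hypothetical names, not the project's code.

```python
import sqlite3


def ingest_with_fallback(insert_rows, save_quarantined):
    """Run insert_rows(); quarantine instead of crashing on any database error."""
    try:
        insert_rows()
        return "stored"
    except sqlite3.IntegrityError:
        # The narrow handler: only constraint violations land here.
        return "conflict"
    except sqlite3.DatabaseError as exc:
        # Sibling classes such as OperationalError or DataError land here.
        save_quarantined(str(exc))
        return "quarantined"


def oversized_insert():
    # Hypothetical stand-in for the failing bulk_create(); the message
    # copies the PostgreSQL error text quoted in this issue.
    raise sqlite3.OperationalError(
        "index row size N exceeds btree version 4 maximum 2704"
    )
```

With only the IntegrityError branch, oversized_insert would propagate and produce the 500. In Django, a database error raised inside transaction.atomic() leaves that transaction unusable, so a fallback write like this would presumably have to run after the failed block has rolled back, or the insert would need its own savepoint.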

This is closely related to #7, but it is a different root cause: #7 is about bounded CharField max_length constraints; this one is about a TextField that has no max_length at all but is still constrained by the btree index row-size limit. The fix proposed in #7 (max-length checks on bounded CharFields) would not cover group_ref.

Code pointers

  • forensics/models.py:188: group_ref = models.TextField(blank=True) (no length bound).
  • forensics/models.py:262: models.Index(fields=["group_ref", "wall_time_ms"]) imposes the btree row-size limit.
  • forensics/ingest.py:256: normalize_event() keeps the raw group_ref string regardless of length.
  • forensics/ingest.py:628: event_values() copies normalized["group_ref"] straight onto the AuditEvent.
  • forensics/ingest.py:477: bulk_create(events) performs the failing INSERT.
  • forensics/ingest.py:145: except IntegrityError does not catch the database error this raises.

Reproduction (analysis)

Against a Postgres deployment, upload a single otherwise-valid JSONL event whose group_ref is a ~6000-character hex string. valid_group_ref() rejects it (it exceeds AuditGroup.group_ref max_length=512), so the file is marked invalid and an AuditEvent row is created for the quarantined line. The INSERT then fails with index row size ... exceeds btree version 4 maximum 2704, the exception escapes ingest_audit_log_bytes(), and the upload returns 500 with no AuditFile persisted.
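
A minimal sketch of building such a payload follows. Only group_ref and wall_time_ms are taken from this issue; a real otherwise-valid event needs whatever other fields the parser requires, which are omitted here. build_oversized_event is a hypothetical helper name.

```python
import json
import secrets

BTREE_MAX_INDEX_ROW = 2704  # bytes, from the PostgreSQL error message quoted above


def build_oversized_event(hex_chars=6000):
    """Return one JSONL line whose group_ref is hex_chars random hex characters."""
    group_ref = secrets.token_hex(hex_chars // 2)  # 2 hex characters per byte
    # Assumption: the remaining required event fields are added by the caller.
    event = {"group_ref": group_ref, "wall_time_ms": 0}
    return json.dumps(event) + "\n"
```

The value is random rather than a repeated character on the assumption that PostgreSQL may compress highly repetitive values before applying the size check, which could hide the failure.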

On SQLite (local dev) there is no btree row-size limit, so this does not reproduce locally — it only bites in production, which is exactly the deployment that matters.

Expected behavior

An out-of-schema / oversized group_ref should produce a 400 and a saved quarantined AuditFile, preserving the raw log evidence — matching the app's behavior for malformed JSON and mixed-engine files.

Suggested fix

  • Bound the value actually stored in AuditEvent.group_ref (e.g. truncate to a safe, indexable length, or store the raw text in raw_line/raw_event only and keep group_ref empty when the value is invalid).
  • Alternatively, drop the value from the indexed column for invalid lines, or replace the plain btree index with a hashed/expression index.
  • Add a regression test that runs an oversized group_ref through the upload path and asserts the upload is quarantined (status invalid, raw evidence preserved) rather than raising a database error.
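
The first option above (keep group_ref empty when the value is invalid) could be sketched as follows. This is an illustration under stated assumptions, not the project's code: indexable_group_ref is a hypothetical helper, the 512 bound is borrowed from the AuditGroup.group_ref max_length mentioned in the reproduction, and the hex check only approximates whatever valid_group_ref() actually enforces.

```python
import re

MAX_GROUP_REF_LEN = 512  # assumed bound, from AuditGroup.group_ref max_length
_HEX = re.compile(r"[0-9a-fA-F]+")


def indexable_group_ref(raw):
    """Return raw if it is safe to store in the indexed column, else ''."""
    if not isinstance(raw, str):
        return ""
    if len(raw) > MAX_GROUP_REF_LEN or not _HEX.fullmatch(raw):
        # Invalid or oversized: keep it out of the btree index. The original
        # text is expected to survive in raw_line/raw_event instead.
        return ""
    return raw
```

Since any value that passes is at most 512 ASCII characters, it stays well under the 2704-byte index row limit. The validation error would still be recorded separately so the file is marked invalid.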

Metadata

Labels: HIGH (Severity: serious correctness, availability, or data-integrity issue); bug (Something isn't working)
