[upgrade] Ditch std::hash and use our own cryptographic hashing #10

@Aliqyan-21

When we use std::hash<std::string> to detect content changes, we are at the mercy of invisible implementation details.

The Current Situation

Our humble std::hash:

  • Detects content changes sometimes, unless it doesn’t :).
  • Pretends to be stable, but changes personality between compilers: the standard only guarantees consistent results within a single run of the program.
  • Has collisions (rare, but they can happen).
  • Lives only for one bit of truth: “changed” or “unchanged.”
  • Does not cover metadata right now. We must include it, so that when the metadata changes,
    the version of the data changes too.

Why We Must Liberate the Hash

A content hash should be deterministic — the same content deserves the same hash in every universe.
We need portability and transparency, not black-box wizardry.

And thus we need to write our own hash, using nothing but the C++ standard library.

The Plan

We design a simple function:

std::string content_hasher(const std::string& content);

It shall:

  • Combine type, content, and metadata. (right now we combine only type and content)
  • Use reproducible operations (e.g. FNV-1a variant, or another free and honest algorithm ~rms).
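A minimal sketch of what this could look like, assuming 64-bit FNV-1a and a three-argument signature (the signature above takes only content, so the extra type and metadata parameters are an assumption, as is the helper fnv1a_field and the choice to return 16 hex digits). Each field is prefixed with its length so that ("ab", "c") and ("a", "bc") do not feed the same byte stream into the hash. Note that FNV-1a is not a cryptographic hash: it is deterministic and portable, but it does not resist deliberate collisions.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a constants (standard offset basis and prime).
constexpr std::uint64_t kFnvOffset = 14695981039346656037ULL;
constexpr std::uint64_t kFnvPrime  = 1099511628211ULL;

// Fold raw bytes into the running hash: XOR the byte, then multiply.
inline std::uint64_t fnv1a_update(std::uint64_t h, const std::string& bytes) {
    for (unsigned char c : bytes) {
        h ^= c;
        h *= kFnvPrime;
    }
    return h;
}

// Hypothetical helper: fold a field preceded by its length
// (8 bytes, least significant first) so field boundaries are unambiguous.
inline std::uint64_t fnv1a_field(std::uint64_t h, const std::string& field) {
    const std::uint64_t n = field.size();
    for (int i = 0; i < 8; ++i) {
        h ^= (n >> (8 * i)) & 0xFFu;
        h *= kFnvPrime;
    }
    return fnv1a_update(h, field);
}

// Assumed signature: combines type, content, and metadata.
// Returns the hash as 16 lowercase hex digits.
inline std::string content_hasher(const std::string& type,
                                  const std::string& content,
                                  const std::string& metadata) {
    std::uint64_t h = kFnvOffset;
    h = fnv1a_field(h, type);
    h = fnv1a_field(h, content);
    h = fnv1a_field(h, metadata);

    static const char digits[] = "0123456789abcdef";
    std::string out(16, '0');
    for (int i = 15; i >= 0; --i) {
        out[i] = digits[h & 0xF];
        h >>= 4;
    }
    return out;
}
```

Because the arithmetic is fixed-width unsigned and the bytes are processed in a fixed order, the result should not depend on the compiler, the platform's endianness, or the standard library in use.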

Where to implement it

Start in utils.h. When it matures, we might give it its own home (migr_hash.h).

Guiding Principle

  • No external dependencies.
  • No random seeds.
  • Only reproducible, human-readable, pure mathematics.
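One way to hold ourselves to these principles is to pin known test vectors at compile time, so any build that computes a different value simply fails. The sketch below assumes 64-bit FNV-1a is the algorithm chosen and needs C++14 or later for the constexpr loop; the function name fnv1a_64 is hypothetical. The two expected values are the published FNV-1a 64-bit results for the empty string and for "a".

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical name; a plain 64-bit FNV-1a over a byte range.
// No seed, no external dependency: just XOR and multiply.
constexpr std::uint64_t fnv1a_64(const char* data, std::size_t len) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(data[i]);
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Pinned vectors: if a compiler or platform disagrees, the build breaks.
static_assert(fnv1a_64("", 0) == 0xcbf29ce484222325ULL,
              "empty input must yield the offset basis");
static_assert(fnv1a_64("a", 1) == 0xaf63dc4c8601ec8cULL,
              "FNV-1a 64-bit vector for \"a\"");
```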
