When we use std::hash<std::string> to detect content changes, we are at the mercy of invisible implementation details.
The Current Situation
Our humble std::hash:
- Detects content changes sometimes, unless it doesn’t :).
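- Pretends to be stable, but changes personality between compilers and standard libraries (the standard only promises consistent values within a single run of the program).
- Has collisions (rare, but they can happen).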
- Lives only for one bit of truth: “changed” or “unchanged.”
- Does not see metadata, because right now we leave it out of the hash. We must include it, so that when only the metadata changes, the version of the data changes too.
Why We Must Liberate the Hash
A content hash should be deterministic — the same content deserves the same hash in every universe.
We need portability and transparency, not black-box wizardry.
And thus we need to write our own hash, using nothing but the C++ standard library.
The Plan
We design a simple function:
std::string content_hasher(const std::string& content);
It shall:
- Combine type, content, and metadata. (right now we combine only type and content)
- Use reproducible operations (e.g. FNV-1a variant, or another free and honest algorithm ~rms).
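The plan above could be sketched roughly as follows. This is only one possible shape, not the final design: the three-argument overload of content_hasher, the helper names (fnv1a_64, mix_field, to_hex), the length prefix before each field, and the 16-character hex output are all assumptions made for illustration. The algorithm is plain 64-bit FNV-1a with its published offset basis and prime.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a over a byte string, continuing from `state`.
// The default `state` is the published FNV-1a 64-bit offset basis.
inline std::uint64_t fnv1a_64(const std::string& bytes,
                              std::uint64_t state = 0xcbf29ce484222325ULL) {
    for (unsigned char byte : bytes) {
        state ^= byte;               // xor first ...
        state *= 0x100000001b3ULL;   // ... then multiply by the FNV prime
    }
    return state;
}

// Hypothetical helper: feed the field length (8 bytes, little-endian)
// and then the field bytes, so ("ab", "c") and ("a", "bc") produce
// different byte streams.
inline std::uint64_t mix_field(std::uint64_t state, const std::string& field) {
    const std::uint64_t length = field.size();
    for (int i = 0; i < 8; ++i) {
        state ^= (length >> (8 * i)) & 0xffu;
        state *= 0x100000001b3ULL;
    }
    return fnv1a_64(field, state);
}

// Hypothetical helper: fixed-width lowercase hex, independent of locale.
inline std::string to_hex(std::uint64_t value) {
    static const char digits[] = "0123456789abcdef";
    std::string out(16, '0');
    for (int i = 15; i >= 0; --i) {
        out[i] = digits[value & 0xfu];
        value >>= 4;
    }
    return out;
}

// Assumed overload: combines type, content, and metadata in a fixed order.
inline std::string content_hasher(const std::string& type,
                                  const std::string& content,
                                  const std::string& metadata) {
    std::uint64_t state = 0xcbf29ce484222325ULL;
    state = mix_field(state, type);
    state = mix_field(state, content);
    state = mix_field(state, metadata);
    return to_hex(state);
}
```

The length prefix is a design choice rather than part of FNV-1a: without it, moving a byte from the end of one field to the start of the next would leave the hash unchanged. The hash is built from the bytes of the strings only, so it does not depend on the compiler, the platform, or a seed.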
Where to implement it
Start in utils.h. When it matures, we might give it its own home (migr_hash.h).
Guiding Principle
- No external dependencies.
- No random seeds.
- Only reproducible, human-readable, pure Mathematics.
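One way to hold ourselves to these principles is to pin a few known values at compile time, so that a build on any compiler fails loudly if the arithmetic ever drifts. The sketch below assumes 64-bit FNV-1a and a C++14 (or later) compiler; fnv1a_64_literal is a hypothetical name, and the expected values are the widely published FNV-1a 64-bit results for "", "a", and "abc".

```cpp
#include <cstdint>

// 64-bit FNV-1a over a NUL-terminated string, usable in constant expressions.
constexpr std::uint64_t fnv1a_64_literal(const char* text) {
    std::uint64_t state = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
    for (; *text != '\0'; ++text) {
        state ^= static_cast<unsigned char>(*text);
        state *= 0x100000001b3ULL;                // FNV 64-bit prime
    }
    return state;
}

// Golden values: same input, same hash, on every compiler.
static_assert(fnv1a_64_literal("") == 0xcbf29ce484222325ULL,
              "empty input must return the offset basis");
static_assert(fnv1a_64_literal("a") == 0xaf63dc4c8601ec8cULL,
              "FNV-1a 64-bit value for \"a\"");
static_assert(fnv1a_64_literal("abc") == 0xe71fa2190541574bULL,
              "FNV-1a 64-bit value for \"abc\"");
```

No seed, no dependency, and the whole algorithm fits in one loop that can be read and checked by hand.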