@liquidraver liquidraver commented Jan 22, 2026

My pet project outgrew itself, so I thought I'd share it here. Maybe someone will find a use for the information and the things I learned along the way.

MeshCore v2 Encryption: ChaCha20-Poly1305 Implementation

Requirements

Originally this implementation required repeaters to forward PAYLOAD_VER_2 packets (dev currently drops everything above VER_1).
Update: refactored so it uses PAYLOAD_VER_1.

Decisions

I decided to use ChaCha20-Poly1305 because we decrypt in software anyway. ESPs can do hardware-accelerated crypto, but I tried to stay universal. ChaCha is about 10x faster than the current AES-ECB+HMAC anyway, and it was designed for "low-end" systems like our embedded variants.
I decided to use a 12-byte nonce and a 12-byte tag because those are the smallest sizes that adhere to standards. (The nonce can be lowered to 8 bytes to save some minor airtime.)

Draft PR. If anyone has more to teach me, I'd be glad to hear it.

TL;DR

ChaCha20-Poly1305 authenticated encryption, replacing the AES-128-ECB scheme for enhanced security.
The implementation maintains backward compatibility with v1 packets while providing stronger cryptographic guarantees for "v2-capable" nodes, with as small an airtime increase as I could squeeze out: no handshakes and no additional packets.

This was a huge project. I worked on it one section at a time to preserve context windows and to try to understand everything I did along the way.
I'm not a crypto expert nor a coding guru, so read everything with this in mind.

I've tested everything I could on my daily-driver companion and my repeater. The code is still running on them and I'm using our mesh with it right now; of course, v2-encrypted traffic only flows between my repeater's admin commands and my other test companion.

And for the masochists, the full changelog. I "enhanced" my scribbles with Claude:

What Changed

Core Cryptographic Changes

  1. New Encryption Algorithm: ChaCha20-Poly1305 AEAD cipher

    • Key size: 32 bytes (256-bit)
    • Nonce size: 12 bytes (96-bit)
    • Authentication tag: 12 bytes (96-bit, truncated from 16-byte Poly1305 output)
    • AAD (Additional Authenticated Data): Includes packet metadata (hashes, type, version) for integrity protection
  2. Payload Version System

    • PAYLOAD_VER_1: Legacy AES-128-ECB (unchanged, remains default)
    • PAYLOAD_VER_2: New ChaCha20-Poly1305 encryption
    • Version negotiation via advertisement flags (ADV_FEAT1_CHACHA_CAPABLE)
  3. Channel Tagging

    • Added CHANNEL_FLAG_V2 to GroupChannel.flags byte
    • Channels can be marked as v2-capable via companion app commands
    • Default channel creation remains v1 for backward compatibility
  4. Packet Types Migrated to v2

    • PAYLOAD_TYPE_REQ (request packets)
    • PAYLOAD_TYPE_RESPONSE (response packets)
    • PAYLOAD_TYPE_TXT_MSG (text messages)
    • PAYLOAD_TYPE_GRP_TXT (group text messages)
    • PAYLOAD_TYPE_GRP_DATA (group data messages)
    • PAYLOAD_TYPE_PATH (path return packets)
    • PAYLOAD_TYPE_ANON_REQ (anonymous request packets)
  5. Forwarding Logic Updated

    • Repeaters now forward PAYLOAD_VER_2 packets (previously dropped)
    • Old firmware will drop v2 packets (expected behavior)

Security Hardening (Additional Fixes)

This PR also includes critical security hardening fixes identified during code audit:

  1. Packet Parsing Hardening (Packet::readFrom())

    • Changed length parameter from uint8_t to uint16_t to prevent truncation
    • Added comprehensive bounds checks before reading header, transport codes, path, and payload
    • Prevents out-of-bounds reads on malformed packets
  2. ACK Packet Parsing

    • Added payload_len >= 4 validation before reading ack_crc
    • Applied to both flood and direct route ACK handlers
    • Prevents buffer overreads on truncated ACK packets
  3. TRACE Packet Parsing

    • Added minimum length check (payload_len >= 9) before reading trace fields
    • Added bounds validation before hash match operations
    • Prevents out-of-bounds access during path tracing
  4. Bridge Compatibility

    • RS232Bridge and ESPNowBridge now compatible with uint16_t length parameter
    • No truncation issues when handling 256-byte packets
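
A minimal sketch of the ACK length check described above, not the PR's actual code. The helper name readAckCrc and the little-endian byte order are assumptions made for illustration; only the payload_len >= 4 rule comes from the description.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copies the 4-byte ack_crc out of an ACK payload,
// but only after confirming the payload is long enough to contain it.
// Returns false (and leaves *out_crc untouched) on a truncated packet.
inline bool readAckCrc(const uint8_t* payload, uint16_t payload_len, uint32_t* out_crc) {
    if (payload == nullptr || out_crc == nullptr || payload_len < 4) {
        return false;  // refuse to read past the end of the buffer
    }
    // Little-endian assembly is an assumption of this sketch.
    *out_crc = static_cast<uint32_t>(payload[0]) |
               (static_cast<uint32_t>(payload[1]) << 8) |
               (static_cast<uint32_t>(payload[2]) << 16) |
               (static_cast<uint32_t>(payload[3]) << 24);
    return true;
}
```

The same pattern (check the length, then read) applies to the TRACE fields with their own minimum of 9 bytes.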

Design Decisions in full

Why ChaCha20-Poly1305?

  1. Embedded Systems Optimized: ChaCha20 was specifically designed for software implementations on constrained devices, making it ideal for our embedded mesh nodes
  2. No Hardware Dependency: Unlike AES-GCM, ChaCha20-Poly1305 performs well in pure software, ensuring consistent performance across all platforms (ESP32, nRF52, RP2040, STM32)
  3. Side-Channel Resistance: ChaCha20's operations are naturally more resistant to timing side-channels compared to software AES implementations
  4. Proven Security: ChaCha20-Poly1305 is standardized (RFC 8439) and widely deployed (TLS 1.3, WireGuard)

Why 12-Byte (96-bit) Authentication Tag?

UPDATE: 8 byte tag is enough

  1. Security vs Airtime Balance: 96-bit tags provide 2^-96 forgery probability (cryptographically secure) while minimizing airtime overhead
  2. Industry Standard: 96-bit tags are standard and widely accepted as secure
  3. Airtime Optimization: 12 bytes vs 16 bytes saves 4 bytes per packet, reducing transmission time on LoRa
  4. Library Support: The rweather/Crypto library supports tag truncation via CHACHA_TAG_SIZE

Why 12-Byte Nonce?

UPDATE: Compressing nonce to 4 byte via SHA derivation

  1. Standard Size: 12-byte (96-bit) nonces are the standard for ChaCha20-Poly1305 (RFC 8439)
  2. Security: Provides 2^96 unique nonces, sufficient for the lifetime of any key
  3. Library Compatibility: Matches rweather/Crypto library expectations
  4. Nonce Generation: Uses hybrid strategy (random boot ID + counter + random salt) via Utils::getHardwareRandom()

AAD (Additional Authenticated Data) Implementation

AAD protects packet metadata from tampering:

  • What's included: dest_hash, src_hash (or channel_hash), payload_type, payload_version
  • Security benefit: Prevents attackers from modifying routing information or packet type without detection
  • Airtime impact: NONE - AAD is authenticated but not transmitted (computed from existing packet fields)
  • Compatibility: No breaking changes - AAD is computed from immutable packet header fields
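
As a rough illustration of how the AAD could be assembled from fields both sides already know, here is a sketch. The function name buildAad, the one-byte hash width, and the field order are all assumptions; the description above only lists which fields are covered.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical AAD layout: [dest_hash][src_hash or channel_hash][payload_type][payload_version].
// One-byte hashes and this ordering are assumptions of the sketch.
// Returns the number of AAD bytes written, or 0 if the output buffer is too small.
inline size_t buildAad(uint8_t dest_hash, uint8_t src_hash,
                       uint8_t payload_type, uint8_t payload_version,
                       uint8_t* out, size_t out_cap) {
    const size_t kAadLen = 4;
    if (out == nullptr || out_cap < kAadLen) {
        return 0;
    }
    out[0] = dest_hash;
    out[1] = src_hash;
    out[2] = payload_type;
    out[3] = payload_version;
    return kAadLen;
}
```

Sender and receiver must build byte-identical AAD; if a relay altered any of these fields, the receiver's tag check would fail.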

How It Works

Encryption Flow (v2)

  1. Packet Creation: Node checks if peer/channel supports v2 (via peerSupportsCHACHA() or CHANNEL_FLAG_V2)
  2. Nonce Generation: Creates 12-byte nonce using Utils::getHardwareRandom() (platform-specific RNG)
  3. AAD Construction: Builds AAD from packet metadata (hashes, type, version)
  4. Encryption: Calls Utils::encryptCHACHA() which:
    • Sets ChaCha20 key and nonce
    • Adds AAD for authentication
    • Encrypts plaintext
    • Computes 12-byte Poly1305 tag
    • Returns: [nonce (12)] [ciphertext] [tag (12)]
  5. Packet Assembly: Prepends routing hashes and sets PAYLOAD_VER_2 in header

Decryption Flow (v2)

  1. Version Detection: Reads PAYLOAD_VER_2 from packet header
  2. AAD Reconstruction: Rebuilds AAD from packet metadata (same fields as encryption)
  3. Decryption: Calls Utils::decryptCHACHA() which:
    • Extracts nonce, ciphertext, and tag
    • Sets ChaCha20 key and nonce
    • Adds AAD
    • Decrypts ciphertext
    • Verifies tag using constant-time comparison (secure_compare())
    • Returns plaintext length on success, 0 on authentication failure
  4. Validation: If tag check fails, packet is silently dropped
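
The tag check in step 3 relies on a constant-time comparison. The sketch below shows the general technique only; it is a stand-in written for illustration (the name tagsEqualConstantTime is hypothetical) and is not the secure_compare() used in the PR.

```cpp
#include <cstddef>
#include <cstdint>

// Compares two tags without an early exit, so the running time does not
// depend on the position of the first mismatching byte.
inline bool tagsEqualConstantTime(const uint8_t* a, const uint8_t* b, size_t len) {
    uint8_t diff = 0;
    for (size_t i = 0; i < len; ++i) {
        // Accumulate every differing bit; never branch on the data.
        diff = static_cast<uint8_t>(diff | (a[i] ^ b[i]));
    }
    return diff == 0;
}
```

A plain memcmp() may return at the first differing byte, which is the timing signal this pattern is meant to avoid.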

Capability Negotiation

  • Advertisement: Nodes advertise v2 support via ADV_FEAT1_CHACHA_CAPABLE flag in PAYLOAD_TYPE_ADVERT packets
  • Peer Tracking: ContactInfo.supports_chacha and ClientInfo.supports_chacha track peer capabilities
  • Automatic Selection: Nodes automatically use v2 when both sender and receiver support it
  • Graceful Degradation: Falls back to v1 if peer doesn't advertise v2 support
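
The selection rules above could be sketched like this. The enum, the function names, and the bit value chosen for the channel flag are assumptions for illustration; the PR names CHANNEL_FLAG_V2 and supports_chacha but this description does not give their definitions.

```cpp
#include <cstdint>

// Hypothetical bit value standing in for CHANNEL_FLAG_V2.
constexpr uint8_t kChannelFlagV2 = 0x01;

enum class PayloadCrypto : uint8_t { V1Aes, V2ChaCha };

// Direct messages: use v2 only when both ends are known to support it,
// otherwise fall back to v1.
inline PayloadCrypto selectForPeer(bool local_supports_chacha, bool peer_supports_chacha) {
    return (local_supports_chacha && peer_supports_chacha) ? PayloadCrypto::V2ChaCha
                                                           : PayloadCrypto::V1Aes;
}

// Channels: use v2 only when the channel has been explicitly flagged.
inline PayloadCrypto selectForChannel(bool local_supports_chacha, uint8_t channel_flags) {
    const bool flagged = (channel_flags & kChannelFlagV2) != 0;
    return (local_supports_chacha && flagged) ? PayloadCrypto::V2ChaCha
                                              : PayloadCrypto::V1Aes;
}
```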

Airtime Comparison

Packet Size Overhead

| Message Length | v1 (ECB) | v2 (ChaCha, 12-byte tag) | Difference |
|---|---|---|---|
| 5 bytes | 20 bytes | 31 bytes | +11 bytes |
| 10 bytes | 20 bytes | 36 bytes | +16 bytes |
| 20 bytes | 36 bytes | 46 bytes | +10 bytes |
| 50 bytes | 68 bytes | 76 bytes | +8 bytes |
| 100 bytes | 116 bytes | 126 bytes | +10 bytes |

Airtime (BW62.5, SF8, CR8)

| Message Length | v1 Airtime | v2 Airtime | Increase |
|---|---|---|---|
| 5 bytes | 181.25 ms | 230.40 ms | +49.15 ms |
| 10 bytes | 181.25 ms | 246.78 ms | +65.54 ms |
| 20 bytes | 246.78 ms | 279.55 ms | +32.77 ms |
| 50 bytes | 377.86 ms | 410.62 ms | +32.77 ms |
| 100 bytes | 574.46 ms | 607.23 ms | +32.77 ms |

Security Improvements

What v2 Provides

  1. Authenticated Encryption: Confidentiality + integrity in one operation
  2. Tamper Detection: AAD protects routing metadata from modification
  3. Constant-Time Verification: Tag comparison uses secure_compare() to prevent timing attacks
  4. Stronger Authentication: 96-bit tags vs 16-bit HMAC (2^96 vs 2^16 security level)
  5. Replay Protection: Same as v1 (hash-based duplicate detection), but stronger authentication makes replay detection more reliable

What v2 Doesn't Change

  1. Replay Protection: Still relies on in-memory hash cache (resets on reboot)
  2. ACK Encryption: PAYLOAD_TYPE_ACK packets remain unencrypted (by design, for low overhead)
  3. Key Management: Same shared secret derivation as v1
  4. Forwarding Behavior: Repeaters still forward packets (now including v2)

Backward Compatibility

  • v1 Packets: Continue to work unchanged
  • Old Firmware: Will drop v2 packets (expected)
  • Mixed Networks: v1 and v2 nodes can coexist (v1 nodes ignore v2 packets)
  • Channel Defaults: New channels default to v1 (must be explicitly upgraded to v2)

Resource Usage (CPU/RAM)

RAM Impact

  • Stack per operation: +200 bytes (temporary, ~350 bytes total vs ~146 bytes for v1)
  • Heap: No change (0 bytes - all stack-allocated)
  • Persistent RAM: No change (0 bytes)

CPU Impact

  • Performance: v2 is 10x faster than v1 (ChaCha20: ~3-4 cycles/byte vs AES-ECB: ~20-30 cycles/byte)
  • Busy repeater scenario (50 packets/sec): ~0.094% CPU usage (v2) vs ~0.94% CPU usage (v1)

Known Limitations / Future Work

  1. Messages on V2 channels can only be seen by V2-capable nodes

Breaking Changes

  • Old Repeaters: Will drop PAYLOAD_VER_2 packets

liquidraver commented Jan 25, 2026

Everybody loves numbers (SF8, BW62.5, CR8) (edit: corrected the math, initially I did not count the timestamp)

Real Message Examples

Note: Plaintext = message length + 4 bytes (timestamp prefix in TXT_MSG)

| Message | Plaintext | V1 Encrypted | V2 Encrypted | Diff (bytes) | Diff (%) |
|---|---|---|---|---|---|
| "k" (1) | 5 | 18 | 17 | -1 | -5.6% |
| "omw" (3) | 7 | 18 | 19 | +1 | +5.6% |
| "ping" (4) | 8 | 18 | 20 | +2 | +11.1% |
| "ranecko" (7) | 11 | 18 | 23 | +5 | +27.8% |
| "hello world" (11) | 15 | 18 | 27 | +9 | +50.0% |
| "hello world!" (12) | 16 | 18 | 28 | +10 | +55.6% |
| "hello world!!" (13) | 17 | 34 | 29 | -5 | -14.7% |
| "Vienna is in the clouds" (24) | 28 | 34 | 40 | +6 | +17.6% |
| "Vienna is in the clouds!" (25) | 29 | 34 | 41 | +7 | +20.6% |
| "you have to go to #ping for this" (33) | 37 | 50 | 49 | -1 | -2.0% |
| "The quick brown fox jumps over the lorem ipsum" (46) | 50 | 66 | 62 | -4 | -6.1% |
| "Hi guys, there is a new channel #portable for spots, mainly ham radio portable activities" (89) | 93 | 98 | 105 | +7 | +7.1% |

Overhead Calculation

V1 (AES-ECB + HMAC):

  • MAC: 2 bytes
  • Ciphertext: ceil(plaintext / 16) × 16
  • Overhead: 2 + (16 - (plaintext % 16)) % 16 → ranges 2-17 bytes

V2 (ChaCha20-Poly1305):

  • Counter: 4 bytes + Tag: 8 bytes = 12 bytes fixed
  • No padding needed

When Each Wins

  • V2 smaller: plaintext % 16 in range 1-5 (just past a block boundary, V1 pads 11-15 bytes)
  • V1 smaller: plaintext % 16 is 0 or in range 7-15 (V1 padding < 10 bytes)
  • Equal: plaintext % 16 = 6 (V1 padding is exactly 10 bytes)

The crossover point: V2 wins when (16 - (plaintext % 16)) % 16 > 10
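
The overhead formulas can be checked with a few lines of code. This is a sketch that mirrors the numbers stated in this comment (2-byte MAC plus 16-byte block padding for V1, 4-byte counter plus 8-byte tag for V2); the function and constant names are made up for illustration.

```cpp
#include <cstddef>

constexpr size_t kAesBlockLen = 16;  // AES block size
constexpr size_t kV1MacLen = 2;      // truncated HMAC
constexpr size_t kV2CounterLen = 4;  // transmitted counter
constexpr size_t kV2TagLen = 8;      // truncated Poly1305 tag

// V1: plaintext is padded up to a whole number of AES blocks, plus the MAC.
inline size_t v1EncryptedLen(size_t plaintext_len) {
    const size_t blocks = (plaintext_len + kAesBlockLen - 1) / kAesBlockLen;
    return kV1MacLen + blocks * kAesBlockLen;
}

// V2: stream cipher, so no padding; fixed 12 bytes of overhead.
inline size_t v2EncryptedLen(size_t plaintext_len) {
    return plaintext_len + kV2CounterLen + kV2TagLen;
}
```

For example, a 17-byte plaintext gives 34 bytes under V1 and 29 under V2, while a 16-byte plaintext gives 18 and 28, matching the table rows for "hello world!!" and "hello world!".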

Security Comparison

| Property | V1 | V2 |
|---|---|---|
| Cipher | AES-128-ECB | ChaCha20 |
| Authentication | HMAC-SHA256 (truncated) | Poly1305 |
| Auth Tag Size | 16 bits | 64 bits |
| Forgery Resistance | 1 in 65,536 | 1 in 18 quintillion |
| Mode | ECB (no diffusion) | Stream cipher + AEAD |
| Nonce Handling | None | Counter-derived |
| AAD Support | No | Yes |

Summary

  • V2 overhead is predictable (always +12 bytes)
  • V1 overhead varies +2 to +17 bytes based on plaintext alignment
  • V2 is larger for most typical chat messages (short messages under ~12 chars after timestamp)
  • V2 wins for messages landing just past AES block boundaries (13-17, 29-33, 45-49 chars, etc.)
  • V2 provides 281 trillion times stronger forgery resistance (2^64 vs 2^16)
  • The security upgrade is the primary benefit; airtime savings are situational

@liquidraver

Store now - decrypt later (post-quantum stuff) notes:

To be safe:

  1. Meet in person / exchange keys via other channels / QR code
  2. Pre-add each other as contacts (both sides have public keys)
  3. Communicate only via TXT_MSG / REQ / RESPONSE
  4. Never advertise, never use ANON_REQ (eg. repeater login)

@liquidraver

Note on Ascon:

Ascon requires a full 128-bit nonce - no compression like with ChaCha. Even with a truncated 64-bit tag, Ascon overhead is 24 bytes vs 12 bytes.
Airtime penalty is bigger.

(CPU stats, but they don't really matter, chacha is 10x faster than AES-ECB already)
ARM: Ascon is way faster than ChaCha20-Poly1305 (~60%)
ESP32: Ascon is way slower because ESP32 lacks native word rotation instructions that Ascon relies on (~30-40%)

Conclusion: chacha is better, more homogeneous, battle-tested

@liquidraver

Some tests

bootlog_2026-01-26_13-09-55.log

jbrazio commented Jan 27, 2026

With the proper algo we can get >25% average payload compression, but due to the padding required, the air time savings are negligible with v1.

Does this version still require boundary padding?

V2 must include [at least] 2 byte ids to stop the collision problem.

liquidraver commented Jan 27, 2026

> With the proper algo we can get >25% average payload compression, but due to the padding required, the air time savings are negligible with v1.
>
> Does this version still require boundary padding?
>
> V2 must include [at least] 2 byte ids to stop the collision problem.

No padding needed, and I've compressed everything I can in the crypto layer while maintaining acceptable security. No LZW; I focused on security enhancements only.
I'll check for 2-byte IDs and LZW later if needed.
UPDATE: LZW won't work on short messages. Unishox2 would be a good candidate considering emojis and UTF-8, but where we gain ~30% on some messages we might lose ~20% on others (mixed emoji/Chinese/etc. characters).

I've tried to be backwards compatible so it can roll out without real breaking changes, but if the 2-byte ID is needed then the backwards compatibility can be scrubbed from the code.

the only requirement for this to work right now is for repeaters to forward payload v2. we are now dropping it in src/Mesh.cpp:
(DispatcherAction Mesh::onRecvPacket(Packet* pkt) {
if (pkt->getPayloadVer() > PAYLOAD_VER_1) { // not supported in this firmware version)

liquidraver commented Jan 28, 2026

  • refactored so it will use V1 payload

  • completely backwards compatible: V1 nodes just drop V2 packets (if they ever get one; they shouldn't, because V2 nodes know from adverts who they can send V2 packets to)

  • V2 nodes will:

    • try to send/decrypt V2 first if they know the other party is V2 capable (from advert)
    • fall back to V1 if V2 decrypt failed
    • try V1 before V2 decrypt if the node is in "unknown status" (e.g. no advert yet, so we don't know whether it is V2 capable) to save some CPU
    • The only trap is if someone upgrades to V2, sends an advert, downgrades to V1 and doesn't advertise again. Then we will think it's V2 and will keep trying to communicate with chacha until we delete the old advert or the node sends a V1 advert
  • Channels need to be explicitly flagged for V2 traffic for V2 nodes to use chacha in channel flood. V2 channels will be silent for V1 nodes.

Force-pushing repeater "first login chacha sensing" too. Repeaters will now respond with chacha if chacha decrypt succeeded. All paths are secured. No downgrade possible between two V2-capable nodes.
The only scenario where AES is used between two V2 nodes: if somehow neither has received the other's advert (e.g. the two nodes were added by hand in the app). That's outside FW capabilities; the app needs to be modified so V2 nodes can be marked when adding them manually. The "supports_chacha" field in "ContactInfo" marks peers as "chacha capable".
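
The try-order described in the list above could look roughly like the sketch below. The function-pointer type and the name decryptWithFallback are assumptions for illustration; the convention that a decryptor returns the plaintext length on success and 0 on failure follows the decryption flow described earlier in this PR.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical decryptor signature: returns plaintext length, or 0 on failure.
using DecryptFn = int (*)(const uint8_t* in, size_t len, uint8_t* out);

// Tries the scheme most likely to succeed first, then the other one.
// Peers known (from an advert) to be ChaCha capable get V2 first;
// peers of unknown status get V1 first to save CPU.
inline int decryptWithFallback(bool peer_known_chacha_capable,
                               DecryptFn decrypt_v1, DecryptFn decrypt_v2,
                               const uint8_t* in, size_t len, uint8_t* out) {
    DecryptFn first = peer_known_chacha_capable ? decrypt_v2 : decrypt_v1;
    DecryptFn second = peer_known_chacha_capable ? decrypt_v1 : decrypt_v2;
    const int n = first(in, len, out);
    return (n > 0) ? n : second(in, len, out);
}
```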

liquidraver commented Jan 28, 2026

Unused byte at offset 287 in the contacts file format is used to retain ChaCha capability of a client (after reboot, etc)
CMD_SET_CHANNEL/GET_CHANNEL got an extra byte marking channel for ChaCha

currently the channel sync is not compatible with the MC app, app does not ignore the trailing byte :) commented it out, need to think it over

@liquidraver liquidraver force-pushed the cryptov2 branch 3 times, most recently from e65abb8 to 321e5ed Compare January 28, 2026 20:12

liquidraver commented Jan 28, 2026

CMD_GET_CHANNEL --> Returns 50 bytes (no flags) - for backwards compatibility
CMD_GET_CHANNEL_V2 --> Returns 51 bytes (with flags) - new way
CMD_SET_CHANNEL --> Accepts 50 or 51 bytes (optional flags) - both

RESP_CODE_CHANNEL_INFO --> 50 bytes
RESP_CODE_CHANNEL_INFO_V2 --> 51 bytes
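
A sketch of how a receiver might accept both frame sizes, assuming the flags byte is the trailing byte of the 51-byte form and that a missing flags byte means "no flags set". The function name and the constants are hypothetical; the layout of the first 50 bytes is not covered here.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kChannelFrameLegacyLen = 50;     // no flags byte
constexpr size_t kChannelFrameWithFlagsLen = 51;  // trailing flags byte

// Extracts the channel flags from a 50- or 51-byte frame.
// Returns false for any other length so malformed frames are rejected.
inline bool readChannelFlags(const uint8_t* frame, size_t len, uint8_t* out_flags) {
    if (frame == nullptr || out_flags == nullptr) {
        return false;
    }
    if (len == kChannelFrameLegacyLen) {
        *out_flags = 0;  // legacy frame: treat as an unflagged (v1) channel
        return true;
    }
    if (len == kChannelFrameWithFlagsLen) {
        *out_flags = frame[kChannelFrameWithFlagsLen - 1];
        return true;
    }
    return false;
}
```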

@liquidraver

Room server note:
The room server acts as a crypto bridge: it speaks whatever encryption each client supports. Messages are decrypted on receipt, stored in plaintext, and re-encrypted per-recipient on sync.

I thought about data-at-rest encryption, but the private key is on the device itself, so it isn't needed; physical access to a device is always game over. Protect your private keys, guys :)

jimdigriz commented Jan 29, 2026

> UPDATE: Compressing nonce to 4 byte via SHA derivation

Just use a plain counter as per the RFC7539, section 2.6. After a reboot, just regenerate a new key.

'Compress'ing stuff through a function to generate a nonce is not going to improve security, it actually might make it worse.

ChaCha20 requires you guarantee uniqueness in your nonces for a given key. A plain counter provides this guarantee.

On that note, I suspect you may be able to go down to a 16-bit nonce counter, but you have to guarantee that you use a new key when the counter wraps. The nonce is now just an indicator of when a key exchange must take place; you of course do the exchange proactively before you get close to the wraparound.

> With the proper algo we can get >25% average payload compression

Dragons may lurk here, combining encryption and compression is a box of spiders.

OpenSSH looks to have mitigated the issue by delaying compression until after authentication completes whilst TLS 1.3 just removed the footgun altogether.

- Transmit only 4-byte counter [boot_id(2)][sequence(2)]
- Derive full 12-byte nonce internally: SHA256(key || counter)[0:12]
- Combined with tag reduction: V2 overhead now 12 bytes (was 24)
- Saves ~16.5ms airtime per packet at SF8/BW62.5/CR8
- Security maintained: nonce unpredictable without shared secret
repeater first login chacha sensing
contact marking persistence
channel sync V2
@liquidraver

Started to rewrite a lot of things after consulting with @nextgens.
This is raw; I need more time to understand everything and do a cleanup if needed.

src/Utils.cpp Outdated
Comment on lines 354 to 358
// Counter format: [epoch (2 bytes random)] [sequence (2 bytes incrementing)]
// - epoch: random value, regenerated on boot or sequence wraparound
// - sequence: 0-65535, increments each message
// This guarantees uniqueness across devices (different epochs) and
// within long-running sessions (epoch changes on wraparound)
@jimdigriz jimdigriz Jan 30, 2026

Why the random number?

Is it not possible to negotiate a new key?

If not, you could store the counter every N messages (say 100 or more) and after a reboot, read back the saved counter and increment it by N+X (where X>=1) or even 2*N?

Some decision will have to be made about what to do once you wrap the counter, though sending ~4bn messages to someone does seem unlikely :)

Contributor Author

it's "raw" meaning I just started to transform the old chacha stuff :) thanks for pointing that out!

Contributor Author

negotiation won't happen, airtime is a huge factor

@jimdigriz jimdigriz Jan 30, 2026

...airtime once per 2^{16 or 32} packets is too much of an overhead?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

> Why the random number?

The point is that you can't persist stuff reliably. What is proposed is a random boot_id (lifetime is the uptime or wrap-around) combined with a simple counter.

It doesn't depend on state keeping past uptime and the added benefit is that we can do stats on the boot_ids and identify any problem in the RNG at network's scale.

The counter is shared amongst all invocations: it's not per-peer/key

> Is it not possible to negotiate a new key?

"negotiate" no, the aim of the PR is to make something backward compatible that keeps the existing packet format/structure. Fancier stuff (ECDH, ...) will require more work. What's proposed in this PR won't have forward secrecy.

That said the plan is to diversify keys (if only because existing keys are already reused elsewhere in the codebase) as that does not cost anything (we're not reusing key schedules) and solves the counter-wrapping problem.
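
For readers following the counter discussion, here is a minimal sketch of the "[epoch (2 bytes random)][sequence (2 bytes incrementing)]" format from the quoted code comment. It is not the PR's code: the type and function names, the big-endian byte order, and the caller-supplied RNG callback are all assumptions.

```cpp
#include <cstdint>

// Caller-supplied source of random 16-bit values (hypothetical stand-in
// for the platform RNG).
using Rng16 = uint16_t (*)();

struct CounterState {
    uint16_t epoch;     // random, regenerated on boot or on sequence wraparound
    uint16_t sequence;  // 0-65535, increments with each message
};

// Writes the current counter as [epoch (2)][sequence (2)] and then advances
// the state for the next message. Big-endian order is an assumption.
inline void emitCounter(CounterState& state, Rng16 rng, uint8_t out[4]) {
    out[0] = static_cast<uint8_t>(state.epoch >> 8);
    out[1] = static_cast<uint8_t>(state.epoch & 0xFF);
    out[2] = static_cast<uint8_t>(state.sequence >> 8);
    out[3] = static_cast<uint8_t>(state.sequence & 0xFF);
    if (state.sequence == 0xFFFF) {
        state.sequence = 0;
        state.epoch = rng();  // new epoch once the sequence space is exhausted
    } else {
        ++state.sequence;
    }
}
```

Note that a random 16-bit epoch can repeat by chance, so on its own this sketch does not guarantee nonce uniqueness under one key; the key diversification mentioned above is what is meant to address that.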
