Encrypted Weights

Ship ML models to a browser without giving away the IP. AES-256-GCM encryption at the file level, key delivery via your backend, WebCrypto decrypt at session start, inference in onnxruntime-web from the decrypted bytes.

The threat model

✅ A casual visitor sees .enc files in DevTools and gets ciphertext
✅ A scraper without your key gets nothing usable
✅ Different keys per customer / per session for licensing control
⚠️ A determined reverse engineer can extract your key from the page, or skip the key entirely and dump the decrypted bytes from WASM memory. This is unavoidable for any client-side ML; the goal is raising the bar, not absolute secrecy

File format

.enc is just AES-256-GCM ciphertext:

[ 12 bytes IV ][ N bytes ciphertext ][ 16 bytes auth tag ]

WebCrypto's AES-GCM mode expects the ciphertext and tag together, so you pass (ciphertext || tag) as the input to decrypt(). Standard, interoperable, and fast: browsers typically use hardware AES (AES-NI on x86, the ARM crypto extensions on most iOS/Android devices).
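To make the layout concrete, here is a minimal sketch that splits a .enc blob into its three parts. The names `split_enc` and `EncParts` are illustrative and not part of any tool in this repo; the 12 and 16 byte lengths come from the layout above.

```python
from typing import NamedTuple

IV_LEN = 12    # bytes of IV at the start of the file
TAG_LEN = 16   # bytes of GCM auth tag at the end of the file


class EncParts(NamedTuple):
    iv: bytes
    ciphertext: bytes
    tag: bytes


def split_enc(blob: bytes) -> EncParts:
    """Split [ IV ][ ciphertext ][ tag ] into its components."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("blob too short to be a valid .enc file")
    return EncParts(
        iv=blob[:IV_LEN],
        ciphertext=blob[IV_LEN:len(blob) - TAG_LEN],
        tag=blob[len(blob) - TAG_LEN:],
    )
```

You rarely need the tag on its own, since both WebCrypto and the Python `cryptography` package take ciphertext and tag as one buffer. The split is mostly useful for debugging a file that fails to decrypt.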

Encrypt (build-time, Python)

# wasm/tools/encrypt_models.py
import secrets
from pathlib import Path
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = secrets.token_bytes(32)            # 256-bit key
Path('.model_key').write_text(key.hex())  # gitignored; ship via your backend
aesgcm = AESGCM(key)

for name in ['facex_detect.onnx', 'facex_tiny.onnx']:  # ... plus the rest of your models
    plain = Path(name).read_bytes()
    iv = secrets.token_bytes(12)          # fresh 96-bit IV per file; never reuse one under the same key
    ct = aesgcm.encrypt(iv, plain, None)  # ciphertext with the 16-byte tag appended
    Path(name).with_suffix('.enc').write_bytes(iv + ct)
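Before uploading, it can be worth checking that each .enc file decrypts back to the original bytes. The sketch below is one way to do that with the same `cryptography` package; `decrypt_enc` and `roundtrip_ok` are hypothetical helper names, not functions from encrypt_models.py.

```python
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def decrypt_enc(blob: bytes, key: bytes) -> bytes:
    """Decrypt an [ IV ][ ciphertext ][ tag ] blob; raises InvalidTag on failure."""
    iv, ct_and_tag = blob[:12], blob[12:]
    return AESGCM(key).decrypt(iv, ct_and_tag, None)


def roundtrip_ok(plain: bytes, blob: bytes, key: bytes) -> bool:
    """True if blob decrypts under key to exactly the original plaintext."""
    try:
        return decrypt_enc(blob, key) == plain
    except InvalidTag:
        # Wrong key, truncated file, or any modified byte ends up here.
        return False
```

A failed check at build time is much cheaper to diagnose than an OperationError in a customer's browser.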

Decrypt (browser, WebCrypto)

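async function loadEncryptedModel(url, key) {
  const resp = await fetch(url);
  if (!resp.ok) throw new Error(`model fetch failed: ${resp.status}`);
  const buf = new Uint8Array(await resp.arrayBuffer());
  const iv = buf.subarray(0, 12);   // first 12 bytes: IV
  const ct = buf.subarray(12);      // rest: ciphertext with the 16-byte tag appended
  const k = await crypto.subtle.importKey(
    'raw', key, { name: 'AES-GCM' }, false, ['decrypt']);
  // decrypt() rejects with OperationError if the key is wrong or the data was modified
  const onnx = new Uint8Array(
    await crypto.subtle.decrypt({ name: 'AES-GCM', iv }, k, ct));
  try {
    return await ort.InferenceSession.create(onnx, {
      executionProviders: ['wasm']
    });
  } finally {
    onnx.fill(0);   // wipe plaintext from JS heap, even if session creation fails
  }
}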

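After InferenceSession.create() the model lives inside the WASM heap. Zeroing the JS-side buffer removes the second plaintext copy from the JS heap and limits the window where it's accessible from outside WASM; the copy inside the WASM heap stays for the life of the session.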

Where the key comes from

For a public demo: hardcoded in JS (split + XOR-obfuscated to slow down trivial extraction). Anyone determined gets it in ~10 minutes.
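The split + XOR step can be done at build time. A minimal sketch, assuming you emit two byte strings into your JS bundle and XOR them back together in the browser before calling importKey; `split_key` and `join_key` are illustrative names, and this is obfuscation only, not protection.

```python
import secrets
from typing import Tuple


def split_key(key: bytes) -> Tuple[bytes, bytes]:
    """Split key into two shares whose XOR is the original key."""
    mask = secrets.token_bytes(len(key))
    masked = bytes(k ^ m for k, m in zip(key, mask))
    return mask, masked


def join_key(mask: bytes, masked: bytes) -> bytes:
    """Recombine the two shares (the browser would do the same XOR)."""
    if len(mask) != len(masked):
        raise ValueError("shares must have the same length")
    return bytes(a ^ b for a, b in zip(mask, masked))
```

Neither share looks like the key on its own, which defeats a plain string search of the bundle. Setting a breakpoint on importKey still recovers the key.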

For production: fetch it from your backend on session start.

Express.js example

import express from 'express';
import { authenticate, getCustomerKey } from './your-auth.js';

const app = express();

app.get('/api/model-key', authenticate, async (req, res) => {
  const key = await getCustomerKey(req.user.id);   // hex string encoding the 32-byte key
  res.set({
    'Cache-Control': 'no-store',
    'Content-Type': 'application/octet-stream',
  });
  res.send(Buffer.from(key, 'hex'));
});

app.listen(3000);

Browser:

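const resp = await fetch('/api/model-key', { credentials: 'include' });
if (!resp.ok) throw new Error(`key fetch failed: ${resp.status}`);   // e.g. not logged in
const keyBuf = await resp.arrayBuffer();
const sess = await loadEncryptedModel('facex_detect.enc', new Uint8Array(keyBuf));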

FastAPI example

from fastapi import FastAPI, Depends, Response
from your_auth import authenticate, get_customer_key

app = FastAPI()

@app.get('/api/model-key')
async def model_key(user = Depends(authenticate)):
    return Response(
        content=bytes.fromhex(await get_customer_key(user.id)),
        media_type='application/octet-stream',
        headers={'Cache-Control': 'no-store'},
    )

Per-customer keys

If you license the engine per company, give each one a unique key. You can encrypt the model bytes once per key. Or, smarter: encrypt the model once with a master key and use AES Key Wrap (RFC 3394; AES-KW in WebCrypto) to wrap the master key under each customer key. Then you only rotate the customer wrappers, not the model files. The trade-off is that every customer ends up unwrapping the same master key.
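A minimal sketch of the key-wrap variant, using `aes_key_wrap` and `aes_key_unwrap` from the `cryptography` package. The function names and customer ids are hypothetical, and how you store or serve the wrapped blobs is left open.

```python
from typing import Dict

from cryptography.hazmat.primitives.keywrap import aes_key_unwrap, aes_key_wrap


def wrap_for_customers(master_key: bytes,
                       customer_keys: Dict[str, bytes]) -> Dict[str, bytes]:
    """Wrap the one master key under each customer's own 32-byte key."""
    return {
        customer_id: aes_key_wrap(customer_key, master_key)
        for customer_id, customer_key in customer_keys.items()
    }


def unwrap_master(customer_key: bytes, wrapped: bytes) -> bytes:
    """Recover the master key; raises InvalidUnwrap if the key is wrong."""
    return aes_key_unwrap(customer_key, wrapped)
```

Each wrapped blob is 8 bytes longer than the master key, so 40 bytes for a 256-bit key. In the browser the matching step would be an unwrap with AES-KW before decrypting the model.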

Suspended a customer? Stop serving their key. Their existing browser sessions die after the next reload.
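Revocation can be as small as a flag checked before the key is returned. The sketch below is an in-memory stand-in for whatever sits behind get_customer_key; the class and method names are assumptions, and a real deployment would back this with your database.

```python
from typing import Dict, Optional, Set


class CustomerKeyStore:
    """In-memory key lookup with a suspension list (illustrative only)."""

    def __init__(self) -> None:
        self._keys: Dict[str, str] = {}      # customer id -> key as hex
        self._suspended: Set[str] = set()

    def add(self, customer_id: str, key_hex: str) -> None:
        self._keys[customer_id] = key_hex

    def suspend(self, customer_id: str) -> None:
        self._suspended.add(customer_id)

    def get_key_hex(self, customer_id: str) -> Optional[str]:
        """Return the key, or None for unknown or suspended customers."""
        if customer_id in self._suspended:
            return None
        return self._keys.get(customer_id)
```

The endpoint would answer with an error status such as 403 when the lookup returns None. Because the key response is sent with Cache-Control: no-store, the browser has to ask again on the next reload.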

Domain binding (cheap extra step)

Sign the response with HMAC over (key || origin) and verify it client-side. Doesn't prevent extraction, but it does prevent simple proxying of your endpoint from a different domain.
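A minimal sketch of the signing and checking logic, using HMAC-SHA256 from the standard library. The function names are illustrative. It assumes the server takes the origin from the request's Origin header and the client compares against its own page origin; since HMAC is symmetric, the verifying side needs the same secret, so in a browser that secret ships with your code.

```python
import hashlib
import hmac


def sign_key_response(secret: bytes, key: bytes, origin: str) -> bytes:
    """HMAC-SHA256 over (key || origin). The key has a fixed length, so the concatenation is unambiguous."""
    message = key + origin.encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).digest()


def verify_key_response(secret: bytes, key: bytes, origin: str, tag: bytes) -> bool:
    """Recompute the tag for the expected origin and compare in constant time."""
    expected = sign_key_response(secret, key, origin)
    return hmac.compare_digest(expected, tag)
```

A key response relayed through a proxy on another domain carries a tag for the wrong origin, so the check fails there.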

What this stops vs doesn't

Attack	Stopped?
`curl https://your.app/facex_xs.enc`	✅ ciphertext only
DevTools → Network tab snoop	✅ ciphertext only
Bot scraping your repo	✅ no plaintext on disk
Key extraction by determined attacker	❌ key is recoverable from the client; plaintext can be dumped from the WASM heap
Per-customer revocation	✅ stop serving the key

For higher security: move inference to a server you control. You lose the "private by construction" story but get full IP protection.
