Encrypted Weights
Ship ML models to a browser without giving away the IP. AES-256-GCM
encryption at the file level, key delivery via your backend, WebCrypto
decrypt at session start, inference in onnxruntime-web from the
decrypted bytes.
- ✅ A casual visitor sees `.enc` files in DevTools and gets ciphertext
- ✅ A scraper without your key gets nothing usable
- ✅ Different keys per customer / per session for licensing control

⚠️ A determined reverse engineer can dump the decrypted bytes from WASM memory after they extract your key. This is unavoidable for any client-side ML; the goal is raising the bar, not absolute secrecy.

An `.enc` file is just AES-256-GCM ciphertext:

```
[ 12 bytes IV ][ N bytes ciphertext ][ 16 bytes auth tag ]
```

WebCrypto's AES-GCM `decrypt` expects the auth tag appended to the ciphertext, so everything after the IV is passed as a single input (ciphertext || tag). The format is standard, interoperable, and fast (uses AES-NI on x86 browsers, hardware on iOS/Android).
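
To make the layout above concrete, here is a minimal Python sketch that splits a blob into its three parts. The helper name `split_enc` and the constants are hypothetical, not part of the project's tooling, and the sample bytes are dummy data rather than real ciphertext.

```python
IV_LEN = 12   # bytes of IV at the start of the file
TAG_LEN = 16  # bytes of GCM auth tag at the end of the file


def split_enc(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Split an .enc blob into (iv, ciphertext, tag) following the layout above."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("blob is too short to contain an IV and an auth tag")
    iv = blob[:IV_LEN]
    ciphertext = blob[IV_LEN:len(blob) - TAG_LEN]
    tag = blob[len(blob) - TAG_LEN:]
    return iv, ciphertext, tag


# Dummy bytes, not real ciphertext: 12-byte IV, 5-byte body, 16-byte tag.
iv, ciphertext, tag = split_enc(b"I" * 12 + b"hello" + b"T" * 16)
print(len(iv), len(ciphertext), len(tag))  # 12 5 16
```

In the browser you never need to separate the tag yourself; the split is only useful for inspecting or validating files at build time.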
Encrypt at build time:

```python
# wasm/tools/encrypt_models.py
import secrets
from pathlib import Path

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = secrets.token_bytes(32)  # 256-bit key
Path('.model_key').write_text(key.hex())  # gitignored; ship via your backend

# The trailing '...' stands for the rest of your model files; replace it before running.
for name in ['facex_detect.onnx', 'facex_tiny.onnx', ...]:
    plain = Path(name).read_bytes()
    iv = secrets.token_bytes(12)  # fresh random 96-bit IV per file
    ct = AESGCM(key).encrypt(iv, plain, None)  # returns ciphertext || tag
    Path(name.replace('.onnx', '.enc')).write_bytes(iv + ct)
```

Decrypt in the browser:

```javascript
async function loadEncryptedModel(url, key) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Failed to fetch ${url}: ${res.status}`);
  const buf = new Uint8Array(await res.arrayBuffer());
  const iv = buf.subarray(0, 12);  // first 12 bytes: IV
  const ct = buf.subarray(12);     // the rest: ciphertext || auth tag
  const k = await crypto.subtle.importKey(
    'raw', key, { name: 'AES-GCM' }, false, ['decrypt']);
  // decrypt() rejects if the auth tag does not verify
  const onnx = new Uint8Array(
    await crypto.subtle.decrypt({ name: 'AES-GCM', iv }, k, ct));
  const sess = await ort.InferenceSession.create(onnx, {
    executionProviders: ['wasm']
  });
  onnx.fill(0);  // wipe plaintext from JS heap
  return sess;
}
```

After `InferenceSession.create()` the model lives inside the WASM heap. Zeroing the JS-side buffer limits the window where it's accessible from outside WASM.
Where the key lives depends on the deployment. For a public demo: the key is hardcoded in JS (split + XOR-obfuscated to slow down trivial extraction). Anyone determined gets it in ~10 minutes.
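
As an illustration of the split + XOR idea, the sketch below generates two shares at build time whose XOR is the key. It is an assumption about how such obfuscation could be done, not the demo's actual code; `split_key` and `join_key` are hypothetical names, and the page would perform the equivalent of `join_key` in JS at load time.

```python
import secrets


def split_key(key: bytes) -> tuple[bytes, bytes]:
    """Split key into two shares whose byte-wise XOR equals the key."""
    mask = secrets.token_bytes(len(key))
    masked = bytes(k ^ m for k, m in zip(key, mask))
    return mask, masked


def join_key(mask: bytes, masked: bytes) -> bytes:
    """Recombine the two shares into the original key."""
    return bytes(a ^ b for a, b in zip(mask, masked))


key = secrets.token_bytes(32)
mask, masked = split_key(key)
assert join_key(mask, masked) == key  # the shares round-trip to the key
```

Neither share alone reveals the key, but both ship in the same bundle, so this only defeats a plain string search.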
For production: fetch the key from your backend on session start. With Express:
```javascript
import express from 'express';
import { authenticate, getCustomerKey } from './your-auth.js';

const app = express();

app.get('/api/model-key', authenticate, async (req, res) => {
  const key = await getCustomerKey(req.user.id);  // 32 bytes, hex-encoded
  res.set({
    'Cache-Control': 'no-store',
    'Content-Type': 'application/octet-stream',
  });
  res.send(Buffer.from(key, 'hex'));
});

app.listen(3000);
```

Browser:
```javascript
const res = await fetch('/api/model-key', { credentials: 'include' });
// Without this check, an error body (e.g. a 401 page) would be imported as the key.
if (!res.ok) throw new Error(`Key request failed: ${res.status}`);
const keyBuf = await res.arrayBuffer();
const sess = await loadEncryptedModel('facex_detect.enc', new Uint8Array(keyBuf));
```

The same endpoint in FastAPI:

```python
from fastapi import FastAPI, Depends, Response
from your_auth import authenticate, get_customer_key

app = FastAPI()


@app.get('/api/model-key')
async def model_key(user = Depends(authenticate)):
    return Response(
        content=bytes.fromhex(await get_customer_key(user.id)),
        media_type='application/octet-stream',
        headers={'Cache-Control': 'no-store'},
    )
```

If you license the engine per company, give each one a unique key. Either encrypt the model bytes once per customer key or, smarter, encrypt the model once with a master key and use AES key wrap (AES-KW with a 256-bit key, RFC 3394) to wrap the master key under each customer key. Then you only rotate the small customer wrappers, not the model ciphertext.
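
The master-key variant can be sketched with the `keywrap` helpers from the same `cryptography` package used by the encryption script. The customer IDs and variable names are hypothetical, and this only shows the wrapping step, not how the wrappers are stored or served.

```python
import secrets

from cryptography.hazmat.primitives.keywrap import aes_key_unwrap, aes_key_wrap

# The model is encrypted once under this key.
master_key = secrets.token_bytes(32)

# Hypothetical customers, each with their own 256-bit key.
customer_keys = {cid: secrets.token_bytes(32) for cid in ("acme", "globex")}

# One small wrapper per customer; the model ciphertext is shared by all of them.
wrapped = {cid: aes_key_wrap(k, master_key) for cid, k in customer_keys.items()}

# Client side, conceptually: unwrap with the customer key, then decrypt the model.
assert aes_key_unwrap(customer_keys["acme"], wrapped["acme"]) == master_key
print(len(wrapped["acme"]))  # 40: RFC 3394 adds 8 bytes to the 32-byte key
```

Note the trade-off: every customer ends up holding the same master key after unwrapping, so rotating a wrapper cuts off future delivery but does not change the key that protects the model itself.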
Suspended a customer? Stop serving their key. Their existing browser sessions die after the next reload.
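
A minimal sketch of that revocation check, assuming an in-memory store; `KeyStore` and `KeyRevoked` are hypothetical names, and a real backend would keep this state in a database and map the exception to an HTTP 403.

```python
class KeyRevoked(Exception):
    """Raised when a suspended customer asks for their key."""


class KeyStore:
    """In-memory stand-in for whatever backs get_customer_key()."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}
        self._suspended: set[str] = set()

    def add(self, customer_id: str, key: bytes) -> None:
        self._keys[customer_id] = key

    def suspend(self, customer_id: str) -> None:
        self._suspended.add(customer_id)

    def get_customer_key(self, customer_id: str) -> bytes:
        # Revocation is simply refusing to hand out the key.
        if customer_id in self._suspended:
            raise KeyRevoked(customer_id)
        return self._keys[customer_id]


store = KeyStore()
store.add("acme", b"k" * 32)
store.suspend("acme")  # from now on, key requests for "acme" fail
```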
Sign the response with HMAC over (key || origin) and verify it
client-side. Doesn't prevent extraction, but it does prevent simple
proxying of your endpoint from a different domain.
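
The signing and verification logic might look like the sketch below, using HMAC-SHA256 from the Python standard library. The function names and the shared `secret` are assumptions; in the browser the same check would be done with WebCrypto's HMAC support, and the verification secret necessarily ships with the page.

```python
import hashlib
import hmac


def sign_key_response(secret: bytes, key: bytes, origin: str) -> bytes:
    """Return an HMAC-SHA256 tag over key || origin."""
    # The key has a fixed length (32 bytes), so the concatenation is unambiguous.
    return hmac.new(secret, key + origin.encode("utf-8"), hashlib.sha256).digest()


def verify_key_response(secret: bytes, key: bytes, origin: str, tag: bytes) -> bool:
    """Check the tag against the origin the page is actually running on."""
    expected = sign_key_response(secret, key, origin)
    return hmac.compare_digest(expected, tag)  # constant-time comparison


secret = b"demo-signing-secret"  # placeholder value for illustration only
key = bytes(32)                  # placeholder model key
tag = sign_key_response(secret, key, "https://your.app")

print(verify_key_response(secret, key, "https://your.app", tag))      # True
print(verify_key_response(secret, key, "https://evil.example", tag))  # False
```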
| Attack | Stopped? |
|---|---|
| `curl https://your.app/facex_xs.enc` | ✅ ciphertext only |
| DevTools → Network tab snoop | ✅ ciphertext only |
| Bot scraping your repo | ✅ no plaintext on disk |
| Key extraction by determined attacker | ❌ — they can dump from WASM heap |
| Per-customer revocation | ✅ — stop serving the key |
For higher security: move inference to a server you control. You lose the "private by construction" story but get full IP protection.