
Record: First Legal Sub-1.0 BPB — Multi-order N-gram Backoff + Entropy-Adaptive Alpha (val_bpb=0.9674, 3-seed) #727

Open
Asukabot0 wants to merge 1 commit into openai:main from Asukabot0:submission/backoff-entropy-0.9674

Conversation

@Asukabot0

Results (3-seed validation)

| Seed | val_bpb | val_loss | Size (bytes) | Quantization |
|------|---------|----------|--------------|--------------|
| 1337 | 0.96679 | 1.63238 | 15,994,366 | int6+zstd-16 |
| 42   | 0.96703 | 1.63278 | 15,996,585 | int6+zstd-16 |
| 7    | 0.96825 | 1.63485 | 15,988,201 | int6+zstd-16 |
| Mean | 0.96736 | 1.63334 | | |
| Std  | 0.00063 | | | |

Technique

Architecture: 11L, 512d, GQA 8H/4KV, MLP 3x, LeakyReLU(0.5)², XSA-all(11), Value Residual, Gated Attention, SmearGate, BigramHash(4096), Partial RoPE(16/64), LN Scale, EMA(0.997). Tied embeddings. Muon optimizer.
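
For readability, the same hyperparameters as a hypothetical config object (field names are illustrative, not the actual modded-nanogpt arguments; the values are the ones listed above):

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_layer: int = 11             # 11L
    d_model: int = 512            # 512d
    n_head: int = 8               # GQA: 8 query heads
    n_kv_head: int = 4            # GQA: 4 KV heads
    head_dim: int = 64            # 512 / 8
    mlp_ratio: float = 3.0        # MLP 3x
    leaky_relu_slope: float = 0.5 # LeakyReLU(0.5)^2 activation
    bigram_hash_size: int = 4096  # BigramHash table
    rope_dims: int = 16           # Partial RoPE: 16 of 64 head dims
    ema_decay: float = 0.997      # EMA of weights
    tied_embeddings: bool = True
```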

N-gram eval cache — two key improvements over prior work:

  1. Multi-order backoff (orders 2–7): Instead of a single fixed order, we attempt the highest order first and cascade down on a miss. This dramatically improves coverage vs a fixed 7-gram (both changes are sketched after this list).

  2. Entropy-adaptive alpha: alpha = 0.05 + 0.55 * sigmoid(2 * (H - 4.0)). When the model is uncertain (high entropy), we trust n-gram statistics more; when confident (low entropy), we trust the LM. This replaces the fixed alpha=0.40 used in prior approaches.
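
A minimal sketch of the two changes combined, assuming the cache is laid out as `{order: {context_tuple: {next_token_id: count}}}` and the model emits a torch probability vector; the names, count-dict layout, and entropy base (bits) are illustrative assumptions, while the order range 2–7 and the alpha formula are taken from the description above:

```python
import torch

# alpha = 0.05 + 0.55 * sigmoid(2 * (H - 4.0))
ALPHA_MIN, ALPHA_RANGE, H_MID, H_SCALE = 0.05, 0.55, 4.0, 2.0

def ngram_lookup(caches, context):
    """Multi-order backoff: try the 7-gram context first, cascade down to the bigram on a miss."""
    for order in range(7, 1, -1):               # orders 7, 6, ..., 2
        if len(context) < order - 1:
            continue
        key = tuple(context[-(order - 1):])     # last (order-1) tokens
        counts = caches[order].get(key)         # {next_token_id: count} or None
        if counts:
            return counts
    return None                                  # no order matched

def mix_probs(model_probs, caches, context):
    """Entropy-adaptive interpolation of the model and n-gram distributions."""
    counts = ngram_lookup(caches, context)
    if counts is None:
        return model_probs                       # no n-gram evidence at any order
    ngram_probs = torch.zeros_like(model_probs)
    total = sum(counts.values())
    for tok, c in counts.items():
        ngram_probs[tok] = c / total
    # Entropy of the model's own distribution (bits); no ground-truth labels involved.
    H = -(model_probs * model_probs.clamp_min(1e-12).log2()).sum()
    alpha = ALPHA_MIN + ALPHA_RANGE * torch.sigmoid(H_SCALE * (H - H_MID))
    return (1 - alpha) * model_probs + alpha * ngram_probs
```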

Compliance

  • Score-first, backward-looking: n-gram counts are built from previously scored tokens only (see the evaluation-loop sketch after this list)
  • No oracle selection: alpha depends solely on the model's own output distribution (entropy), never on ground-truth labels
  • No cross-GPU sync: each GPU maintains its own independent cache
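
A hypothetical per-rank evaluation loop that makes the ordering explicit: every token is scored with counts accumulated from strictly earlier tokens, and the cache is updated only afterwards. `mix_probs` is the sketch from the Technique section; `next_token_probs` stands in for a model forward pass:

```python
def make_caches():
    # One table per order (2..7): {context_tuple: {next_token_id: count}}.
    return {order: {} for order in range(2, 8)}

def update_ngram_counts(caches, context, token):
    for order in range(2, 8):
        if len(context) >= order - 1:
            key = tuple(context[-(order - 1):])
            bucket = caches[order].setdefault(key, {})
            bucket[token] = bucket.get(token, 0) + 1

def evaluate_shard(next_token_probs, tokens, caches):
    """Each rank runs this on its own shard with its own caches; nothing is synced across GPUs."""
    total_bits = 0.0
    for t in range(1, len(tokens)):
        context, target = tokens[:t], tokens[t]
        probs = mix_probs(next_token_probs(context), caches, context)  # counts from tokens < t only
        total_bits += -float(probs[target].clamp_min(1e-12).log2())
        update_ngram_counts(caches, context, target)                   # update only AFTER scoring
    return total_bits / (len(tokens) - 1)                              # bits per token on this shard
```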

Ablation

| Configuration | val_bpb | Delta |
|---------------|---------|-------|
| No n-gram (neural only) | 1.1271 | baseline |
| Fixed alpha=0.40, order=7, no backoff | 1.0336 | −0.0935 |
| Multi-order backoff (2–7) + fixed alpha=0.40 | 0.9825 | −0.1446 |
| Multi-order backoff (2–7) + entropy-adaptive | 0.9674 | −0.1597 |

Comparison with prior submissions

| Submission | val_bpb | Delta (this PR − listed) |
|------------|---------|--------------------------|
| PR #549 (prior SOTA) | 1.1194 | −0.152 |
| PR #702 (n-gram backoff) | 1.0240 | −0.057 |
| This PR | 0.9674 | |

Training

  • 8× H100 SXM (RunPod), 600s wallclock, ~5580 steps per seed
  • No TTT, no SWA, no canonical attention
  • int6 per-row + zstd-16 quantization (no int5 fallback needed); a packing sketch follows this list
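
A rough sketch of the checkpoint packing, assuming symmetric per-row scaling, float16 scales, int8 storage of the 6-bit codes, and the zstandard library; the actual submission may pack bits more tightly or use an asymmetric range:

```python
import numpy as np
import zstandard

def quantize_int6_per_row(w: np.ndarray):
    """Symmetric per-row quantization to 6-bit codes in [-31, 31]."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 31.0
    scale[scale == 0] = 1.0                      # avoid divide-by-zero on all-zero rows
    codes = np.clip(np.round(w / scale), -31, 31).astype(np.int8)
    return codes, scale.astype(np.float16)

def compress_checkpoint(weights: dict) -> bytes:
    """Quantize every 2-D weight matrix, then compress the concatenated blob with zstd level 16."""
    blobs = []
    for name in sorted(weights):                 # deterministic order for a reproducible size
        codes, scale = quantize_int6_per_row(weights[name])
        blobs.append(codes.tobytes())
        blobs.append(scale.tobytes())
    return zstandard.ZstdCompressor(level=16).compress(b"".join(blobs))
```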

Built on modded-nanogpt. Credits: PR #315, #609, #493, #518, #413, #674, #702.


Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Asukabot0 changed the title from "Record: Multi-order N-gram Backoff + Entropy-Adaptive Alpha (val_bpb=0.9674)" to "Record: First Legal Sub-1.0 BPB — Multi-order N-gram Backoff + Entropy-Adaptive Alpha (val_bpb=0.9674, 3-seed)" on Mar 25, 2026

deanbrr commented Mar 25, 2026

Congrats Asukabot0, nice work.

I believe I was the first to contribute the N-gram eval cache technique to the contest, in PR #659.

newjordan pushed a commit to newjordan/parameter-golf-1 that referenced this pull request Mar 25, 2026
Multi-order backoff (2-7) + entropy-adaptive alpha on 11L/512d U-Net.
All 3 seeds sub-1.0. GPTQ calibration inside training phase.

Seeds: 42=0.9631, 2045=0.9620, 7=0.9624, mean=0.9625

Credits: @deanbrr openai#659, @Asukabot0 openai#727, @signalrush openai#414

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
