
Podracing: 1.0461 BPB (3-seed mean) — 5-gram eval + LeakyReLU²#706

Open
newjordan wants to merge 2 commits into openai:main from newjordan:submission/podracing

Conversation

@newjordan

Results

Seed   Sliding BPB   5-gram BPB   Artifact
1337   1.1190        1.0451       15.63 MB
42     1.1217        1.0471       15.59 MB
2045   1.1200        1.0460       15.64 MB
Mean   1.1202        1.0461
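For reference, bits per byte (BPB) is the summed negative log-likelihood of the eval text, converted from nats to bits and normalized by the byte count. A minimal sketch of the conversion (function name and numbers are illustrative, not taken from the repo):

```python
import math

def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed NLL in nats over an eval set to bits per byte."""
    return total_nll_nats / (total_bytes * math.log(2))

# Hypothetical numbers purely for illustration.
bpb = bits_per_byte(total_nll_nats=7.25e5, total_bytes=1_000_000)
```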

Progression

PR          Mean BPB         Notes
#190                         The Stinky Frost Recipe
#390, #401  1.1295, 1.1243   Sponge Bath TTT + EMA/SWA/QAT
#445        1.1236           Late Training Replay + EMA + GPTQ-lite
#498, #499  1.1478           The Frugendorff (recursive weight sharing)
#508, #578  1.1215           GPTQ + Early QAT + Legal TTT
#533, #577  1.1207           GPTQ + Short TTT
#587        1.1208           XSA + quantization tuning
#656        1.1195           Three Breadsticks (activation + eval)
This PR     1.0461           Podracing (5-gram eval interpolation)

5-gram Eval (score-first, legal)

Fixed-weight hashed n-gram interpolation during sliding window eval. Concept credited to @deanbrr (PR #659).

  • Cache built from already-scored tokens only (backward-looking)
  • Fixed alpha=0.20: always p_final = 0.80 * p_model + 0.20 * p_ngram
  • No safety gate, no target-aware selection, no min-NLL comparison
  • Hashed count-min sketch (4M buckets), min_count=2
  • Score-first legality: cache updated only AFTER segment scoring
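The bullets above can be sketched as a small standalone module. This is a hedged illustration, not the PR's actual implementation (which lives in train_gpt.py): all function names are hypothetical, and a single hashed count row stands in for the full count-min sketch.

```python
import hashlib

ORDER = 5            # n-gram order
BUCKETS = 4_194_304  # 4M hash buckets (NGRAM_EVAL_BUCKETS)
ALPHA = 0.20         # fixed interpolation weight (NGRAM_EVAL_ALPHA)
MIN_COUNT = 2        # contexts seen fewer times than this are ignored

context_counts = [0] * BUCKETS  # how often each hashed 4-token context was seen
hit_counts = {}                 # (context_bucket, next_token) -> count

def _bucket(tokens) -> int:
    """Hash a context tuple into one of BUCKETS slots."""
    h = hashlib.blake2b(str(tokens).encode("utf8"), digest_size=8)
    return int.from_bytes(h.digest(), "big") % BUCKETS

def ngram_prob(context, token):
    """P(token | last 4 tokens) from the cache, or None if too rare."""
    b = _bucket(tuple(context[-(ORDER - 1):]))
    c = context_counts[b]
    if c < MIN_COUNT:
        return None
    return hit_counts.get((b, token), 0) / c

def mix(p_model, p_ngram):
    """Fixed-weight interpolation: no safety gate, no min-NLL selection."""
    if p_ngram is None:
        return p_model
    return (1 - ALPHA) * p_model + ALPHA * p_ngram

def update_cache(tokens):
    """Called only AFTER a segment is scored (score-first legality)."""
    for i in range(ORDER - 1, len(tokens)):
        b = _bucket(tuple(tokens[i - ORDER + 1:i]))
        context_counts[b] += 1
        key = (b, tokens[i])
        hit_counts[key] = hit_counts.get(key, 0) + 1
```

Because `update_cache` runs only after a segment has been scored, the n-gram estimate for any position is built exclusively from already-scored (backward-looking) tokens.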

Architecture

11L/512d U-Net, 26.93M params. LeakyReLU² (slope 0.5), XSA last 4, BigramHash 1536. GPTQ int6+zstd, late QAT. TTT disabled.
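The exact LeakyReLU² definition is in the PR's train_gpt.py; one plausible reading, assumed here, is a LeakyReLU (slope 0.5 on the negative side, per MLP_LEAKY_SLOPE) followed by a sign-preserving square:

```python
def leaky_relu_sq(x: float, slope: float = 0.5) -> float:
    """Hypothetical LeakyReLU²: LeakyReLU, then square the magnitude
    while keeping the sign (so the function stays monotonic)."""
    y = x if x >= 0.0 else slope * x   # standard LeakyReLU
    return y * y if y >= 0.0 else -(y * y)
```

Squared activations of this family (cf. squared ReLU) sharpen the positive branch while the leak keeps gradient flow on the negative side.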

Reproduce

SEED=2045 MLP_ACT=leaky_relu_sq MLP_LEAKY_SLOPE=0.5 XSA_LAST_N=4 BIGRAM_VOCAB_SIZE=1536 ROPE_DIMS=24 NGRAM_EVAL_ORDER=5 NGRAM_EVAL_ALPHA=0.20 NGRAM_EVAL_MIN_COUNT=2 NGRAM_EVAL_BUCKETS=4194304 torchrun --nproc_per_node=8 train_gpt.py

8xH100 SXM, 600s training + ~190s eval. Training logs and submission.json included.

Octavian and others added 2 commits March 24, 2026 22:49
11L/512d U-Net + legal score-first 5-gram eval interpolation.
Inspired by @deanbrr's n-gram cache technique (PR openai#659).

3-seed results:
  seed 1337: 1.0451  (15.63MB)
  seed 42:   1.0471  (15.59MB)
  seed 2045: 1.0460  (15.64MB)
  mean:      1.0461

Run: SEED=2045 MLP_ACT=leaky_relu_sq MLP_LEAKY_SLOPE=0.5 \
     XSA_LAST_N=4 BIGRAM_VOCAB_SIZE=1536 ROPE_DIMS=24 \
     NGRAM_EVAL_ORDER=5 NGRAM_EVAL_ALPHA=0.20 \
     torchrun --nproc_per_node=8 train_gpt.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3-seed logs (1337, 42, 2045) + submission.json + README.
N-gram eval concept credited to @deanbrr (PR openai#659).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@newjordan
Author

Disappointing that I have to make a new PR for this, as opposed to a commit to my better-timed one.

@valerio-oai
Contributor

valerio-oai commented Mar 25, 2026

Hi @newjordan , thank you for the logs, but as I mentioned in my last comment on your PR, the run is still illegal on account of GPTQ calibration happening after training time (see the training logs logging 600s of training time and then GPTQ calibrating for 3.6s), meaning it is still accessing training data at eval time, which is disallowed.

@newjordan
Author

Crushed... sorry to waste your time. Back at it.

