# Classical 4-gram Artifact

This is a non-record classical submission: a discounted hashed 4-gram model, exported as a compressed artifact and evaluated exactly on the full FineWeb validation split.

The model is fully non-neural:
- no transformer
- no embeddings to train
- no GPU dependence in the solver itself
- no training-data access during evaluation beyond the saved artifact

## Configuration

- Track: `non-record-16mb`
- Model: discounted hashed 4-gram with backoff to bigram and unigram
- Artifact build data: first `10,000,000` training tokens
- Artifact bytes: `14,310,783`
- Code bytes (`train_gpt.py`): `57,801`
- Total submission bytes: `14,368,584`
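The core estimator named above (discounted hashed 4-gram with backoff to bigram and unigram, absolute discount `0.75`) can be sketched as follows. This is a minimal illustration, not the actual `train_gpt.py` implementation: the class and method names, bucket count, and hashing scheme are all assumptions.

```python
from collections import defaultdict

D = 0.75                      # absolute discount (matches --absolute-discount 0.75)
NUM_BUCKETS = 1 << 22         # hash buckets bound the table size (illustrative value)

def bucket(context, token):
    # Hashed n-gram counts: collisions are accepted to cap memory.
    return hash((context, token)) % NUM_BUCKETS

class BackoffNgram:
    """Hypothetical sketch: absolute-discounted counts with backoff
    from 4-gram (context length 3) to bigram (1) to unigram (0)."""

    def __init__(self, orders=(3, 1, 0)):
        self.orders = orders
        self.counts = defaultdict(int)       # (order, bucket) -> count
        self.ctx_totals = defaultdict(int)   # (order, context hash) -> total count
        self.ctx_types = defaultdict(int)    # (order, context hash) -> distinct continuations

    def update(self, history, token):
        for n in self.orders:
            ctx = tuple(history[-n:]) if n else ()
            if self.counts[(n, bucket(ctx, token))] == 0:
                self.ctx_types[(n, hash(ctx))] += 1
            self.counts[(n, bucket(ctx, token))] += 1
            self.ctx_totals[(n, hash(ctx))] += 1

    def prob(self, history, token, vocab_size):
        p = 1.0 / vocab_size                 # base case: uniform over the vocab
        for n in reversed(self.orders):      # unigram -> bigram -> 4-gram
            ctx = tuple(history[-n:]) if n else ()
            total = self.ctx_totals[(n, hash(ctx))]
            if total == 0:
                continue                     # unseen context: keep lower-order estimate
            c = self.counts[(n, bucket(ctx, token))]
            types = self.ctx_types[(n, hash(ctx))]
            # Discounted ML estimate plus the reserved mass times the backoff estimate.
            p = max(c - D, 0.0) / total + (D * types / total) * p
        return p
```

With absolute discounting, each order reserves `D * types / total` of its probability mass and redistributes it according to the next-lower order, so the mixture stays a proper distribution (ignoring hash collisions).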

Command used to build the artifact:

```bash
./.venv/bin/python records/track_non_record_16mb/2026-03-25_classical_4gram_10m_eval/train_gpt.py \
--skip-validation 1 \
--save-state /tmp/state_ng4_10000k_comp.zlib \
--train-pattern 'data/datasets/fineweb10B_sp1024/fineweb_train_*.bin' \
--warmup-tokens 10000000 \
--cache-windows '' \
--copy-contexts '' \
--doc-copy-contexts '' \
--absolute-discount 0.75 \
--ngram-contexts 3 \
--mix-backoff-experts 0
```
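The `--save-state /tmp/state_ng4_10000k_comp.zlib` flag and the `saved_state_bytes=14310783` line in `train.log` imply a zlib-compressed serialized artifact. A minimal sketch of that round-trip, assuming a pickle-then-compress scheme (the function names and serialization format here are illustrative, not the actual `train_gpt.py` interface):

```python
import pickle
import zlib

def save_state(path, state, level=9):
    """Serialize the model state and write it zlib-compressed."""
    blob = zlib.compress(pickle.dumps(state), level)
    with open(path, "wb") as f:
        f.write(blob)
    return len(blob)  # the figure reported as saved_state_bytes

def load_state(path):
    """Inverse of save_state: decompress and deserialize."""
    with open(path, "rb") as f:
        return pickle.loads(zlib.decompress(f.read()))
```

The compressed size on disk is what counts against the 16 MB artifact budget, which is why a maximum compression level is attractive here.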

Command used for the final full-validation evaluation:

```bash
./.venv/bin/python records/track_non_record_16mb/2026-03-25_classical_4gram_10m_eval/train_gpt.py \
--max-tokens 0 \
--report-every 5000000 \
--load-state /tmp/state_ng4_10000k_comp.zlib \
--cache-windows '' \
--copy-contexts '' \
--doc-copy-contexts '' \
--absolute-discount 0.75 \
--ngram-contexts 3 \
--mix-backoff-experts 0
```
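For reference, a bits-per-byte figure like the `val_bpb=1.91070694` reported below is the total negative log2 probability of all predicted tokens divided by the byte length of the validation text (here `151,080,891` bytes). A minimal sketch:

```python
import math

def bits_per_byte(token_probs, total_bytes):
    """Total code length in bits over all predictions, per byte of raw text."""
    total_bits = sum(-math.log2(p) for p in token_probs)
    return total_bits / total_bytes
```

Normalizing by bytes rather than tokens makes the metric comparable across tokenizers.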

## Exact Metrics

- Full validation tokens loaded: `62,021,846`
- Predictions: `62,021,845`
- Full-validation `val_bpb`: `1.91070694`
- Full-validation wallclock: `571.97` seconds
- Validation bytes: `151,080,891`

Artifact build run:
- warmup predictions: `9,999,999`
- artifact build wallclock: `68.63` seconds

This result is far weaker than the best neural submissions, but it satisfies the mechanical submission constraints, verified locally:
- exact full-validation run
- artifact under `16,000,000` bytes
- single-file `train_gpt.py`
- full-validation runtime under `10` minutes on this machine
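The size constraints above reduce to simple arithmetic on the figures reported in this README:

```python
# Sanity checks using the exact byte counts and wallclock reported above.
ARTIFACT_BYTES = 14_310_783   # bytes_model_int8_zlib
CODE_BYTES = 57_801           # train_gpt.py
FULL_VAL_SECONDS = 571.97

assert ARTIFACT_BYTES < 16_000_000                 # artifact under the 16MB cap
assert ARTIFACT_BYTES + CODE_BYTES == 14_368_584   # total submission bytes
assert FULL_VAL_SECONDS < 600                      # full validation under 10 minutes
```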

## Included Files

- `train_gpt.py` — single-file classical solver
- `submission.json` — metadata for the run
- `train.log` — exact artifact-build stdout
- `eval.log` — exact full-validation stdout

## eval.log

Exact full-validation stdout:
step=5000000 bpb=2.005436 tok_per_s=87059 weights=[ngram_4=1.000]
step=10000000 bpb=1.997001 tok_per_s=101167 weights=[ngram_4=1.000]
step=15000000 bpb=1.977351 tok_per_s=98463 weights=[ngram_4=1.000]
step=20000000 bpb=1.962266 tok_per_s=100651 weights=[ngram_4=1.000]
step=25000000 bpb=1.948778 tok_per_s=102760 weights=[ngram_4=1.000]
step=30000000 bpb=1.943999 tok_per_s=104743 weights=[ngram_4=1.000]
step=35000000 bpb=1.939619 tok_per_s=105619 weights=[ngram_4=1.000]
step=40000000 bpb=1.932213 tok_per_s=106524 weights=[ngram_4=1.000]
step=45000000 bpb=1.923477 tok_per_s=106111 weights=[ngram_4=1.000]
step=50000000 bpb=1.922186 tok_per_s=107292 weights=[ngram_4=1.000]
step=55000000 bpb=1.918598 tok_per_s=108088 weights=[ngram_4=1.000]
step=60000000 bpb=1.912753 tok_per_s=108456 weights=[ngram_4=1.000]
loaded_state_bytes=14310783
loaded_warmup_predictions=9999999
tokens_loaded=62021846
predictions=62021845
total_bytes=151080891
val_bpb=1.91070694
elapsed_seconds=571.97
expert_weights:
ngram_4: weight=1.000000 avg_logloss_bits=4.654349

## submission.json

Metadata for the run:
{
"author": "muhtasham",
"github_id": "Muhtasham",
"name": "Classical 4-gram Artifact",
"blurb": "Non-record classical submission: discounted hashed 4-gram model built from 10M train tokens, exported as a 14.31MB compressed artifact, and evaluated exactly on the full FineWeb validation split. Full-val BPB is 1.91070694 with full-val wallclock 571.97s and total submission size 14,368,584 bytes.",
"date": "2026-03-25T00:00:00Z",
"track": "non-record-16mb",
"val_loss": null,
"val_bpb": 1.91070694,
"pre_quant_val_loss": null,
"pre_quant_val_bpb": null,
"step_stop": null,
"wallclock_seconds": 68.63,
"bytes_total": 14368584,
"bytes_model_int8_zlib": 14310783,
"bytes_code": 57801,
"gpu": "local CPU"
}

## train.log

Exact artifact-build stdout:
warmup_predictions=9999999
warmup_elapsed_seconds=44.01
saved_state_bytes=14310783
elapsed_seconds=68.63