A 36-million-parameter transformer that pretends to be a goldfish. Hard-capped at a 96-token context. It literally cannot remember what you just said - and that's the point.
GlubLM Desk Pet - a goldfish companion that lives in your browser and forgets you between sentences.
GlubLM is two things:
- A tiny language model (36.1M params, 40 MB ONNX) trained from scratch to impersonate a goldfish. Not a fine-tune of GPT-2 or anything else. Written in PyTorch, trained on a single RTX 3060, shipped to HuggingFace, PyPI, and your browser.
- A desk pet PWA - a pixel-art goldfish that lives in a bowl in your browser, runs the full model client-side, swims around on its own, says things from a pool of 530 hand-curated idle phrases, and actually talks back when you poke it. Fully offline after first load. Adopt it here.
No backend. No API key. No telemetry. No inference server. Your conversation with the goldfish happens entirely in your tab, stays in your tab, and is forgotten by the goldfish before you close the tab.
Most LLM projects chase capability. This one runs the other direction and chases constraint.
Inspired by Ted Lasso's meditation on the goldfish as "the happiest animal on earth" because of its 10-second memory, GlubLM makes forgetting a physical, architectural property of the model:
The context window is 96 tokens. Hard cap. No sliding window, no summarization, no KV cache tricks. After about two short sentences, the model has literally no access to what came before. It has to start over, every time.
The result is a model that can't remember you, can't hold a grudge, can't accumulate bias across a conversation, and can't be convinced of anything. It's relentlessly present. Every interaction is its first. That's not a limitation we're working around - that's the whole point.
And then we put it in a pixel-art bowl on your desk so you can see it happen.
| Option | What you get | How |
|---|---|---|
| Adopt the Desk Pet | Pixel-art goldfish, PWA installable, 530 idle phrases, chat with the fish, offline after first load | Open the URL. Click install if you want. |
| Simple browser chat | Minimal chat UI, same model | Open the URL. Type something. |
| HuggingFace Space | Gradio interface | Open the URL. |
| Python | CLI + programmatic API | pip install glublm && glublm chat |
Everything is free, open source, and runs locally or in your browser. No sign-up, no API key, no paywall.
A goldfish you adopt in your browser. It swims, it talks to itself, it forgets everything. You can talk to it and it'll respond - then forget. You can poke it and it'll wiggle. Leave it alone long enough and it'll fall asleep. Close the tab and it'll tap on the glass asking where you went.
- Pixel-art goldfish in a GBA-style fishbowl rendered on HTML canvas. 240px-wide internal buffer, nearest-neighbor upscale to your viewport. 8x8 water tile patterns with 2-frame swap animation (the way Pokemon actually renders water). 3 depth zones, dithered refraction shadows, caustic light patterns on the gravel, 4 swaying plants.
- 12 animation states: idle swim, talk, happy, sad, sleep, eat, bump glass, forget, excited, wiggle, bubble blow, turn around. Sprites are procedurally generated at 16x16 so every pixel is intentional.
- 530 hand-curated idle phrases in 13 time-aware categories. The goldfish is different in the morning than at night. It gets bored if you ignore it. It gets philosophical at random.
- Real chat with the model. When you type something, the fish swims to the center, thinks for 1-3 seconds (ONNX Runtime Web on WASM), and speaks back. Then it forgets everything you both just said.
- Click the fish to make it wiggle, double-click for excited, long-press for happy. Click the water for a splash.
- Installable PWA. First visit downloads the 40 MB model once. After that it runs fully offline. Add it to your home screen and forget it's there until you open a new tab.
- Notifications. The goldfish taps on the glass when you're away. "hey! where did you go?" "the water is nice today." "i forgot what i was going to say." (Note: no push server, so notifications only fire while the browser is running.)
- Settings: name your fish, tune notification frequency (0-8 per day), done. Everything persisted in localStorage.
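The time-aware idle chatter described above boils down to "pick a category from the clock and the fish's mood, then pick a phrase from it". Here is a minimal sketch of that logic in Python. The category names, the boredom threshold, and the hour ranges are all made up for illustration; the real desk pet does this in vanilla JS with 13 categories and 530 phrases.

```python
import random

# Hypothetical categories and phrases, for illustration only.
PHRASES = {
    "morning": ["the water is nice today.", "is it breakfast yet?"],
    "night": ["glub... sleepy.", "the bowl is quiet now."],
    "bored": ["hey! where did you go?", "hello? anyone?"],
    "general": ["i forgot what i was going to say.", "oh! a bubble."],
}

def pick_category(hour: int, minutes_ignored: int) -> str:
    """Choose a phrase category from the local hour and how long the fish was ignored."""
    if minutes_ignored >= 30:          # hypothetical boredom threshold
        return "bored"
    if 6 <= hour < 12:
        return "morning"
    if hour >= 22 or hour < 6:
        return "night"
    return "general"

def idle_phrase(hour: int, minutes_ignored: int, rng=random) -> str:
    """Pick a random phrase from the category that matches the current situation."""
    return rng.choice(PHRASES[pick_category(hour, minutes_ignored)])

print(pick_category(8, 0))
print(idle_phrase(23, 0, random.Random(0)))
```

Passing a seeded `random.Random` makes the choice reproducible, which is handy if you ever want to test this kind of logic.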
| Layer | Tech |
|---|---|
| Rendering | HTML5 Canvas 2D, no libraries, ES modules, no build step |
| Inference | ONNX Runtime Web (WASM backend), ~1-3s latency per response |
| Model | GlubLM v0.3.1 (36.1M params, INT8 quantized to 40 MB) |
| Tokenizer | Custom Byte-Level BPE decoder in vanilla JS (no tokenizer.js dependency) |
| PWA | Web App Manifest + Service Worker with streaming cache for the 40 MB model |
| Offline | Cache-first strategy, two caches: static assets (20 files) + model (streaming fetch) |
| State | Finite state machine with priority transitions, procedural sprite engine with cached frames |
Total: ~2,250 lines of vanilla JS across 10 modules. No React, no build tooling, no transpiler, no framework. Just ES modules served statically.
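For a feel of what "finite state machine with priority transitions" means here, below is a rough Python sketch of the idea: a state runs for a number of frames, and a new request only interrupts it if it has higher priority. The state names come from the animation list; the priority numbers and the `FishFSM` class are hypothetical, and the actual engine is written in JS and may work differently.

```python
from dataclasses import dataclass

# Hypothetical priorities: a higher number may interrupt a lower one.
PRIORITY = {"idle_swim": 0, "bubble_blow": 1, "sleep": 1, "wiggle": 2, "talk": 3, "forget": 3}

@dataclass
class FishFSM:
    state: str = "idle_swim"
    ticks_left: int = 0

    def request(self, new_state: str, duration: int) -> bool:
        """Switch if nothing is running, or if new_state outranks the current state."""
        busy = self.ticks_left > 0
        if busy and PRIORITY[new_state] <= PRIORITY[self.state]:
            return False
        self.state, self.ticks_left = new_state, duration
        return True

    def tick(self) -> None:
        """Advance one frame; fall back to idle swimming when the state expires."""
        if self.ticks_left > 0:
            self.ticks_left -= 1
            if self.ticks_left == 0:
                self.state = "idle_swim"

fsm = FishFSM()
print(fsm.request("talk", 3))     # accepted: the fish was idle
print(fsm.request("wiggle", 2))   # refused: talking outranks wiggling
for _ in range(3):
    fsm.tick()
print(fsm.state)                  # back to idle_swim once the talk animation ends
```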
+--------------------------------------------------+
|  desk-pet/ (PWA, ~2250 LOC)                      |
|                                                  |
|  +------------------+  +---------------------+   |
|  | Canvas pixel     |  | Service Worker      |   |
|  | buffer (240px)   |  | (offline + model    |   |
|  | + screen upscale |  | streaming cache)    |   |
|  +--------+---------+  +---------------------+   |
|           |                                      |
|  +--------+------------------------------------+ |
|  | Engine (8 ES modules)                       | |
|  | bowl  sprites  movement  state-machine      | |
|  | bubbles  speech  idle  canvas               | |
|  +---------------------------------------------+ |
|           |                                      |
|  +--------+------------------------------------+ |
|  | ONNX Runtime Web (WASM)                     | |
|  | -> glublm_35m_v4.onnx (40 MB, INT8)         | |
|  +---------------------------------------------+ |
+--------------------------------------------------+
        |                         |
        v                         v
   No backend              Your browser tab
   No API calls            (fully offline after 1st load)
GlubLM is a decoder-only transformer, 8 blocks x hidden 640, modern stack all the way down:
input -> RMSNorm -> CausalAttention(RoPE on Q/K) -> residual
-> RMSNorm -> SwiGLU FFN -> residual -> output
No bias terms anywhere. Weight-tied LM head. BF16 training. Pre-norm layout. Llama-style.
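If "RoPE on Q/K" is new to you: rotary embeddings rotate pairs of query/key dimensions by an angle that grows with position, so the attention score ends up depending only on how far apart two tokens are. The snippet below is a dependency-free sketch of that property, not the project's PyTorch code. The interleaved pairing and the base of 10,000 are assumptions (they are common defaults); the head size of 64 matches the spec table.

```python
import math

def rope(vec, pos, base=10_000.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by a position-dependent angle."""
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = pos * base ** (-i / dim)   # lower dimensions rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

head_dim = 64
q = [math.sin(i + 1) for i in range(head_dim)]   # arbitrary query vector
k = [math.cos(i + 1) for i in range(head_dim)]   # arbitrary key vector

# Same relative offset (3 tokens apart) at two different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 90), rope(k, 87))
print(abs(score_near - score_far) < 1e-9)
```

Because only the offset matters, there is no learned position table to store, which is one reason RoPE adds nothing to the parameter count.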
| Component | Value |
|---|---|
| Parameters | 36,055,680 (36.1M) |
| Blocks | 8 |
| Hidden dim | 640 |
| Attention heads | 10 (64 per head) |
| FFN intermediate | 1280 (SwiGLU; gate + up projections combined: 2560) |
| Vocab | 5,120 (Byte-Level BPE) |
| Max context | 96 tokens (hard cap) |
| Training dtype | BF16 |
| Dropout | 0.1 |
Full details in docs/ARCHITECTURE.md.
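The parameter count can be sanity-checked by hand from the table. The arithmetic below assumes what the description implies (no biases, tied LM head, parameter-free RoPE) plus two things the README does not state outright: two RMSNorm gain vectors per block and one final RMSNorm before the LM head. Under those assumptions the total lands exactly on the listed figure.

```python
# Values from the spec table.
vocab, hidden, blocks, ffn = 5_120, 640, 8, 1_280

embedding = vocab * hidden        # shared with the LM head (weight tying)
attention = 4 * hidden * hidden   # Q, K, V and output projections, no biases
swiglu = 3 * hidden * ffn         # gate, up and down projections
norms = 2 * hidden                # assumed: one RMSNorm gain vector per sub-layer
per_block = attention + swiglu + norms

final_norm = hidden               # assumed: one RMSNorm before the LM head
total = embedding + blocks * per_block + final_norm

print(f"{per_block:,} per block, {total:,} total")
```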
55K balanced samples, generated by a team of four coordinated Claude agents instead of hand-authored templates:
- Generator (Haiku 4.5): produces conversations in batches, 50 samples per API call
- Critic (Sonnet 4.6): reviews each sample for persona adherence, returns accept/reject/fix
- Diversifier (Haiku 4.5): flags n-gram repetition across batches
- Guardian (Sonnet 4.6): hard filter that rejects any sample containing Ted Lasso characters, football/coach references, or other forbidden concepts
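The real Diversifier and Guardian are LLM calls, but the checks they perform are easy to picture with plain string processing. The sketch below is a stand-in, not the pipeline's code: the blocklist, the trigram size, and the sample sentences are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical blocklist standing in for the Guardian's rules.
FORBIDDEN = ("ted lasso", "football", "coach")
FORBIDDEN_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, FORBIDDEN)) + r")\b", re.IGNORECASE
)

def guardian_accepts(sample: str) -> bool:
    """Hard filter: reject any sample that mentions a forbidden term."""
    return FORBIDDEN_RE.search(sample) is None

def ngrams(text: str, n: int = 3):
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def repeated_ngrams(samples, n: int = 3, min_count: int = 2):
    """Return the n-grams that show up in at least min_count different samples."""
    counts = Counter()
    for s in samples:
        counts.update(set(ngrams(s, n)))   # count each n-gram once per sample
    return {g for g, c in counts.items() if c >= min_count}

batch = [
    "oh! a bubble. what were we talking about?",
    "my coach says the water is warm today",
    "oh! a bubble. i like bubbles very much",
]
kept = [s for s in batch if guardian_accepts(s)]
print(len(kept))
print(repeated_ngrams(kept))
```

A keyword filter like this is far cruder than a model-based reviewer (it misses "coaches", for one), which is presumably part of why the pipeline uses an LLM for the job.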
The final mix:
| Component | Samples | Character |
|---|---|---|
| v1 (best subsample) | 20K | Poetic foundation |
| v2 (supplement) | 15K | Balance |
| v3 (conversational) | 5K | Companion tone |
| v4 (forgetful) | 15K | Obligatory mid-sentence forgetting |
| Total | 55K balanced | 21% forgetting, 62% questions, 31% interjections |
Full methodology in docs/DATASET.md.
Trained on a single RTX 3060 12GB in BF16, 15 epochs on 55K samples, AdamW with cosine schedule + 5% warmup, LR 3e-4 peak, batch 64, weight decay 0.1, gradient clipping 1.0.
Final test perplexity: 3.28. Loss 3.18 -> 1.21 over the full run.
Full hyperparams and logs in docs/TRAINING.md.
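To make the schedule concrete, here is a small sketch of warmup plus cosine decay using the numbers above. The linear warmup shape, the decay floor of zero, and the assumption that all 55K samples are in the training split are mine; `lr_at` is a hypothetical helper, not a function from the package.

```python
import math

def lr_at(step, total_steps, peak_lr=3e-4, warmup_frac=0.05, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Assumes every sample is used for training; a held-out split would lower these.
steps_per_epoch = math.ceil(55_000 / 64)   # 860
total_steps = 15 * steps_per_epoch         # 12,900
warmup_steps = int(total_steps * 0.05)     # 645

for step in (0, warmup_steps - 1, total_steps // 2, total_steps - 1):
    print(step, f"{lr_at(step, total_steps):.2e}")
```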
GlubLM is directly inspired by GuppyLM (~8.7M params, template-composed dataset, vanilla transformer). GlubLM explicitly tests the design choices GuppyLM argued against.
| Dimension | GuppyLM | GlubLM |
|---|---|---|
| Parameters | 8.7M | 36.1M |
| Layers x Hidden | 6 x 384 | 8 x 640 |
| Activation | ReLU | SwiGLU |
| Normalization | LayerNorm | RMSNorm |
| Position encoding | Learned | Rotary (RoPE) |
| Vocabulary | 4,096 | 5,120 |
| Context length | 128 | 96 (hard cap) |
| Dataset | 60K templates | 55K LLM-generated |
| Training | Colab T4 | RTX 3060 local |
| Browser demo | Basic chat | Full PWA desk pet |
| License | MIT | AGPL-3.0 |
GuppyLM explicitly chose vanilla transformer components, arguing modern ops don't help below 40M params. GlubLM tests that empirically. Full results and qualitative samples in docs/COMPARISONS.md.
TL;DR of the comparison: at 36M params, modern ops (RoPE + SwiGLU + RMSNorm) produce noticeably more coherent persona-locked outputs, without costing anything.
pip install glublm
# Chat with the model (downloads weights from HF automatically on first run)
glublm chat --prompt "tell me about kindness"
# Or download manually
python -c "from huggingface_hub import hf_hub_download; \
hf_hub_download('DenSec02/glublm-18m', 'model.safetensors', local_dir='./ckpt')"

git clone https://github.com/Den-Sec/glublm
cd glublm
pip install -e ".[dev,deploy]"
# Generate the dataset (requires ANTHROPIC_API_KEY)
glublm generate-data --target 55000 --out data/glublm_55k_v4.json
# Train
glublm train --data data/glublm_55k_v4.json --epochs 15 --batch-size 64 --lr 3e-4
# Sample from your trained checkpoint
glublm chat --ckpt checkpoints/glublm_35m_v4.pt --tokenizer checkpoints/tokenizer.json

Training takes ~15 minutes on an RTX 3060. See docs/TRAINING.md for the full recipe and a Colab notebook.
git clone https://github.com/Den-Sec/glublm
cd glublm
python -m http.server 8000 --directory desk-pet

Open http://localhost:8000. That's it. No npm, no bundler, no transpiler. Just static files.
glublm/
├── src/glublm/                 # Python package: model, training, inference, CLI
├── data_gen/                   # Multi-agent dataset generation pipeline
├── tools/                      # ONNX export, HF upload, benchmark
├── tests/                      # Pytest suite (77 tests green)
├── web/                        # Simple browser demo (legacy)
├── desk-pet/                   # Pixel-art PWA desk pet (this is the star)
│   ├── engine/                 # 8 ES module engine (canvas, bowl, sprites, ...)
│   ├── inference/              # ONNX model + BPE tokenizer in vanilla JS
│   ├── data/idle-phrases.json  # 530 curated goldfish phrases
│   ├── assets/icons/           # PWA icons
│   ├── manifest.json           # Web app manifest
│   ├── sw.js                   # Service worker (offline + streaming model cache)
│   ├── model.onnx              # 40 MB quantized model (ships with the PWA)
│   └── tokenizer.json          # BPE tokenizer
├── hf/                         # HuggingFace model card + dataset card
├── space/                      # HuggingFace Space (Gradio)
├── notebooks/                  # Colab training notebook
├── checkpoints/                # Trained weights (ignored, downloadable from HF)
└── docs/
    ├── ARCHITECTURE.md         # Model architecture deep dive
    ├── DATASET.md              # Multi-agent generation methodology
    ├── TRAINING.md             # Hyperparameters + training runs
    ├── COMPARISONS.md          # GlubLM vs GuppyLM, empirical
    └── superpowers/            # Design specs + implementation plans
        ├── specs/              # Feature specs
        └── plans/              # Phased execution plans
| Metric | Value |
|---|---|
| Model parameters | 36,055,680 |
| Training dataset | 55K samples (LLM-generated, multi-agent) |
| Test perplexity | 3.28 |
| ONNX size (INT8) | 39.7 MB |
| Inference latency (WASM) | ~1-3s per response (desktop), ~3-5s (mobile) |
| Desk Pet LOC | ~2,250 (vanilla JS, 10 modules) |
| Idle phrases | 530 across 13 categories |
| Animation states | 12 (fish FSM) |
| Time to build the desk pet (design + code + deploy) | ~7 hours |
| Time the fish remembers you | ~10 seconds |
- Core transformer + training loop
- Multi-agent dataset generation pipeline
- ONNX export + browser inference
- HuggingFace Hub model + dataset
- PyPI package
- HuggingFace Space
- Pixel-art desk pet PWA
- Cloudflare Worker push relay (true background notifications)
- Hand-drawn sprite art (current ones are procedurally generated)
- Mobile app wrappers (iOS / Android)
- Raspberry Pi build (same engine, framebuffer backend)
- ESP32 version (extreme constrained hardware, needs a different model)
- Italian variant (PesceRossoLM)
- Fine-tuned variants (grumpy goldfish, poet goldfish, philosopher goldfish)
Contributions welcome. The project is small enough that one person can understand all of it.
Good first issues:
- Add a new idle phrase category
- Draw actual pixel art sprites to replace the procedural ones
- Port the engine to a second rendering backend (WebGL, WebGPU, framebuffer)
- Add a new animation state
- Test on iOS Safari and file PWA bugs
- Add more tests (the Python side has 77, the JS side has zero right now)
Before you start a large change, open an issue. The goldfish appreciates a heads-up.
AGPL-3.0-or-later. If you run GlubLM as part of a network service, you are required to share your source code with your users. For most local or browser uses this changes nothing - it's for keeping the project genuinely open when wrapped in a commercial product.
See LICENSE for the full text.
- GuppyLM by Arman BD - the original tiny fish-persona model. Without it, this project wouldn't exist.
- Ted Lasso - "be a goldfish" is not just a line, it's a design constraint.
- Anthropic Claude - the multi-agent team that generated the dataset (Haiku + Sonnet).
- ONNX Runtime Web team - for making 40 MB models runnable in a browser tab.
- HuggingFace - for hosting everything and making sharing small models zero-friction.
- Every pixel artist on the internet whose GBA water tile I studied to get the bowl to look right.
If you use GlubLM in research, please cite:
@software{glublm_2026,
author = {Sepede, Dennis},
title = {GlubLM: a 36M goldfish language model with a 10-second memory},
year = {2026},
url = {https://github.com/Den-Sec/glublm},
note = {Includes Desk Pet PWA: https://den-sec.github.io/glublm/desk-pet/}
}

Adopt the Desk Pet - HuggingFace - Dataset - Space - Architecture - Comparisons
Made with water, warmth, and a dangerous amount of pixel art.
