Phoneme

A literacy app pairing handwriting practice with phoneme-level pronunciation feedback. The frontend is a multi-page Vite app (object-box lessons, writing pad, teacher portal, analytics) and the backend is a Cloudflare Worker serving the API plus the built static assets, backed by D1 + KV.

Repo layout

frontend/         Vite multi-page app (object-box, pad, teacher portal, ...)
workers/api/      Cloudflare Worker: API routes, D1 migrations, asset serving
g2p.kv.json       Aligned CMU pronouncing dictionary, KV-bulk-put format
data/             UJI handwriting dataset parser + template builder
lib/              Shared lib (e.g. templates.json consumed by stroke worker)
scripts/          One-off Node scripts (data filtering, matcher tests)
uji+pen+characters/, uji_strokes.json, ujiv2.txt
                  Raw + derived UJI dataset
recordings/       Bundled audio assets

Prerequisites

  • Node.js 20+ and npm
  • Python 3.10+ (only needed if you want to regenerate the UJI templates)
  • A Cloudflare account if you intend to deploy; local dev runs offline with wrangler dev using a local D1 + KV simulator.

First-time setup

git clone <repo> phoneme
cd phoneme
npm install

The predev:frontend / prebuild hooks run scripts/filter_writers.mjs, which trims uji_strokes.json down to the writers shipped to the browser and writes frontend/public/uji_strokes.json. Override the writer set with PHONEME_WRITERS=tst_UJI_W12,trn_UJI_W01 npm run dev:frontend if needed.
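
As a rough illustration of the override, the sketch below parses a comma-separated PHONEME_WRITERS value and keeps only the matching writers. The function names, the fallback list, and the assumed shape of uji_strokes.json (an object keyed by writer ID) are hypothetical; scripts/filter_writers.mjs may work differently.

```typescript
// Hypothetical sketch; not the actual scripts/filter_writers.mjs.
// Assumes the strokes file is an object keyed by writer ID.
type StrokesByWriter = Record<string, unknown>;

// Split "a, b,,c" into ["a", "b", "c"]; fall back to defaults when unset or empty.
function parseWriters(raw: string | undefined, defaults: string[]): string[] {
  if (!raw) return defaults;
  const ids = raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return ids.length > 0 ? ids : defaults;
}

// Keep only the entries whose writer ID is in the requested set.
function filterWriters(all: StrokesByWriter, writers: string[]): StrokesByWriter {
  const wanted = new Set(writers);
  const kept: StrokesByWriter = {};
  for (const id of Object.keys(all)) {
    if (wanted.has(id)) kept[id] = all[id];
  }
  return kept;
}
```

Trimming whitespace and dropping empty segments means a value such as "tst_UJI_W12, trn_UJI_W01," still resolves to two writer IDs.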

Apply D1 migrations (local)

npx wrangler d1 migrations apply phoneme --local --config workers/api/wrangler.jsonc

The migrations under workers/api/migrations create the tables used for pronunciation/spelling/stroke events, teacher classes, and session reports.

Seed the G2P KV namespace (local)

The API reads grapheme-to-phoneme alignments from a KV namespace bound as G2P. Seed the local namespace from the bundled snapshot:

npm run seed:kv

That runs wrangler kv bulk put --binding=G2P --local against g2p.kv.json.
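
For orientation, wrangler kv bulk put reads a JSON array of objects with string key and value fields. The sketch below builds such an array from a small word-to-phoneme map. The helper name, the lowercase key scheme, and the JSON-encoded value are assumptions for illustration; the actual layout of g2p.kv.json may differ.

```typescript
// Entry shape accepted by `wrangler kv bulk put`: key and value are both strings.
interface KvBulkEntry {
  key: string;
  value: string;
}

// Hypothetical helper: one entry per word, value is a JSON-encoded phoneme list.
// The key scheme (lowercased word) is an assumption, not taken from g2p.kv.json.
function toBulkEntries(dict: Record<string, string[]>): KvBulkEntry[] {
  return Object.keys(dict)
    .sort()
    .map((word) => ({
      key: word.toLowerCase(),
      value: JSON.stringify(dict[word]),
    }));
}

// Serialize in the form the bulk command expects (a single JSON array).
const entries = toBulkEntries({ cat: ["K", "AE1", "T"] });
console.log(JSON.stringify(entries, null, 2));
```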

Configure secrets

The Worker reads these from env at runtime. Non-secret variables are checked in to workers/api/wrangler.jsonc; the rest must be supplied as secrets. For local dev, create workers/api/.dev.vars:

ANTHROPIC_API_KEY=sk-ant-...
ELEVENLABS_API_KEY=...
SPEECHACE_API_KEY=...
GEMINI_API_KEY=...
GEMINI_MODEL=gemini-2.0-flash
TRIPO_API_KEY=...
TRIPO_API_BASE=https://api.tripo3d.ai
JWT_SECRET=dev-jwt-secret-change-me
SESSION_SECRET=dev-session-secret-change-me

Only the keys for features you exercise are strictly required:

Feature                      Keys needed
TTS in lessons               ELEVENLABS_API_KEY (+ voice/model vars)
Pronunciation scoring        SPEECHACE_API_KEY
Object-Box wish transcribe   GEMINI_API_KEY
Object-Box 3D generation     TRIPO_API_KEY
Coach / LLM feedback         ANTHROPIC_API_KEY
Teacher login (Google SSO)   GOOGLE_CLIENT_ID (in wrangler.jsonc) + JWT_SECRET, SESSION_SECRET

For production, use npx wrangler secret put <NAME> --config workers/api/wrangler.jsonc.
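
One way to surface a missing key early is a per-feature check like the sketch below. The key names come from the feature table; the feature labels and the missingKeys helper are hypothetical and not part of the Worker as shipped.

```typescript
// Feature labels are illustrative; key names follow the feature table.
type Feature = "tts" | "scoring" | "transcribe" | "model3d" | "coach" | "teacherLogin";

const REQUIRED_KEYS: Record<Feature, readonly string[]> = {
  tts: ["ELEVENLABS_API_KEY"],
  scoring: ["SPEECHACE_API_KEY"],
  transcribe: ["GEMINI_API_KEY"],
  model3d: ["TRIPO_API_KEY"],
  coach: ["ANTHROPIC_API_KEY"],
  teacherLogin: ["GOOGLE_CLIENT_ID", "JWT_SECRET", "SESSION_SECRET"],
};

// Return the names of required variables that are unset or empty for a feature.
function missingKeys(env: Record<string, string | undefined>, feature: Feature): string[] {
  return REQUIRED_KEYS[feature].filter((name) => !env[name]);
}
```

A route handler could call missingKeys before contacting the upstream service and return a descriptive error instead of a generic 500.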

Running locally

The combined dev command starts Vite (web) and wrangler dev (API) side by side:

npm run dev

Vite proxies API calls in dev; in production the Worker serves both the API and the built workers/api/dist assets via the ASSETS binding.
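
The production split can be pictured as a fetch handler that answers API paths itself and hands everything else to the ASSETS binding. This is a minimal sketch, not the repository's router: the /api/ prefix, isApiPath, and handleApi are assumptions, and the interfaces are reduced stand-ins for the Workers runtime types.

```typescript
// Minimal stand-in for the static-assets binding, which exposes fetch().
interface AssetsBinding {
  fetch(request: Request): Promise<Response>;
}

interface Env {
  ASSETS: AssetsBinding;
}

// Assumption: API routes live under /api/; the real prefix may differ.
function isApiPath(pathname: string): boolean {
  return pathname === "/api" || pathname.startsWith("/api/");
}

// Hypothetical placeholder for the real API router.
async function handleApi(_request: Request): Promise<Response> {
  return new Response(JSON.stringify({ ok: true }), {
    headers: { "content-type": "application/json" },
  });
}

// A Worker module would export this object as its default export.
const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { pathname } = new URL(request.url);
    // API paths are handled in the Worker; all other paths fall through to built assets.
    return isApiPath(pathname) ? handleApi(request) : env.ASSETS.fetch(request);
  },
};
```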

Run them individually if you prefer:

npm run dev:frontend   # vite only
npm run dev:api        # wrangler dev only

Pages

Each HTML entrypoint under frontend/ is its own page:

  • / (index.html) - landing
  • /object-box.html - object-box lesson flow
  • /pad.html - free-form handwriting pad
  • /teacher-login.html, /classroom-setup.html, /session-console.html, /analytics-dashboard.html - teacher portal
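
Multi-page Vite builds usually list each HTML entrypoint under build.rollupOptions.input. The sketch below derives that map from the pages listed above. The pageInputs helper and the relative "frontend" root are illustrative; the repository's actual Vite config is not shown here, and real configs often resolve absolute paths instead.

```typescript
// Page names taken from the list above, without the .html extension.
const PAGES = [
  "index",
  "object-box",
  "pad",
  "teacher-login",
  "classroom-setup",
  "session-console",
  "analytics-dashboard",
];

// Hypothetical helper: map each page name to its HTML entry file.
function pageInputs(root: string, pages: string[]): Record<string, string> {
  const inputs: Record<string, string> = {};
  for (const page of pages) {
    inputs[page] = `${root}/${page}.html`;
  }
  return inputs;
}

// Shape of a Vite config object using Rollup's multi-entry input option.
const config = {
  build: {
    rollupOptions: { input: pageInputs("frontend", PAGES) },
  },
};
```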

Type checking

npm run typecheck          # both projects
npm run typecheck:frontend
npm run typecheck:api

Building & deploying

npm run build              # data:filter + vite build -> workers/api/dist
npm run deploy             # build + wrangler deploy

wrangler deploy uses the D1 database and KV namespace IDs in workers/api/wrangler.jsonc. To target your own Cloudflare account:

  1. npx wrangler d1 create phoneme --config workers/api/wrangler.jsonc and paste the returned database_id into the config.
  2. npx wrangler kv namespace create G2P and replace the id / preview_id accordingly.
  3. Apply migrations remotely: npx wrangler d1 migrations apply phoneme --remote --config workers/api/wrangler.jsonc.
  4. Seed KV remotely (drop the --local flag from the seed:kv script; if your Wrangler version defaults KV commands to local storage, pass --remote instead).
  5. npx wrangler secret put ... for every required secret listed above.

Regenerating data (optional)

The repo already ships derived artifacts (uji_strokes.json, lib/templates.json, g2p.kv.json), so most contributors never need to run these.

python3 data/parse_uji.py      # raw UJI dump -> uji_strokes.json
npm run templates              # medoid templates -> lib/templates.json

Troubleshooting

  • filter_writers: cannot read uji_strokes.json - run python3 data/parse_uji.py to regenerate it from uji+pen+characters/.
  • API returns 500 on TTS / scoring - usually a missing secret in workers/api/.dev.vars; check the feature table above.
  • D1 errors on first request - migrations weren't applied; rerun the wrangler d1 migrations apply ... --local command.
  • KV lookups return null - the local G2P namespace wasn't seeded; run npm run seed:kv.

About

Google Playa Vista Break it Build it Hackathon 1st Place Winner + Most Creative
