fix(server): CJK-aware charWidth — fixes box alignment on zh/ja/ko terminals #9

Merged
nanami-he merged 1 commit into main from fix/cjk-ambiguous-width
Apr 29, 2026

@nanami-he
Owner

What broke

On a Chinese-locale Windows 11 box (system code page 936), /pet browse returns visibly broken boxes:

╭────────────────────────────────────╮
│ 1. Owl                             │
├────────────────────────────────────┤
│      {◉,◉}                         │   ← right │ drifts past ╮
│      /)_)                          │
│    ——" "——                         │   ← same here
├────────────────────────────────────┤
│ Quiet, curious — the scholar who … │
╰────────────────────────────────────╯

The art uses several Unicode "Ambiguous East Asian Width" characters: ◉ (U+25C9), — (U+2014), ✦/✧, ★, Λ, ×. Per UAX #11, these render as 2 columns on East Asian locale terminals (Chinese/Japanese/Korean Windows Terminal, iTerm2 under zh/ja/ko, …) and 1 column elsewhere. charWidth() was always returning 1, so stringWidth() under-counted, pad() added too many spaces, and the right border ended up too far out.
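To make the drift concrete, here is a minimal sketch of the pre-fix arithmetic. The helper names (`naiveCharWidth`, `naivePad`, `renderedColumns`), the inner width of 36, and the set of characters drawn 2-wide are assumptions for illustration; they are not taken from server/utils.ts.

```typescript
// Hypothetical stand-in for the pre-fix behaviour: every character counts as 1 column.
const naiveCharWidth = (_ch: string): number => 1;

function naiveStringWidth(s: string): number {
  let total = 0;
  for (const ch of s) total += naiveCharWidth(ch);
  return total;
}

// Pads with trailing spaces until the *counted* width reaches `width`.
function naivePad(s: string, width: number): string {
  return s + " ".repeat(Math.max(0, width - naiveStringWidth(s)));
}

// Assumed model of a CJK-locale terminal: these art characters occupy 2 columns.
const DRAWN_WIDE = new Set(["◉", "—", "★", "✦", "✧"]);

function renderedColumns(s: string): number {
  let total = 0;
  for (const ch of s) total += DRAWN_WIDE.has(ch) ? 2 : 1;
  return total;
}

const INNER_WIDTH = 36; // assumed inner width of the card

const eyesLine = naivePad("      {◉,◉}", INNER_WIDTH);
const wingsLine = naivePad("    ——\" \"——", INNER_WIDTH);

// Each 2-wide character pushes the right border one column too far.
const eyesDrift = renderedColumns(eyesLine) - INNER_WIDTH;   // 2
const wingsDrift = renderedColumns(wingsLine) - INNER_WIDTH; // 4
```

Under these assumptions the drift equals the number of Ambiguous-width characters on the line, which is why only the art rows are misaligned.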

Fix

server/utils.ts now detects the locale and, when it is CJK (zh/ja/ko), treats the relevant Ambiguous-width ranges as 2 cols. Detection checks, in order:

  • POSIX env vars (LC_ALL / LC_CTYPE / LANG / LANGUAGE) — Mac/Linux happy path
  • Intl.DateTimeFormat().resolvedOptions().locale — Windows fallback (env vars are usually unset there but the system locale is e.g. zh-CN)
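The two-step detection could look roughly like the sketch below. The function name `isCjkLocale`, its parameters, and the regular expression are hypothetical; the PR does not show the real implementation. The sketch also assumes that the first non-empty POSIX variable decides and that the Intl locale is consulted only when none is set.

```typescript
// Matches locale tags such as "zh_CN.UTF-8", "ja-JP", or plain "ko" (assumed pattern).
const CJK_LANG = /^(zh|ja|ko)([_.\-@]|$)/i;

function isCjkLocale(
  env: Record<string, string | undefined>,
  intlLocale: string = Intl.DateTimeFormat().resolvedOptions().locale,
): boolean {
  // 1. POSIX variables, in the order listed above.
  for (const name of ["LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE"]) {
    const value = env[name];
    if (value) return CJK_LANG.test(value);
  }
  // 2. Fallback for Windows, where those variables are usually unset.
  return CJK_LANG.test(intlLocale);
}
```

In a Node or Bun process this would be called as `isCjkLocale(process.env)`. Passing both inputs explicitly keeps the check testable on any machine.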

When CJK is detected, these ranges become 2-wide:

| Range | Block | Examples used in art |
| --- | --- | --- |
| 0x2010–0x2027, 0x2030–0x205E | General Punctuation | — |
| 0x2150–0x218F | Number Forms | |
| 0x2190–0x21FF | Arrows | |
| 0x2200–0x22FF | Mathematical Operators | Λ (technically Greek but kept narrow) |
| 0x2580–0x25FF | Block Elements + Geometric Shapes | ◉ |
| 0x2600–0x26FF | Misc Symbols | ★ |
| 0x2700–0x27BF | Dingbats | ✦, ✧ |

Box-drawing (0x2500–0x257F) is intentionally excluded — every mainstream terminal special-cases box drawings to 1-wide for TUI sanity even under a CJK locale. Including them here would double the border width.
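As a rough illustration of the range check, the sketch below copies the ranges listed above and leaves box-drawing out simply by not listing it. The names `AMBIGUOUS_WIDE_RANGES` and `charWidthSketch` are hypothetical, and the sketch ignores anything else the real charWidth() may handle (for example full-width ideographs).

```typescript
// Ranges copied from this PR's description; 0x2500–0x257F (box-drawing) is deliberately absent.
const AMBIGUOUS_WIDE_RANGES: ReadonlyArray<readonly [number, number]> = [
  [0x2010, 0x2027], // General Punctuation (first part)
  [0x2030, 0x205e], // General Punctuation (second part)
  [0x2150, 0x218f], // Number Forms
  [0x2190, 0x21ff], // Arrows
  [0x2200, 0x22ff], // Mathematical Operators
  [0x2580, 0x25ff], // Block Elements + Geometric Shapes
  [0x2600, 0x26ff], // Misc Symbols
  [0x2700, 0x27bf], // Dingbats
];

function charWidthSketch(ch: string, cjk: boolean): number {
  const cp = ch.codePointAt(0);
  if (cp === undefined) return 0; // empty string
  if (!cjk) return 1;             // non-CJK locale: everything here stays narrow
  return AMBIGUOUS_WIDE_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi) ? 2 : 1;
}
```

With these ranges, `◉`, `★`, and `✦` come out 2-wide under CJK, while `─` and `│` stay 1-wide. `Λ` (U+039B) lies outside every listed range, so it also stays narrow.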

charWidth / stringWidth / padDisplay all gained an optional explicit cjk parameter (defaulting to the auto-detected locale) so tests are deterministic regardless of where the suite happens to run.
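The shape of that optional parameter might look like the following sketch. The signatures are assumptions, `DETECTED_CJK` is a hard-coded stand-in for the auto-detected flag, and the width rule is reduced to a single range to keep the example short.

```typescript
// Stand-in for the auto-detected flag (the PR names it IS_CJK_LOCALE); hard-coded here.
const DETECTED_CJK: boolean = false;

// Simplified rule for illustration: only Block Elements + Geometric Shapes are widened.
function charWidth(ch: string, cjk: boolean = DETECTED_CJK): number {
  const cp = ch.codePointAt(0) ?? 0;
  return cjk && cp >= 0x2580 && cp <= 0x25ff ? 2 : 1;
}

function stringWidth(s: string, cjk: boolean = DETECTED_CJK): number {
  let total = 0;
  for (const ch of s) total += charWidth(ch, cjk);
  return total;
}

// Assumed signature: pad with trailing spaces up to `width` display columns.
function padDisplay(s: string, width: number, cjk: boolean = DETECTED_CJK): string {
  return s + " ".repeat(Math.max(0, width - stringWidth(s, cjk)));
}

const eyes = "{◉,◉}";
const wide = padDisplay(eyes, 10, true);    // 7 columns of art + 3 spaces
const narrow = padDisplay(eyes, 10, false); // 5 columns of art + 5 spaces
```

Because the flag is passed explicitly, both branches can be asserted from the same test run, whatever locale the machine reports.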

Test plan

  • bun test — 305 → 307 pass (added explicit non-CJK + CJK + box-drawing cases)
  • Smoke test on Win11 (zh-CN, code page 936): charWidth('◉') === 2, charWidth('─') === 1, IS_CJK_LOCALE === true. animalCard() for Owl now produces a card whose right border lines up with the top.
  • Sanity check on macOS: confirm a non-CJK locale still returns 1 for the Ambiguous-width art characters and existing 12-wide art lines stay 12-wide.
  • After merge: npx petsonality@<next> /pet browse on a fresh CJK Windows box.

Out of scope (separate PR)

bun run build on Win11 still fails at build:art until #7 lands — that's an independent path-resolution bug, not coupled to this fix.

🤖 Generated with Claude Code

…terminals

Pet cards use box-drawing borders with content padded by stringWidth().
A handful of art characters — `◉` (U+25C9), `—` (U+2014), `✦/✧`, `★`,
`Λ`, `×` — fall in Unicode's "Ambiguous East Asian Width" class. On
CJK-locale terminals (Chinese/Japanese/Korean Windows Terminal, iTerm2
under zh/ja/ko, etc.) those render as 2 columns, but charWidth() was
returning 1 — so padding under-counted and the right `│` drifted past
the top `╮`. Reproduced visually on Win11 with system code page 936.

Detection order:
  1. POSIX env vars (LC_ALL / LC_CTYPE / LANG / LANGUAGE) — covers Mac
     and Linux.
  2. Intl.DateTimeFormat resolved locale — covers Windows where the env
     vars are usually unset but the system locale is e.g. "zh-CN".

When the locale resolves to zh/ja/ko, charWidth() additionally treats
General Punctuation, Geometric Shapes, Misc Symbols, Dingbats, Arrows,
and Math Operators as 2 cols. Box-drawing (0x2500–0x257F) is
intentionally excluded because every mainstream terminal special-cases
those to 1-wide for TUI sanity even under a CJK locale — including them
would double the border width.

charWidth/stringWidth/padDisplay now take an optional explicit `cjk`
parameter (defaulting to the auto-detected locale) so tests are
deterministic regardless of where the suite happens to run.

Surfaced via /pet browse on Win11 zh-CN — Owl/Labrador/Lion/etc. cards
visibly broken in user report.
@nanami-he nanami-he merged commit 94693f3 into main Apr 29, 2026
1 check passed
@nanami-he
Owner Author

shipped in v0.4.4. CJK alignment now correct on zh/ja/ko terminals.
