EnglishG2P.retokenize crashes with "Index out of range" on any currency symbol followed by a number
Summary
EnglishG2P.retokenize(_:) traps with Fatal error: Index out of range (Swift/ContiguousArrayBuffer.swift, EXC_BREAKPOINT/SIGTRAP) when phonemizing any text that contains a $, £, or € immediately followed by a number — e.g. "$100", "£19.99", "€5". This is deterministic and reproducible. Because phonemization happens on the live render path, it crashes the host app (we hit it via KokoroSwift in an iOS TTS app — any paragraph mentioning a dollar amount kills the app).
Root cause
In retokenize, the outer loop is for (i, token) in tokens.enumerated() over the parameter array, but inside the loop body a local var tokens: [MToken] shadows the parameter. In the currency branch:
} else if currency != nil {
    if token.tag != .number {
        currency = nil
    } else if j + 1 == tokens.count && (i + 1 == tokens.count || tokens[i + 1].tag != .number) {
        //                                                       ^^^^^^^^^^^^^
        // `i` is the OUTER loop index, but `tokens` here is the inner
        // (shadowing) array, so tokens[i + 1] is out of range whenever
        // i + 1 exceeds the inner array's count (the common case).
        token._.currency = currency
    }
}
tokens[i + 1] subscripts the inner, shadowing array using the outer loop index i. The guard i + 1 == tokens.count compares that outer index against the inner array's count, so it short-circuits only when the two happen to coincide. As soon as i + 1 exceeds the inner count, tokens[i + 1] is evaluated and traps.
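The shadowing pattern can be sketched in isolation. This is not MisakiSwift code: Tag, outOfRangeIndices, and the one-sub-token-per-token split are hypothetical stand-ins, and the currency state tracking is left out. It only reports where the inner-array subscript would be out of range instead of trapping.

```swift
// Hypothetical stand-in for MToken tags; not the library's type.
enum Tag { case currency, number, other }

// Mirrors the loop shape described above: the parameter `tokens` is
// shadowed by a local `tokens` inside the loop body.
func outOfRangeIndices(_ tokens: [Tag]) -> [Int] {
    var hits: [Int] = []
    for (i, token) in tokens.enumerated() {
        // Inner array that shadows the parameter.
        // Assumption: each token splits into exactly one sub-token.
        let tokens = [token]
        for (j, sub) in tokens.enumerated() where sub == .number {
            // Same guard shape as in the issue: the outer index `i` is
            // compared against the INNER count.
            if j + 1 == tokens.count && i + 1 != tokens.count {
                // The real code would evaluate tokens[i + 1] here.
                if !tokens.indices.contains(i + 1) { hits.append(i) }
            }
        }
    }
    return hits
}

// Tag sequence loosely modelled on "It costs $100."
let sample: [Tag] = [.other, .other, .currency, .number, .other]
print(outOfRangeIndices(sample)) // [3]
```

For the sample sequence the number sits at outer index 3 while the inner array holds a single element, so index 4 is out of range for it.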
Reproduction
import MisakiSwift

let g2p = EnglishG2P(british: false)
_ = g2p.phonemize(text: "It costs $100.") // crashes on MisakiSwift 1.0.3
(The trigger is a currency symbol from Lexicon.currencies ($, £, or €) directly before a number token, so that currency is non-nil when the number is processed.)
Stack trace (from a device crash, MisakiSwift 1.0.3)
Array._checkSubscript → Array.subscript.getter
MisakiSwift EnglishG2P.retokenize(_:)
MisakiSwift EnglishG2P.phonemize(text:performPreprocess:)
KokoroSwift MisakiG2PProcessor.process(input:)
KokoroSwift KokoroTTS.phonemizeText → generateAudio
Good news / the ask
This appears already fixed on the default branch — the inner array was renamed tokens → subtokens, un-shadowing the parameter so tokens[i + 1] correctly indexes the outer array. However, there is no tagged release containing the fix (latest tag is 1.0.6; KokoroSwift/kokoro-ios currently resolves MisakiSwift 1.0.3, which still crashes).
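For comparison, a minimal sketch of the un-shadowed shape. It assumes the renamed inner array is also the one used in the j + 1 check, which the description above does not confirm. Kind and currencyTaggedIndices are hypothetical names, not the library's API.

```swift
// Hypothetical stand-in for MToken tags; not the library's type.
enum Kind { case currency, number, other }

// With the inner array renamed to `subtokens`, `tokens` refers to the
// parameter again, so the guard and the subscript use the same array.
func currencyTaggedIndices(_ tokens: [Kind]) -> [Int] {
    var tagged: [Int] = []
    var currencySeen = false
    for (i, token) in tokens.enumerated() {
        // Assumption: each token splits into exactly one sub-token.
        let subtokens = [token]
        for (j, sub) in subtokens.enumerated() {
            if sub == .currency {
                currencySeen = true
            } else if currencySeen {
                if sub != .number {
                    currencySeen = false
                } else if j + 1 == subtokens.count
                            && (i + 1 == tokens.count || tokens[i + 1] != .number) {
                    // Last number token after a currency symbol.
                    tagged.append(i)
                }
            }
        }
    }
    return tagged
}

let fixedSample: [Kind] = [.other, .other, .currency, .number, .other]
print(currencyTaggedIndices(fixedSample)) // [3]
```

Here i + 1 == tokens.count is a real bounds guard for tokens[i + 1], so the subscript is only reached when the index is valid.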
Could you cut a release that includes the retokenize un-shadowing fix? That would let downstream consumers (KokoroSwift, and apps built on it) pin a non-crashing version without vendoring.
Possibly related: #4 (closed, "index out of range").
Thanks for MisakiSwift — it's great to have a native Swift G2P.