@jahorton
Contributor

@jahorton jahorton commented Nov 5, 2025

After much examination, I believe that to identify subsets of user input that converge to a common prefix for future extensions, we need to track and reference the right-hand index used to split an input transform whenever such a split occurs. This will also help identify cases where a previously split input should be remerged: in such cases, the earlier component's "split" index (.end) will match the later component's .start index.
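As a rough illustration of the remerge condition described above, here is a minimal TypeScript sketch. The `InputSegment` shape shown is an assumption based only on the field names mentioned in this description (`.transitionId`, `.start`, `.end`), and `canRemerge` is a hypothetical helper, not a function from the codebase.

```typescript
// Hypothetical shape: only the field names come from the description above;
// the real InputSegment type may carry more data or differ in detail.
interface InputSegment {
  transitionId: number; // identifies the keystroke (input transform) this segment came from
  start: number;        // left-hand split index within that input
  end: number;          // right-hand split index within that input
}

// Two pieces of the same input are candidates for remerging when the earlier
// piece stops exactly where the later piece begins.
function canRemerge(earlier: InputSegment, later: InputSegment): boolean {
  return earlier.transitionId === later.transitionId
    && earlier.end === later.start;
}
```

Without a recorded right-hand index, the earlier piece would have no `.end` to compare against, which is why the check depends on tracking it at split time.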

To clarify the requirements for considering subsets of user input to converge:

  1. The text represented by each token (and its SearchSpace) must have the same codepoint length.
  2. The underlying InputSegment properties for each input must cover exactly the same range.
    1. They should share the exact same set of .transitionId entries.
    2. For each .transitionId entry, the .start and .end values must match precisely.
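The requirements above could be checked roughly as follows. This is a sketch under stated assumptions, not the implementation in this PR: `TokenCandidate`, `codepointLength`, `coversSameRange`, and `converges` are hypothetical names, the `InputSegment` shape is inferred from the field names above, and segments are assumed to be stored in keystroke order so that a position-by-position comparison is sufficient.

```typescript
// Hypothetical shapes for illustration only.
interface InputSegment {
  transitionId: number;
  start: number;
  end: number;
}

interface TokenCandidate {
  text: string;           // text represented by the token
  inputs: InputSegment[]; // assumed to be in keystroke order
}

// Counts Unicode codepoints rather than UTF-16 code units.
function codepointLength(text: string): number {
  return Array.from(text).length;
}

// Requirement 2: same set of transitionId entries, with matching start/end for each.
function coversSameRange(a: InputSegment[], b: InputSegment[]): boolean {
  if (a.length !== b.length) {
    return false;
  }
  return a.every((seg, i) => {
    const other = b[i];
    return other !== undefined
      && seg.transitionId === other.transitionId
      && seg.start === other.start
      && seg.end === other.end;
  });
}

// Requirements 1 and 2 combined.
function converges(a: TokenCandidate, b: TokenCandidate): boolean {
  return codepointLength(a.text) === codepointLength(b.text)
    && coversSameRange(a.inputs, b.inputs);
}
```

Comparing codepoint lengths rather than `string.length` matters for text outside the Basic Multilingual Plane, where a single codepoint occupies two UTF-16 code units.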

To restate this more plainly:

  1. They should represent the same total amount of text, since future delete-left operations must have the same net effect no matter how much text is deleted in total.
  2. They should represent exactly the same keystrokes, and the same portions of those keystrokes when only part of a keystroke applies.
    • Applying a suggestion based on a tokenization (and its SearchSpace) directly replaces this range of the input, so the range should be clearly defined with consistent boundaries.
    • If this process replaces only part of a keystroke, knowing that may prove useful for other adjustments, such as widening the suggestion's range of coverage (because it affects multiple tokens).

Build-bot: skip build:web
Test-bot: skip

@keymanapp-test-bot

keymanapp-test-bot bot commented Nov 5, 2025

User Test Results

Test specification and instructions

User tests are not required

Test Artifacts

  • Web
    • KeymanWeb Test Home - build : all tests passed (no artifacts on BuildLevel "build")

@keymanapp-test-bot keymanapp-test-bot bot changed the title from "change(web): track right-hand split index for input source of tokenized transforms" to "change(web): track right-hand split index for input source of tokenized transforms 🚂" Nov 5, 2025
@keymanapp-test-bot keymanapp-test-bot bot added this to the A19S15 milestone Nov 5, 2025
@github-actions github-actions bot added the web/, web/predictive-text/, and change (Minor change in functionality, but not new) labels Nov 5, 2025
@keyman-server keyman-server modified the milestones: A19S15, A19S16 Nov 8, 2025
@jahorton jahorton force-pushed the refactor/web/rename-and-doc-pending-tokenization branch from 6daa809 to 461fa4d Compare November 10, 2025 21:03
@jahorton jahorton force-pushed the change/web/track-righthand-split-index branch from e65ad7c to 5f1ce79 Compare November 10, 2025 21:12
@jahorton jahorton changed the base branch from refactor/web/rename-and-doc-pending-tokenization to refactor/web/search-path-splitting November 10, 2025 21:12
@jahorton jahorton force-pushed the refactor/web/search-path-splitting branch from 2fc98eb to e6cbc29 Compare November 11, 2025 22:06
@jahorton jahorton force-pushed the change/web/track-righthand-split-index branch from 220bf34 to b51dbc3 Compare November 11, 2025 22:24
@keyman-server keyman-server modified the milestones: A19S16, A19S17 Nov 22, 2025
@keyman-server keyman-server modified the milestones: A19S17, A19S18 Dec 6, 2025
@keyman-server keyman-server modified the milestones: A19S18, A19S19 Dec 21, 2025
Labels

change (Minor change in functionality, but not new), epic-autocorrect, web/predictive-text/, web/

Projects

Status: Todo

3 participants