@Nepomuk5665
Contributor

Summary

  • Fixes O(n²) performance issue in chunkOnModifiedUtf8ByteSize that caused significant build time overhead
  • Implements incremental byte size calculation instead of creating substrings for each character
  • Adds a helper function to calculate Modified UTF-8 byte size per character

Background

This addresses the performance issue reported in #350. The original implementation created a new substring and converted it to a byte array for every character in the input string:

// Before: O(n²) - substring + toByteArray for each iteration
val charModifiedUtf8ByteArraySize =
    this
        .substring(nextChunkStart, i + 1)
        .let { chunk -> chunk.toByteArray().size + chunk.count { char -> char == '\u0000' } }

This is now replaced with an O(n) approach that calculates byte sizes incrementally:

// After: O(n) - direct calculation per character
val charByteSize = this[i].modifiedUtf8ByteSize()
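
To illustrate the incremental idea, here is a minimal sketch of a chunking loop that keeps a running byte count instead of re-encoding a substring on every iteration. The names chunkByModifiedUtf8Size, byteSizeOf, and maxBytes are hypothetical and are not taken from the PR; the real chunkOnModifiedUtf8ByteSize may have a different signature and may handle edge cases (for example, keeping surrogate pairs together) that this sketch ignores.

```kotlin
// Assumed helper for illustration only: size of one UTF-16 code unit in Modified UTF-8.
fun byteSizeOf(ch: Char): Int =
    when (ch.code) {
        0 -> 2                  // U+0000 is written as C0 80
        in 0x0001..0x007F -> 1
        in 0x0080..0x07FF -> 2
        else -> 3               // U+0800..U+FFFF
    }

// Hypothetical chunker: splits the receiver so that no chunk exceeds maxBytes
// when encoded as Modified UTF-8. Each character is inspected once, so the
// size bookkeeping is O(n).
fun CharSequence.chunkByModifiedUtf8Size(maxBytes: Int): List<String> {
    require(maxBytes >= 3) { "maxBytes must fit at least one 3-byte character" }
    val chunks = mutableListOf<String>()
    var chunkStart = 0
    var chunkBytes = 0
    for (i in indices) {
        val charBytes = byteSizeOf(this[i])
        if (chunkBytes + charBytes > maxBytes) {
            // Close the current chunk before the character that would overflow it.
            chunks += substring(chunkStart, i)
            chunkStart = i
            chunkBytes = 0
        }
        chunkBytes += charBytes
    }
    if (chunkStart < length) chunks += substring(chunkStart, length)
    return chunks
}
```

The only substring calls left are the ones that produce the final chunks, so the total copying is proportional to the input length.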

Technical Details

The new modifiedUtf8ByteSize() helper function handles Modified UTF-8 encoding rules:

  • U+0000 (null): 2 bytes (C0 80 encoding - special for Modified UTF-8)
  • U+0001-U+007F: 1 byte
  • U+0080-U+07FF: 2 bytes
  • U+0800-U+FFFF: 3 bytes
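
The rules above map directly onto a small extension function. The following is a sketch under the assumption that the helper is a Char extension; the function name comes from this description, but the body is an illustration and not necessarily the merged code.

```kotlin
// Sketch of the per-character rule table. The body is an assumption;
// only the name modifiedUtf8ByteSize() appears in the PR description.
fun Char.modifiedUtf8ByteSize(): Int =
    when (this.code) {
        0 -> 2                  // U+0000: encoded as C0 80 in Modified UTF-8
        in 0x0001..0x007F -> 1  // ASCII except null
        in 0x0080..0x07FF -> 2
        else -> 3               // U+0800..U+FFFF, including each surrogate half
    }
```

Because a Kotlin Char is a single UTF-16 code unit, each half of a surrogate pair lands in the 3-byte range, so a supplementary character counts as 6 bytes. That matches how Modified UTF-8 encodes such characters.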

Testing

All existing tests in CharSequenceUtf8Test pass, confirming behavioral correctness.
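
As an additional cross-check (a sketch, not one of the tests in CharSequenceUtf8Test), the per-character sizes can be compared against the JDK's own Modified UTF-8 encoder. java.io.DataOutputStream.writeUTF writes a 2-byte length prefix followed by the Modified UTF-8 bytes, so the encoded size is the buffer size minus 2. The function names below are hypothetical, and writeUTF only accepts strings whose encoded form is at most 65535 bytes.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.DataOutputStream

// Sum of per-character sizes, using the same rule table as the PR description.
fun incrementalModifiedUtf8Size(text: String): Int {
    var total = 0
    for (ch in text) {
        total += when (ch.code) {
            0 -> 2
            in 0x0001..0x007F -> 1
            in 0x0080..0x07FF -> 2
            else -> 3
        }
    }
    return total
}

// Reference value from the JDK encoder; subtracts the 2-byte length prefix.
fun referenceModifiedUtf8Size(text: String): Int {
    val buffer = ByteArrayOutputStream()
    DataOutputStream(buffer).use { it.writeUTF(text) }
    return buffer.size() - 2
}
```

If the two functions agree on inputs that include null characters, multi-byte characters, and surrogate pairs, the incremental calculation is consistent with the JDK encoding.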

The previous implementation created a new substring and converted it to
a byte array for every character in the input string, resulting in
O(n²) time complexity. This caused significant build time overhead as
reported in airbnb#350.

This fix calculates character byte sizes incrementally using the
Modified UTF-8 encoding rules directly, reducing time complexity to O(n).

The new modifiedUtf8ByteSize() helper function handles:
- U+0000 (null): 2 bytes (C0 80 encoding)
- U+0001-U+007F: 1 byte
- U+0080-U+07FF: 2 bytes
- U+0800-U+FFFF: 3 bytes

All existing tests pass, confirming behavioral correctness.
@rossbacher
Collaborator

This is great! Thank you very much!

Can you fix the CI issues? Just run ./gradlew check javaDoc and fix the lint issues by running formatKotlin on the broken module.

@rossbacher rossbacher left a comment

Thank you! Great fix!

@rossbacher rossbacher merged commit 1a91416 into airbnb:master Jan 23, 2026
1 check passed
renovate bot added a commit to keelim/all that referenced this pull request Jan 29, 2026
This PR contains the following updates:

| Package | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) |
|---|---|---|---|
| [com.airbnb:deeplinkdispatch-processor](https://redirect.github.com/airbnb/deeplinkdispatch) | `7.2.1` → `7.2.2` | ![age](https://developer.mend.io/api/mc/badges/age/maven/com.airbnb:deeplinkdispatch-processor/7.2.2?slim=true) | ![confidence](https://developer.mend.io/api/mc/badges/confidence/maven/com.airbnb:deeplinkdispatch-processor/7.2.1/7.2.2?slim=true) |
| [com.airbnb:deeplinkdispatch](https://redirect.github.com/airbnb/deeplinkdispatch) | `7.2.1` → `7.2.2` | ![age](https://developer.mend.io/api/mc/badges/age/maven/com.airbnb:deeplinkdispatch/7.2.2?slim=true) | ![confidence](https://developer.mend.io/api/mc/badges/confidence/maven/com.airbnb:deeplinkdispatch/7.2.1/7.2.2?slim=true) |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>airbnb/deeplinkdispatch (com.airbnb:deeplinkdispatch-processor)</summary>

### [`v7.2.2`](https://redirect.github.com/airbnb/DeepLinkDispatch/releases/tag/7.2.2): DeepLinkDispatch v7.2.2

[Compare Source](https://redirect.github.com/airbnb/deeplinkdispatch/compare/7.2.1...7.2.2)

##### What's Changed

- Update AGP to 8.13.2 by [@rossbacher](https://redirect.github.com/rossbacher) in [airbnb/DeepLinkDispatch#385](https://redirect.github.com/airbnb/DeepLinkDispatch/pull/385)
- Optimize chunkOnModifiedUtf8ByteSize from O(n²) to O(n) by [@Nepomuk5665](https://redirect.github.com/Nepomuk5665) in [airbnb/DeepLinkDispatch#386](https://redirect.github.com/airbnb/DeepLinkDispatch/pull/386)
- Upgrade GitHub Actions for Node 24 compatibility by [@salmanmkc](https://redirect.github.com/salmanmkc) in [airbnb/DeepLinkDispatch#390](https://redirect.github.com/airbnb/DeepLinkDispatch/pull/390)
- Centralize toolchain and target compatibility version. by [@rossbacher](https://redirect.github.com/rossbacher) in [airbnb/DeepLinkDispatch#388](https://redirect.github.com/airbnb/DeepLinkDispatch/pull/388)
- Fix chunking algo for surrogate pairs by [@rossbacher](https://redirect.github.com/rossbacher) in [airbnb/DeepLinkDispatch#389](https://redirect.github.com/airbnb/DeepLinkDispatch/pull/389)

##### New Contributors

- [@Nepomuk5665](https://redirect.github.com/Nepomuk5665) made their first contribution in [airbnb/DeepLinkDispatch#386](https://redirect.github.com/airbnb/DeepLinkDispatch/pull/386)
- [@salmanmkc](https://redirect.github.com/salmanmkc) made their first contribution in [airbnb/DeepLinkDispatch#390](https://redirect.github.com/airbnb/DeepLinkDispatch/pull/390)

**Full Changelog**: <airbnb/DeepLinkDispatch@7.2.1...7.2.2>

</details>
