Skip to content

Conversation

@qkaiser
Copy link
Contributor

@qkaiser qkaiser commented Dec 15, 2025

The discriminator previously used str(random()) pieces, which could emit scientific notation (e.g. .756153012152602e-05) and generate invalid BOM ref fragments.

  • Switch to uuid4().int for each segment so we always produce integer-only values while maintaining uniqueness.
  • Add a regression test to lock the BomRef.. format.

uuid4() gives ~122 bits of entropy (non-crypto), compared to ~53 bits from random.random(), and the prior decimal-string slicing wasted some of those bits while allowing scientific notation.

Went with uuid4 after exploring in-memory addresses like the JavaScript CycloneDX library and the PHP CycloneDX library which uses a time based approach.


A generalized approach for BomRef emission would be nice I think. Both from a resulting format and generation standpoint. Specifically:

  • PHP library uses uniqid which is time based and results in a timestamp in microseconds expressed in hex suffixed by a random digit
  • JavaScript uses in-memory addresses but also Math.random() which emits two dot separated base32 encoded value
  • Python is currently using math.random() which emits two dot separated 16 digits integer, sometimes in scientific notation

Instead of having custom implementations for each library offering support for CycloneDX, they could all rely on unique identifiers such as uuid4 which is natively supported by each language. This would offer strong guarantees of uniqueness along with a common format.

@qkaiser qkaiser requested a review from a team as a code owner December 15, 2025 11:58
The discriminator previously used str(random()) pieces, which could emit
scientific notation (e.g. 4.1e-05) and generate invalid BOM ref
fragments. Switch to uuid4().int for each segment so we always produce
integer-only values while maintaining uniqueness. Add a regression test
to lock the BomRef.<int>.<int> format. uuid4() gives ~122 bits of
entropy (non-crypto), compared to ~53 bits from random.random(), and the
prior decimal-string slicing wasted some of those bits while allowing
scientific notation.

Signed-off-by: Quentin Kaiser <quentin.kaiser@onekey.com>
@qkaiser qkaiser force-pushed the fix/bom-ref-discriminator branch from fb05b8c to b0a9f50 Compare December 15, 2025 11:59
@jkowalleck
Copy link
Member

Thanks for the report. I do have some questiosn

[...] generate invalid BOM ref fragments

invalid to what?

Instead of having custom implementations for each library offering support for CycloneDX, they could all rely on unique identifiers such as uuid4 which is natively supported by each language. This would offer strong guarantees of uniqueness along with a common format.

why exactly?

I dont see any reason to streamline a thing that has basically no rule nor format.
I mean, BomRef have only one constraint: being unique in the CycloneDX context(bom) they are used in.

Please explain why you feel this needs changing.

@jkowalleck jkowalleck added the question Further information is requested label Dec 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

question Further information is requested

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants