Strip top-level Ion annotation in KFX parse_entity_ion#19
Open
Imaclean74 wants to merge 2 commits into
Open
Conversation
Some KFX containers tag entities with an Ion type annotation, e.g.
\`\$490::{ ... }\`, where the outer Ion value is an Annotated wrapper
around the actual entity struct. Every callsite of \`parse_entity_ion\`
in \`import/kfx.rs\` then calls \`.as_struct()\` (directly or via
\`get_field()\`) on the returned value — which returns \`None\` for an
\`Annotated\`, silently dropping the entity's fields.
Strip the wrapper once at the parse boundary so all importer paths can
treat the returned value as the entity itself, consistent with how
\`bin/kfx-dump.rs\` already unwraps annotated values at every consumer.
This matches the Ion spec's treatment of annotations as schema-level
metadata that does not change the value's logical shape:
https://amazon-ion.github.io/ion-docs/docs/spec.html#annot
Adds a 641-byte (501 bytes gzipped) synthetic KFX fixture with a single `\$490`-annotated `book_metadata` entity carrying eight categorised metadata fields (title/author/publisher/language/ASIN/book_id/cde/issue_date). `tests/kfx_annotated_metadata.rs` opens the fixture via the public `Book::open` API and asserts that all metadata fields round-trip. Before the parent commit's fix, this test fails because the importer's `.as_struct()` calls return `None` for the annotated value and every field comes back empty. `examples/build_annotated_kfx_fixture.rs` is a small generator that uses the public `kfx::serialization` helpers to rebuild the fixture from synthetic data. Run with `cargo run --release --example build_annotated_kfx_fixture` to regenerate.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Some KFX containers tag entities with an Ion type annotation, e.g.
$490::{ ... }, where the outer Ion value is anAnnotatedwrapper aroundthe entity struct. Every callsite of
parse_entity_ioninimport/kfx.rsthen reaches for the struct via
.as_struct()(directly or viaget_field),which returns
Nonefor anAnnotatedvalue — silently dropping theentity's fields and producing a book with empty metadata.
We've observed this on multiple real-world KFX-ZIP packages: the inner
.azw.mdmetadata sidecar carries a$490::{ ... }-wrappedbook_metadataentity, so
title,authors,publisher,language,identifier, etc.all return their defaults until the wrapper is stripped. A synthetic
fixture reproducing the issue is included.
This patch strips the wrapper once at the parse boundary so all importer
paths can treat the returned value as the entity itself, consistent with how
bin/kfx-dump.rsalready unwraps annotated values at every consumer (~17manual unwraps in that binary).
Changes
src/import/kfx.rs:parse_entity_ionnow matches on the parsed valueand returns the inner value when it is an
IonValue::Annotated. Existingbehaviour for un-annotated values is unchanged.
The fix uses a move (not the existing
unwrap_annotated()accessor) toavoid a clone on the hot import path;
unwrap_annotated()returns areference and would force the caller to clone.
Test
tests/fixtures/annotated_metadata.kfx.gz(501 bytes) — a 641-bytesynthetic KFX container with a single
\$490::{ ... }-annotatedbook_metadataentity carrying eight synthetic categorised metadatafields. Built by
examples/build_annotated_kfx_fixture.rsusing onlythe public
kfx::serializationhelpers — no real-book provenance.tests/kfx_annotated_metadata.rs— opens the fixture via the publicBook::openAPI and assertsmetadata().title,authors,publisher,language,identifier,dateall round-trip. Before this patch's fix,the test fails with
left: \"\"/right: \"Annotated Entity Test Book\"on the first assertion.
Verification
Reference
Amazon Ion specification on annotations — annotations are schema-level
metadata and do not change a value's logical shape:
https://amazon-ion.github.io/ion-docs/docs/spec.html#annot