
fix: improvements #285

Closed
vinitkumar wants to merge 1 commit into master from feat/benchmarl

Conversation

@vinitkumar
Owner

@vinitkumar vinitkumar commented Apr 25, 2026

Summary by Sourcery

Improve JSON input readers and XML conversion behavior, expand tests for realistic HTTP and edge cases, and refresh documentation and benchmarks to reflect current performance and semantics.

New Features:

  • Add a shared JSONValue type alias and use it across reader and converter APIs for more accurate typing.
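
A minimal sketch of what such an alias could look like, assuming a plain recursive `Union` with the members listed in the class diagram further down. The actual definition in `json2xml/types.py` is not shown on this page, and the `is_json_value` helper is hypothetical, added only to make the shape checkable at runtime.

```python
from __future__ import annotations

from typing import Union

# Assumed shape of the alias; the real definition may differ.
JSONValue = Union[
    None, bool, int, float, str, list["JSONValue"], dict[str, "JSONValue"]
]


def is_json_value(value: object) -> bool:
    """Hypothetical runtime check that a value fits the JSONValue shape."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return True
    if isinstance(value, list):
        return all(is_json_value(item) for item in value)
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_json_value(item)
            for key, item in value.items()
        )
    return False
```

Widening the reader return types from `dict | list` to an alias like this lets top-level scalars such as `null`, `0`, or `"text"` type-check as valid JSON documents.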

Bug Fixes:

  • Fix dicttoxml handling of @attrs/@val so it no longer mutates caller data, correctly serializes None and boolean values, and avoids double-escaping invalid XML names or leaking @flat suffixes into tag names.
  • Treat only None as missing data in Json2xml so falsy JSON values like empty containers, zero, false, and empty strings still serialize to XML.
  • Strengthen readfromurl to use a shared connection pool with timeouts and to wrap network, HTTP status, decoding, and JSON parse failures consistently in URLReadError.
  • Ensure the CLI consistently reports input and parsing errors via a single helper that exits with a non-zero status.
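
The falsy-value fix appears to come down to swapping a truthiness check for an identity check against `None`. The sketch below contrasts the two using hypothetical helpers; it is not the actual `Json2xml.to_xml` code, and the truthiness-based "before" variant is an assumption inferred from the description.

```python
def has_data_truthy(data: object) -> bool:
    """Assumed earlier behavior: every falsy value is treated as missing."""
    return bool(data)


def has_data(data: object) -> bool:
    """Only None counts as missing; 0, False, "", [] and {} are real data."""
    return data is not None


falsy_values = [0, False, "", [], {}]

# Values a truthiness check would drop versus values the None check keeps.
dropped_before = [value for value in falsy_values if not has_data_truthy(value)]
kept_now = [value for value in falsy_values if has_data(value)]
```

With the identity check, all five falsy values survive and only `None` is skipped.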

Enhancements:

  • Replace mocked URL tests with a tiny real HTTP server fixture to exercise readfromurl against realistic responses and network failures.
  • Align Rust and Python dicttoxml XML name normalization semantics to keep fast-path and pure-Python outputs in sync.
  • Clarify behavior and performance expectations in the architecture, behavior, tests, and benchmarks docs, including updated environment details and measurements.
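
As a rough illustration of the "tiny real HTTP server" approach, the sketch below starts a `ThreadingHTTPServer` on a free local port and reads from it. It is an assumption-laden stand-in: the PR's reader uses urllib3, while this version swaps in the standard library's `urllib.request` to stay dependency-free, and `json_server`, `fetch_json`, the paths, and the local `URLReadError` class are illustrative rather than the project's actual code.

```python
import json
import threading
import urllib.error
import urllib.request
from contextlib import contextmanager
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class URLReadError(Exception):
    """Stand-in for the library's exception of the same name."""


class JsonTestHandler(BaseHTTPRequestHandler):
    """Serves valid JSON on /ok, broken JSON on /bad, and 404 elsewhere."""

    def do_GET(self) -> None:
        if self.path == "/ok":
            self._send(200, b'{"name": "bike"}')
        elif self.path == "/bad":
            self._send(200, b"{not json")
        else:
            self._send(404, b"{}")

    def _send(self, status: int, body: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args: object) -> None:
        pass  # keep test output quiet


@contextmanager
def json_server():
    """Start the server on a free local port and yield its base URL."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), JsonTestHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_port}"
    finally:
        server.shutdown()
        server.server_close()
        thread.join()


# Ignore proxy environment variables so the local server is reached directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_json(url: str, timeout: float = 5.0) -> object:
    """Illustrative reader: wraps HTTP, network, and parse errors uniformly."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, ValueError, OSError) as exc:
        raise URLReadError(f"Failed to read JSON from {url}: {exc}") from exc
```

In a pytest suite the context manager would typically be wrapped in a fixture, so each test receives the base URL and the server is shut down afterwards.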

Documentation:

  • Update benchmark documentation with current environment details, measured timings, and recommendations for Python, Rust, Go, and Zig usage.
  • Document input reader behavior, conversion rules for falsy JSON values, XML name normalization, and special attribute handling in the LAT docs.

Tests:

  • Add end-to-end tests for URL input using a real HTTP server and for network failure wrapping in URLReadError.
  • Extend dicttoxml tests to cover None and boolean @val handling, non-mutating @attrs/@val behavior, invalid XML name fallback escaping, @flat suffix semantics, and Rust/Python XML name parity.
  • Add Json2xml tests to ensure falsy JSON values still produce XML output.

Chores:

  • Clean up pytest configuration by removing an unused xvs option.

@sourcery-ai
Contributor

sourcery-ai Bot commented Apr 25, 2026

Reviewer's Guide

Refines JSON input reading, URL error handling, and dict-to-XML conversion semantics (including falsy values and special @attrs/@val behavior), aligns the Rust accelerator with Python name normalization, and tightens CLI/benchmark/docs behavior and types.

Sequence diagram for CLI input reading and unified error handling

sequenceDiagram
    actor User
    participant CLI
    participant Utils
    participant StdErr

    User->>CLI: invoke main with args
    CLI->>CLI: create_parser()
    CLI->>CLI: parse_args()
    CLI->>CLI: read_input(args)

    alt URL input
        CLI->>Utils: readfromurl(args.url)
        alt URL read succeeds
            Utils-->>CLI: JSONValue
        else URL read fails
            Utils-->>CLI: raise URLReadError
            CLI->>CLI: exit_with_error(message)
            CLI->>StdErr: print error message
            CLI-->>User: SystemExit(1)
        end
    else string input
        CLI->>Utils: readfromstring(args.string)
        alt string parse succeeds
            Utils-->>CLI: JSONValue
        else string parse fails
            Utils-->>CLI: raise StringReadError
            CLI->>CLI: exit_with_error(message)
            CLI->>StdErr: print error message
            CLI-->>User: SystemExit(1)
        end
    else file input
        CLI->>Utils: readfromjson(args.input_file)
        alt file read succeeds
            Utils-->>CLI: JSONValue
        else file read fails
            Utils-->>CLI: raise JSONReadError
            CLI->>CLI: exit_with_error(message)
            CLI->>StdErr: print error message
            CLI-->>User: SystemExit(1)
        end
    else stdin input
        CLI->>CLI: read_from_stdin()
        alt stdin has data
            CLI->>Utils: readfromstring(json_str)
            alt stdin JSON ok
                Utils-->>CLI: JSONValue
            else stdin JSON invalid
                Utils-->>CLI: raise StringReadError
                CLI->>CLI: exit_with_error(message)
                CLI->>StdErr: print error message
                CLI-->>User: SystemExit(1)
            end
        else stdin empty
            CLI->>CLI: exit_with_error("Error: Empty input")
            CLI->>StdErr: print error message
            CLI-->>User: SystemExit(1)
        end
    else no input provided
        CLI->>CLI: exit_with_error("No input provided")
        CLI->>StdErr: print error message
        CLI-->>User: SystemExit(1)
    end
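
Every failure branch in the diagram ends the same way: one helper prints to stderr and exits with status 1. A minimal sketch of that pattern follows; `exit_with_error` matches the name in the diagram, but its body is assumed, and `parse_or_exit` is a hypothetical caller that stands in for the CLI's input branches.

```python
import json
import sys
from typing import NoReturn


def exit_with_error(message: str) -> NoReturn:
    """Print the message to stderr and exit with a non-zero status."""
    print(message, file=sys.stderr)
    raise SystemExit(1)


def parse_or_exit(json_text: str) -> object:
    """Hypothetical caller: every failure goes through the one helper."""
    if not json_text.strip():
        exit_with_error("Error: Empty input")
    try:
        return json.loads(json_text)
    except json.JSONDecodeError as exc:
        exit_with_error(f"Error: {exc}")
```

Routing all branches through a single `NoReturn` helper keeps the exit code and the output stream consistent, and lets type checkers see that the code after the call is unreachable.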

Class diagram for JSONValue, input readers, serializer, and CLI

classDiagram
    class JSONValue {
        <<typealias>>
        None
        bool
        int
        float
        str
        list~JSONValue~
        dict~str, JSONValue~
    }

    class JSONReadError {
        +JSONReadError(message)
    }

    class URLReadError {
        +URLReadError(message)
    }

    class StringReadError {
        +StringReadError(message)
    }

    Exception <|-- JSONReadError
    Exception <|-- URLReadError
    Exception <|-- StringReadError

    class Utils {
        +readfromjson(filename str) JSONValue
        +readfromurl(url str, params dict~str,str~) JSONValue
        +readfromstring(jsondata object) JSONValue
    }

    class Json2xml {
        -data JSONValue
        -wrapper str
        -root bool
        -pretty bool
        -attr_type bool
        -item_wrap bool
        -cdata bool
        -list_headers bool
        +Json2xml(data JSONValue, wrapper str, root bool, pretty bool, attr_type bool, item_wrap bool, cdata bool, list_headers bool)
        +to_xml() Any
    }

    class DictToXmlEngine {
        +escape_xml(s str|int|float|numbers_Number|None) str
        +get_xml_type(val Any) str
        +make_valid_xml_name(key str, attr dict~str,Any~) tuple~str,dict~
        +dict2xml_str(item Any, item_name str, item_wrap bool, ids bool, attr_type bool, cdata bool, item_func function, list_headers bool) str
        +convert_dict(obj dict~str,Any~, parent str, ids bool, attr_type bool, cdata bool, item_func function, item_wrap bool, list_headers bool) str
    }

    class CLI {
        +exit_with_error(message str) NoReturn
        +create_parser() ArgumentParser
        +read_input(args Namespace) JSONValue
        +read_from_stdin() JSONValue
        +write_output(output str|bytes, output_file str) void
    }

    JSONValue <.. Utils : uses
    JSONValue <.. Json2xml : uses
    JSONValue <.. CLI : uses

    Utils ..> JSONReadError : raises
    Utils ..> URLReadError : raises
    Utils ..> StringReadError : raises

    Json2xml ..> DictToXmlEngine : calls dicttoxml
    CLI ..> Utils : reads JSON input
    CLI ..> Json2xml : converts JSON to XML
    CLI ..> DictToXmlEngine : via library API
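
The three reader errors in the diagram derive directly from `Exception`, and the change list says the readers preserve the original exception via chaining. A small sketch of that idea, assuming `raise ... from ...`; `read_from_string` and the error message are illustrative, not the library's `readfromstring`.

```python
import json


class StringReadError(Exception):
    """Stand-in for the library's exception of the same name."""


def read_from_string(jsondata: object) -> object:
    """Parse JSON text, chaining the parser's error for debugging."""
    if not isinstance(jsondata, str):
        raise StringReadError("Input is not a proper JSON string")
    try:
        return json.loads(jsondata)
    except json.JSONDecodeError as exc:
        # `from exc` keeps the parser's error reachable as __cause__.
        raise StringReadError("Input is not a proper JSON string") from exc
```

Callers still catch a single error type, while the traceback and `__cause__` retain the line and column reported by the JSON parser.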

File-Level Changes

Change | Details | Files
Strengthen JSON readers and CLI error handling with shared types and explicit failure wrapping.
  • Introduce a shared JSONValue type alias and use it in reader and converter APIs.
  • Update readfromjson/readfromstring to preserve original exceptions via chaining and broaden JSON value support.
  • Reimplement readfromurl around a shared urllib3 PoolManager with timeouts, network error wrapping, and stricter JSON/decoding handling.
  • Refactor CLI input reading to use JSONValue, add exit_with_error helper, and centralize stderr messaging and exit codes for URL/string/file/stdin errors.
  • Add stdin-empty handling via exit_with_error and keep read-from-URL behavior consistent with new URLReadError semantics.
json2xml/types.py
json2xml/utils.py
json2xml/cli.py
lat.md/behavior.md
lat.md/tests.md
tests/test_utils.py
Adjust dict-to-XML conversion semantics to better handle names, falsy values, special attributes, and @flat keys while avoiding input mutation.
  • Extend escape_xml to accept None and adjust make_valid_xml_name to avoid premature escaping and double-escaping while normalizing invalid names into .
  • Ensure make_valid_xml_name treats numeric strings and space replacement consistently and removes implicit @flat handling, delegating flattening to convert_dict.
  • Change dict2xml_str special @attrs/@val handling to work on copies instead of mutating caller data, and to serialize a None @val as empty text and bool values as lowercase text.
  • Update convert_dict to detect keys ending with @flat, strip the suffix for element names while keeping it in list item_name for list headers, and keep IDs behavior intact.
  • Add tests to cover @val None/bool behavior, non-mutation of special-attribute inputs, single-escape name fallback, @flat key behavior, falsy JSON conversion through Json2xml, and Rust/Python XML-name parity.
json2xml/dicttoxml.py
json2xml/json2xml.py
tests/test_dict2xml.py
tests/test_json2xml.py
tests/test_rust_dicttoxml.py
lat.md/architecture.md
lat.md/behavior.md
lat.md/tests.md
Improve URL tests by exercising real HTTP behavior and aligning them with new URLReadError semantics.
  • Add a JsonTestHandler and json_server pytest fixture backed by ThreadingHTTPServer to serve realistic JSON, invalid JSON, and error responses.
  • Rewrite readfromurl tests to use the real HTTP server instead of mocking urllib3.PoolManager, covering success, params, HTTP error, server error, invalid JSON, and network failures wrapped as URLReadError.
  • Integrate the server fixture into URL-to-XML conversion tests to assert end-to-end behavior through dicttoxml.
tests/test_utils.py
Refresh benchmarks, narrative docs, and pytest config to match current behavior and performance.
  • Update BENCHMARKS.md with new machine/OS/Python details, precise dataset sizes, revised timing tables, speedup calculations, and recommendations for Python/Rust/Go/Zig tradeoffs.
  • Add a short Performance benchmarks section in architecture docs summarizing the April 2026 results and guidance.
  • Clarify behavior docs to describe new readfromurl error handling and Json2xml treatment of None vs other falsy values.
  • Remove obsolete xvs option from pytest configuration.
BENCHMARKS.md
lat.md/architecture.md
lat.md/behavior.md
pyproject.toml
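
The @flat handling described in the change list above (detect the suffix, strip it from the element name) can be sketched as a small helper. `split_flat_key` and `FLAT_SUFFIX` are hypothetical names; the real logic lives inside `convert_dict` and may be structured differently.

```python
FLAT_SUFFIX = "@flat"


def split_flat_key(key: str) -> tuple[str, bool]:
    """Return (element_name, is_flat) for a dict key.

    Hypothetical helper: the suffix is stripped from the element name so
    that it never leaks into the emitted tag.
    """
    if key.endswith(FLAT_SUFFIX):
        return key[: -len(FLAT_SUFFIX)], True
    return key, False
```

Only a trailing suffix is treated as special in this sketch, so a key such as `flat@name` is left unchanged.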

Possibly linked issues

  • #Fix JSON conversion edge cases and URL reader robustness: The PR implements the issue’s JSON/XML edge-case fixes, URL reader robustness, typing, tests, and pytest config cleanup.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.


@codecov

codecov Bot commented Apr 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.07%. Comparing base (351041d) to head (f9eb3b3).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #285      +/-   ##
==========================================
+ Coverage   95.93%   96.07%   +0.14%     
==========================================
  Files           5        6       +1     
  Lines         467      484      +17     
==========================================
+ Hits          448      465      +17     
  Misses         19       19              
Flag Coverage Δ
unittests 96.07% <100.00%> (+0.14%) ⬆️
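
The percentages in the report follow directly from the hit and line counts in the diff table. A quick check of the arithmetic, with `coverage_percent` as an illustrative helper name:

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals like the report."""
    return round(100 * hits / lines, 2)


base = coverage_percent(448, 467)  # master: 448 of 467 lines hit
head = coverage_percent(465, 484)  # this PR: 465 of 484 lines hit
delta = round(head - base, 2)

print(base, head, delta)  # 95.93 96.07 0.14
```

Both sides have 19 missed lines (467 - 448 and 484 - 465), so the gain comes entirely from the 17 new lines all being covered.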

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment thread json2xml/cli.py

# @lat: [[behavior#Input readers]]
-def read_input(args: argparse.Namespace) -> dict[str, Any] | list[Any]:
+def read_input(args: argparse.Namespace) -> JSONValue:
Contributor

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 3 issues

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="json2xml/dicttoxml.py" line_range="340-346" />
<code_context>
         attr["type"] = get_xml_type(item)
-    val_attr: dict[str, str] = item.pop("@attrs", attr)  # update attr with custom @attr if exists
-    rawitem = item["@val"] if "@val" in item else item
+    val_attr = dict(item["@attrs"]) if "@attrs" in item else dict(attr)
+    if "@val" in item:
+        rawitem = item["@val"]
+    elif "@attrs" in item:
+        rawitem = {key: value for key, value in item.items() if key != "@attrs"}
+    else:
+        rawitem = item
     if is_primitive_type(rawitem):
-        if isinstance(rawitem, dict):
</code_context>
<issue_to_address>
**issue (bug_risk):** Custom `@attrs` now overwrite, rather than extend, the auto-generated attributes (e.g. `type`).

The previous `item.pop("@attrs", attr)` merged custom attributes into `attr` (including the auto-generated `type`), whereas the new code replaces `attr` with `item["@attrs"]`, dropping defaults such as `type`. If you want `@attrs` to override specific keys while keeping auto-generated ones, you could merge instead:

```python
val_attr = dict(attr)
if "@attrs" in item:
    val_attr.update(item["@attrs"])
```

This retains the prior behavior while still allowing overrides.
</issue_to_address>

### Comment 2
<location path="tests/test_dict2xml.py" line_range="680-689" />
<code_context>
+    def test_dicttoxml_val_none_emits_empty_element(self) -> None:
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for objects with `@attrs` but without `@val` to cover the branch that rebuilds `rawitem` from non‑`@attrs` keys

You already cover `@val` being `None`, booleans, and the case where `@attrs` and `@val` both exist. There’s still a branch in `dict2xml_str` where `"@attrs" in item` but `"@val"` is missing, and `rawitem` is built from the remaining keys. A test like `{"node": {"@attrs": {"id": "1"}, "child": "value"}}` (with `root=False`, `attr_type=False`) would exercise that path, confirm the `id` attribute is applied, ensure `child` renders as a nested element, and verify the input dict is not mutated.
</issue_to_address>

### Comment 3
<location path="tests/test_rust_dicttoxml.py" line_range="364-361" />
<code_context>
         assert xmldata is None

+    # @lat: [[tests#Conversion behavior#Falsy JSON values convert to XML]]
+    @pytest.mark.parametrize(
+        ("data", "expected"),
+        [
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding Rust/Python parity cases for keys using the `@flat` suffix

Since `convert_dict` now treats `@flat` keys specially, please add a few such cases (both scalar and nested dicts) to the parity tests so Rust and Python behavior stays aligned, e.g. `{"name@flat": "Bike"}` and `{"item@flat": {"name": "Bike"}}`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread json2xml/dicttoxml.py
Comment on lines +340 to +346
val_attr = dict(item["@attrs"]) if "@attrs" in item else dict(attr)
if "@val" in item:
    rawitem = item["@val"]
elif "@attrs" in item:
    rawitem = {key: value for key, value in item.items() if key != "@attrs"}
else:
    rawitem = item

issue (bug_risk): Custom @attrs now overwrite, rather than extend, the auto-generated attributes (e.g. type).

The previous item.pop("@attrs", attr) merged custom attributes into attr (including the auto-generated type), whereas the new code replaces attr with item["@attrs"], dropping defaults such as type. If you want @attrs to override specific keys while keeping auto-generated ones, you could merge instead:

val_attr = dict(attr)
if "@attrs" in item:
    val_attr.update(item["@attrs"])

This retains the prior behavior while still allowing overrides.
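
Wrapping the suggested merge in a function makes the intended behavior easy to check in isolation. `resolve_attrs` is a hypothetical name used only for this sketch; it is not part of `dicttoxml`.

```python
from typing import Any


def resolve_attrs(item: dict[str, Any], attr: dict[str, Any]) -> dict[str, Any]:
    """Merge custom @attrs over the auto-generated ones without mutating inputs."""
    val_attr = dict(attr)  # copy, so the caller's defaults stay untouched
    if "@attrs" in item:
        val_attr.update(item["@attrs"])  # custom keys win on conflict
    return val_attr
```

With this merge, an auto-generated `type` attribute survives unless `@attrs` explicitly sets its own `type`, and neither input dict is modified.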

Comment thread tests/test_dict2xml.py
Comment on lines +680 to +689
def test_dicttoxml_val_none_emits_empty_element(self) -> None:
    """Test @val=None serializes as empty text without leaking Python's repr."""
    result = dicttoxml.dicttoxml(
        {"field": {"@attrs": {"source": "api"}, "@val": None}},
        root=False,
        attr_type=False,
    )

    assert result == b'<field source="api"></field>'
    assert b"None" not in result

suggestion (testing): Add a test for objects with @attrs but without @val to cover the branch that rebuilds rawitem from non‑@attrs keys

You already cover @val being None, booleans, and the case where @attrs and @val both exist. There’s still a branch in dict2xml_str where "@attrs" in item but "@val" is missing, and rawitem is built from the remaining keys. A test like {"node": {"@attrs": {"id": "1"}, "child": "value"}} (with root=False, attr_type=False) would exercise that path, confirm the id attribute is applied, ensure child renders as a nested element, and verify the input dict is not mutated.
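
The branch in question can be isolated and checked without the full converter. The sketch below copies the three-way selection from the diff into a hypothetical `select_rawitem` helper and exercises it with the suggested input; it does not assert anything about the final XML output.

```python
from typing import Any


def select_rawitem(item: dict[str, Any]) -> Any:
    """Pick the payload to serialize, following the three branches in the diff."""
    if "@val" in item:
        return item["@val"]
    if "@attrs" in item:
        # Rebuild from the remaining keys instead of popping from the input.
        return {key: value for key, value in item.items() if key != "@attrs"}
    return item


node = {"@attrs": {"id": "1"}, "child": "value"}
payload = select_rawitem(node)
```

Here `payload` holds only the `child` entry, and `node` still contains its `@attrs` key afterwards, which is the non-mutation property the suggested test would verify.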

@@ -360,6 +360,27 @@ def test_very_large_integer_matches(self):
        rust, python = self.compare_outputs(data)
        assert rust == python

suggestion (testing): Consider adding Rust/Python parity cases for keys using the @flat suffix

Since convert_dict now treats @flat keys specially, please add a few such cases (both scalar and nested dicts) to the parity tests so Rust and Python behavior stays aligned, e.g. {"name@flat": "Bike"} and {"item@flat": {"name": "Bike"}}.

@vinitkumar vinitkumar closed this Apr 25, 2026

2 participants