Disassemble large XML files into smaller files and reassemble the original XML. Preserves the XML declaration, root namespace, and element order so that a full round-trip (disassemble → reassemble) reproduces the original file contents.
Note: This is a Rust implementation of the original TypeScript xml-disassembler.
- Quick start
- Features
- Installation
- Usage
- Disassembly strategies
- Ignore file
- Logging
- XML parser
- Testing
- License
- Contribution
# Disassemble: one XML → many small files
xml-disassembler disassemble path/to/YourFile.permissionset-meta.xml \
--unique-id-elements "application,apexClass,name,flow,object,recordType,tab,field" \
--format json \
--strategy unique-id
# Reassemble: many small files → one XML
xml-disassembler reassemble path/to/YourFile permissionset-meta.xml

- Disassemble – Split a single XML file (or directory of XML files) into many smaller files, grouped by structure.
- Reassemble – Merge disassembled files back into the original XML. Uses the XML declaration and root attributes from the disassembled files, with sensible defaults when missing.
- Multiple formats – Output (and reassemble from) XML, INI, JSON, JSON5, TOML, or YAML.
- Strategies – `unique-id` (one file per nested element) or `grouped-by-tag` (one file per tag).
- Ignore rules – Exclude paths via a `.xmldisassemblerignore` file (same style as `.gitignore`).
- Round-trip safe – Disassembled output includes the original XML declaration and `xmlns` on the root; reassembly preserves order and content so the result matches the source.
- Library API – Use `DisassembleXmlFileHandler`, `ReassembleXmlFileHandler`, `parse_xml`, and `build_xml_string` from your own Rust code.

- Install the Rust toolchain (rust-lang.org/tools/install).
- Run:
cargo install xml-disassembler
To build from source instead:
git clone https://github.com/mcarvin8/xml-disassembler-rust.git
cd xml-disassembler-rust
cargo build --release

The binary will be at target/release/xml-disassembler (or xml-disassembler.exe on Windows).
# Disassemble an XML file or directory (output written alongside the source)
xml-disassembler disassemble <path> [options]
# Reassemble a disassembled directory (writes one XML file next to the directory)
xml-disassembler reassemble <path> [extension] [--postpurge]
# Parse and rebuild a single XML file (useful for testing the parser)
xml-disassembler parse <path>

Disassemble options:

| Option | Description | Default |
|---|---|---|
| `--unique-id-elements <list>` | Comma-separated element names used to derive filenames for nested elements | (none) |
| `--prepurge` | Remove existing disassembly output before running | `false` |
| `--postpurge` | Delete original file/directory after disassembling | `false` |
| `--ignore-path <path>` | Path to the ignore file | `.xmldisassemblerignore` |
| `--format <fmt>` | Output format: `xml`, `ini`, `json`, `json5`, `toml`, `yaml` | `xml` |
| `--strategy <name>` | `unique-id` or `grouped-by-tag` | `unique-id` |

Reassemble options:

| Option | Description | Default |
|---|---|---|
| `<extension>` | File extension/suffix for the rebuilt XML (e.g. `permissionset-meta.xml`) | `xml` |
| `--postpurge` | Delete disassembled directory after successful reassembly | `false` |

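The naming shown in the usage comments (output written alongside the source, one XML file next to the directory) can be sketched with the standard library alone. This is an illustration, not the crate's code: `reassembled_file` and `disassembled_dir` are hypothetical helpers, and the rule "keep everything before the first dot" for the output directory is an assumption inferred from the `HR_Admin.permissionset-meta.xml` example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: where `reassemble <dir> [extension]` would write its file.
/// The extension defaults to "xml", matching the documented default.
fn reassembled_file(dir: &Path, extension: Option<&str>) -> Option<PathBuf> {
    let name = dir.file_name()?.to_str()?;
    let ext = extension.unwrap_or("xml");
    // Sibling of the directory: same parent, "<dir name>.<extension>".
    Some(dir.with_file_name(format!("{}.{}", name, ext)))
}

/// Hypothetical helper: directory created by `disassemble <file>`.
/// Assumption: the directory name is the file name up to the first dot.
fn disassembled_dir(file: &Path) -> Option<PathBuf> {
    let name = file.file_name()?.to_str()?;
    let base = name.split('.').next()?;
    Some(file.with_file_name(base))
}

fn main() {
    let dir = Path::new("fixtures/general/HR_Admin");
    println!("{:?}", reassembled_file(dir, None));
    println!("{:?}", reassembled_file(dir, Some("permissionset-meta.xml")));

    let file = Path::new("fixtures/general/HR_Admin.permissionset-meta.xml");
    println!("{:?}", disassembled_dir(file));
}
```

Passing the full suffix (`permissionset-meta.xml`) rather than just `xml` is what restores the original file name on reassembly.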
Examples:
xml-disassembler disassemble fixtures/general/HR_Admin.permissionset-meta.xml
# Creates fixtures/general/HR_Admin/ with disassembled files
xml-disassembler disassemble ./my.xml --format yaml --strategy grouped-by-tag --prepurge
xml-disassembler disassemble ./my.xml --unique-id-elements "name,id" --postpurge
xml-disassembler reassemble fixtures/general/HR_Admin
# Creates fixtures/general/HR_Admin.xml
xml-disassembler reassemble fixtures/general/HR_Admin permissionset-meta.xml --postpurge

Library usage:

use xml_disassembler::{DisassembleXmlFileHandler, ReassembleXmlFileHandler, parse_xml, build_xml_string};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // Disassemble
    let mut disassemble = DisassembleXmlFileHandler::new();
    disassemble
        .disassemble("path/to/file.xml", None, Some("unique-id"), false, false, ".xmldisassemblerignore", "xml")
        .await?;

    // Reassemble
    let reassemble = ReassembleXmlFileHandler::new();
    reassemble.reassemble("path/to/disassembled_dir", Some("xml"), false).await?;

    // Parse and rebuild a single file
    if let Some(parsed) = parse_xml("path/to/file.xml").await {
        let xml = build_xml_string(&parsed);
        println!("{}", xml);
    }
    Ok(())
}

With the unique-id strategy (the default), each nested element is written to its own file, named by a unique identifier (or an 8-character SHA-256 hash if no UID is available). Leaf content stays in a file named after the original XML.
Best for fine-grained diffs and version control.
- UID-based layout – When you provide `--unique-id-elements` (e.g. `name,id,apexClass`), nested elements are named by the first matching field value. For Salesforce flows, a typical list might be: `apexClass,name,object,field,layout,actionName,targetReference,assignToReference,choiceText,promptText`. Using unique ID elements also ensures predictable sorting in the reassembled output.
- Hash-based layout – When no unique ID is found, elements are named with an 8-character hash of their content (e.g. `419e0199.botMlDomain-meta.xml`).
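The two layouts above amount to "use the first unique-ID field that has a value, otherwise fall back to a content hash". The sketch below illustrates that decision under stated assumptions: `file_stem` is a hypothetical function, candidates are assumed to be tried in the order given to `--unique-id-elements`, and the standard library's `DefaultHasher` stands in for SHA-256 so the example needs no external crate (the real hash values will differ).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: choose a file stem for one nested element.
/// `fields` holds the element's direct child (name, text) pairs;
/// `raw_content` is the element's serialized content.
fn file_stem(fields: &[(&str, &str)], unique_id_elements: &[&str], raw_content: &str) -> String {
    // Assumption: unique-ID candidates are tried in the order they were listed.
    for candidate in unique_id_elements {
        for (name, value) in fields {
            if name == candidate && !value.is_empty() {
                return (*value).to_string();
            }
        }
    }
    // Fallback: first 8 hex characters of a content hash.
    // DefaultHasher is a stand-in here, NOT the SHA-256 the tool uses.
    let mut hasher = DefaultHasher::new();
    raw_content.hash(&mut hasher);
    format!("{:016x}", hasher.finish())[..8].to_string()
}

fn main() {
    let fields = [("label", "HR Admin"), ("name", "HR_Admin")];
    // "apexClass" is absent, so "name" supplies the stem.
    println!("{}", file_stem(&fields, &["apexClass", "name"], "<x/>"));
    // No candidate matches, so the stem is an 8-character hash.
    println!("{}", file_stem(&fields, &["id"], "<x/>"));
}
```

Because the hash depends only on content, an unchanged element keeps its file name between runs, while any edit to a hash-named element renames its file. That is one reason to supply unique-ID elements where possible.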
With the grouped-by-tag strategy, all nested elements with the same tag go into one file per tag. Leaf content stays in the base file named after the original XML.
Best for fewer files and quick inspection.
xml-disassembler disassemble ./my.xml --strategy grouped-by-tag --format yaml

Reassembly preserves element content and structure; element order may differ (especially with TOML).
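To see why grouping by tag can change element order, consider this minimal sketch. It is not the crate's implementation: `group_by_tag` is a hypothetical function and each `BTreeMap` bucket stands in for one output file.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: bucket (tag, serialized element) pairs by tag,
/// with each bucket standing in for "one file per tag".
fn group_by_tag<'a>(elements: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut groups: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for (tag, body) in elements {
        groups.entry(*tag).or_default().push(*body);
    }
    groups
}

fn main() {
    // Source order interleaves two tags.
    let elements = [
        ("tabSettings", "Accounts"),
        ("fieldPermissions", "Account.Name"),
        ("tabSettings", "Contacts"),
    ];
    for (tag, bodies) in group_by_tag(&elements) {
        // Order inside a bucket is kept; the interleaving across tags is not.
        println!("{tag}: {bodies:?}");
    }
}
```

In this sketch, elements keep their relative order within a tag, but the original interleaving between different tags is not recorded, so a rebuild from the buckets emits one tag's elements together.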
Exclude files or directories from disassembly using an ignore file (default: .xmldisassemblerignore). The Rust implementation uses the ignore crate with .gitignore-style syntax.
Place the file in the directory you run disassembly from (or specify a path with --ignore-path).
Example .xmldisassemblerignore:
# Skip these paths
**/secret.xml
**/generated/
Logging uses the log crate with env_logger. Control verbosity via the RUST_LOG environment variable.
# Default: only errors
xml-disassembler disassemble ./my.xml
# Verbose logging (debug level)
RUST_LOG=debug xml-disassembler disassemble ./my.xml
# Log only xml_disassembler crate
RUST_LOG=xml_disassembler=debug xml-disassembler disassemble ./my.xml

When using the library, call env_logger::init() early in your binary (as in the CLI) and set RUST_LOG as needed.
Parsing is done with quick-xml, with support for:
- CDATA – Preserved and output as `#cdata` in the parsed structure.
- Comments – Preserved in the XML output.
- Attributes – Stored with `@` prefix (e.g. `@version`, `@encoding`).
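As a rough picture of the key conventions listed above, the sketch below builds a small map with `@`-prefixed attribute keys and a `#cdata` key. The `Value` enum and `element_to_object` function are hypothetical and exist only for this illustration; the crate's actual parsed type is not shown in this document and may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed XML value (not the crate's real type).
#[derive(Debug, PartialEq)]
enum Value {
    Text(String),
    Object(BTreeMap<String, Value>),
}

/// Hypothetical helper: store attributes under "@name" keys and CDATA under "#cdata".
fn element_to_object(attributes: &[(&str, &str)], cdata: Option<&str>) -> Value {
    let mut map = BTreeMap::new();
    for (name, value) in attributes {
        map.insert(format!("@{}", name), Value::Text((*value).to_string()));
    }
    if let Some(text) = cdata {
        map.insert("#cdata".to_string(), Value::Text(text.to_string()));
    }
    Value::Object(map)
}

fn main() {
    // Attributes as they might appear on an XML declaration.
    let declaration = element_to_object(&[("version", "1.0"), ("encoding", "UTF-8")], None);
    println!("{:?}", declaration);

    // An element whose body is a CDATA section.
    let script = element_to_object(&[], Some("if (a < b) { run(); }"));
    println!("{:?}", script);
}
```

Keeping attributes and CDATA under distinct key prefixes lets a builder tell them apart from child elements when turning the structure back into XML.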
Run all tests:
cargo test

- Unit tests – In-module tests for parsers, builders, and merge logic (e.g. `strip_whitespace`, `merge_xml_elements`, `extract_root_attributes`, `parse_xml`).
- Integration test – `tests/disassemble_reassemble.rs` runs a full round-trip: disassemble a fixture XML, reassemble it, and assert the reassembled content equals the original file.

Licensed under MIT.
See CONTRIBUTING.md.