isoverse/IsofileExtractor

isoextract

A self-contained command-line tool for extracting data from stable isotope ratio mass spectrometry (IRMS) binary data files. Supports multiple vendor software formats. Each input file is parsed and the extracted data is written to a JSON output file in the same folder.

Supported file formats

| Extension | Measurement type | Produced by | Format docs |
| --- | --- | --- | --- |
| .dxf | Continuous flow | Thermo Fisher Isodat | isodat_structure.md |
| .cf | Continuous flow (legacy) | Thermo Fisher Isodat | isodat_structure.md |
| .bch | Continuous flow | SerCon Callisto | bch_structure.md |
| .iarc | Continuous flow / dual inlet | Elementar IonOS (v2/v3) | iarc_larc_structure.md |
| .larc | Continuous flow / dual inlet | Elementar LyticOS (v4) | iarc_larc_structure.md |
| .imexp | Continuous flow / dual inlet | Thermo Fisher Qtegra | imexp_structure.md |
| .did | Dual inlet | Thermo Fisher Isodat | isodat_structure.md |
| .caf | Dual inlet (legacy) | Thermo Fisher Isodat | isodat_structure.md |
| .scn | Scan | Thermo Fisher Isodat | isodat_structure.md |

.imexp files also need the isosolfs helper — a separate, closed-source binary that unpacks the proprietary SolFS container. It is not bundled with isoextract; download the isosolfs-<rid> matching your platform from the isosolfs release and place it next to the isoextract binary (or point to it with --isosolfs-path). All other formats work with isoextract alone. There is no isosolfs build for linux-arm64.

Usage

isoextract <file|dir> [...] [options]

One or more files or directories can be provided. Directories are searched recursively for files with supported extensions. Files are processed in parallel.

Options

| Option | Description |
| --- | --- |
| --version, -v | Print the version and exit |
| --prettyJSON | Pretty-print JSON output (number arrays are kept on one line) |
| --log [path] | Write a CSV summary of all processed files. Defaults to isoextract.log in the current directory; an explicit path can be provided: --log results/run.log |
| --file-list <path> | Read additional file/directory paths from a text file (one per line; lines starting with # are ignored) |
| --dry-run | Parse files without writing any JSON or issues log output. The --log CSV is still written normally, making --dry-run --log the recommended way to evaluate a batch before committing to a full run |

Recommended workflow: run with --dry-run --log first to parse all files and capture any warnings or errors in the log, without touching the output .json or .issues.log files. Once satisfied, re-run without --dry-run to write the final output.

Advanced options

| Option | Description |
| --- | --- |
| --unabridged | Include verbose fields normally omitted: schema version numbers, app IDs, raw flags, etc. |
| --isosolfs-path <dir> | (.imexp only) Directory holding the isosolfs helper. isoextract uses the binary matching its own architecture (isosolfs-<rid>, e.g. isosolfs-osx-x64). Defaults to isoextract's own directory; the helper is a separate download, so put isosolfs-<rid> there or pass this option |
| --full | (.imexp only) Unpack the entire notebook with isosolfs instead of only the files isoextract reads. Has no effect on the JSON output; useful with --keep-extracted to inspect the full notebook contents |
| --keep-extracted | (.imexp only) Keep the folder isosolfs unpacks next to the notebook instead of deleting it after reading |
| --objects | (Isodat only) Write a .objects.csv output file for each input file, listing every deserialized C++ object with its byte offset, class name, schema version, and parent–child relationships |
| --tree | (Isodat only) Write a .tree.txt output file for each input file showing the object hierarchy as an indented tree |

Exit code

The exit code is 0 if all files were processed without errors, and 1 if any file failed or was not found.

Output files

For each input file foo.dxf the following files are written:

| File | Always? | Description |
| --- | --- | --- |
| foo.dxf.json | yes | Extracted data |
| foo.dxf.issues.log | only on warnings/errors | Plain-text list of warnings and the error message (if any) |
| foo.dxf.objects.csv | with --objects (Isodat only) | Per-object log (offset, class, version, hierarchy) |
| foo.dxf.tree.txt | with --tree (Isodat only) | Indented object tree |
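For scripts that post-process a batch, it can help to compute where the outputs for a given input should land. The following is a minimal Python sketch of the naming rule shown in the table above; the function name expected_outputs and its keyword arguments are hypothetical and not part of isoextract.

```python
from pathlib import Path


def expected_outputs(input_file, objects=False, tree=False):
    """Return the paths isoextract is documented to write next to input_file.

    Hypothetical helper: it only mirrors the naming rule from the README
    (full input name plus a suffix) and does not call isoextract.
    """
    src = Path(input_file)

    def sibling(suffix):
        # Outputs keep the full input name, e.g. foo.dxf -> foo.dxf.json
        return src.with_name(src.name + suffix)

    outputs = {
        "json": sibling(".json"),
        # Only written when the file produced warnings or an error
        "issues": sibling(".issues.log"),
    }
    if objects:  # corresponds to --objects (Isodat formats only)
        outputs["objects"] = sibling(".objects.csv")
    if tree:  # corresponds to --tree (Isodat formats only)
        outputs["tree"] = sibling(".tree.txt")
    return outputs
```

Keep in mind that with --dry-run neither the .json nor the .issues.log file is written, so a check for their existence only makes sense after a full run.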

JSON structure

Every output file has a meta block at the top:

{
  "meta": {
    "isoextract_version": "0.1.0.0",
    "file_type": "dxf",
    "file_size_bytes": 123456,
    "complete": true
  },
  ...
}

complete: false means parsing stopped early due to an error; the rest of the JSON contains whatever was extracted before the failure.
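Downstream code will usually want to check the complete flag before trusting the rest of the file. A minimal Python sketch follows, assuming only the meta fields shown above; the function names load_output and check_meta are hypothetical.

```python
import json


def load_output(json_path):
    """Read one isoextract JSON output file into a dict."""
    with open(json_path, encoding="utf-8") as fh:
        return json.load(fh)


def check_meta(doc):
    """Return (ok, message) based on the documented meta block.

    Anything other than an explicit `complete: true` is treated as a
    partial result, which is a cautious assumption made for this sketch.
    """
    meta = doc.get("meta")
    if not isinstance(meta, dict):
        return False, "no meta block found"
    file_type = meta.get("file_type", "?")
    if meta.get("complete") is not True:
        return False, f"{file_type} file parsed only partially"
    return True, f"{file_type} file parsed completely"


# Usage (the path is illustrative):
# ok, message = check_meta(load_output("foo.dxf.json"))
```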

Log file format

The CSV written by --log has one row per file:

file,success,duration_ms,error
"data/example.dxf",true,134,
"data/broken.dxf",false,12,"No reader registered for class 'CUnknown'"
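After a --dry-run --log pass, the log can be filtered for failures with a few lines of Python. This is a sketch that assumes the four columns shown above and standard CSV quoting; the function name failed_files is hypothetical.

```python
import csv
import io


def failed_files(log_text):
    """Return (file, error) pairs for every row whose success column is not true.

    Assumes the header row `file,success,duration_ms,error` shown in the
    README and that quoted fields follow ordinary CSV rules.
    """
    reader = csv.DictReader(io.StringIO(log_text))
    return [
        (row["file"], row["error"])
        for row in reader
        if row["success"].strip().lower() != "true"
    ]


# Usage (isoextract.log is the documented default log name):
# from pathlib import Path
# for name, error in failed_files(Path("isoextract.log").read_text(encoding="utf-8")):
#     print(f"{name}: {error}")
```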

Examples

Process a single file:

isoextract sample.dxf

Process all files in a directory tree, pretty-printing the JSON:

isoextract /data/irms --prettyJSON

Process a directory and write a log:

isoextract /data/irms --log run.log

Process a hand-picked list:

isoextract --file-list batch.txt
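A file list such as batch.txt can be generated from a script. The sketch below follows the documented format (one path per line, lines starting with # ignored); the function name format_file_list and the example paths are hypothetical.

```python
def format_file_list(paths, comment=None):
    """Build the text of a --file-list file: one path per line.

    An optional comment is written as leading `#` lines, which isoextract
    is documented to ignore. A path that itself starts with `#` would be
    read as a comment, so such paths are rejected here.
    """
    lines = []
    if comment:
        lines.extend(f"# {line}" for line in comment.splitlines())
    for path in paths:
        text = str(path)
        if text.startswith("#"):
            raise ValueError(f"path would be read as a comment: {text}")
        lines.append(text)
    return "\n".join(lines) + "\n"


# Usage (paths are illustrative):
# from pathlib import Path
# Path("batch.txt").write_text(
#     format_file_list(["runs/a.dxf", "runs/b.did", "archive/2023"],
#                      comment="hand-picked batch"),
#     encoding="utf-8",
# )
```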

Full diagnostic output (objects + tree + log):

isoextract /data/irms --objects --tree --prettyJSON --log

Building from source

Requires the .NET 8 SDK.

# Development build (produces bin/release/isoextract.dll, run with dotnet)
make build

# Self-contained, single-file binaries for every runtime via Docker, into
# dist/isoextract-<rid>[.exe] (linux-x64, linux-arm64, osx-x64, osx-arm64, win-x64, win-arm64)
make build-all

# ...or just the current OS/arch
make build-docker

make build-all produces the isoextract binaries only — the isosolfs helper is released separately (see the .imexp note) and is not bundled.

Live-reload during development (rebuilds and reruns on every save):

make dev

Run the test suite (tests/data against the bundled fixtures; .imexp tests use the local isosolfs helper in assets/isosolfs/):

make test

License

isoextract is released under the MIT License — see LICENSE.

.imexp extraction relies on a separate, proprietary isosolfs helper (built on Callback Technologies' CBFS Vault). It is distributed in its own release and is not covered by the MIT License; see THIRD-PARTY-NOTICES.

isoverse

This program is part of the isoverse suite of data tools for stable isotopes. If you like the functionality that isoverse packages provide, please help us spread the word and include an isoverse or individual package logo on one of your posters or slides. All logos are posted in high resolution in this repository. If you have suggestions for new features or other constructive feedback, please let us know on this short feedback form.

Funding

This project is supported by a grant from the US National Science Foundation (EAR-2411458) to Sebastian Kopf.
