MoSafi2/mojo_bindgen

mojo-bindgen

Warning

Alpha stage: this project is under heavy development and may change quickly.

C headers -> Mojo FFI. mojo-bindgen parses real C with libclang and emits Mojo bindings for external_call or owned_dl_handle workflows. This mirrors the spirit of rust-bindgen, which follows the same approach for Rust.

The goal is simple: make binding generation easy and as faithful as possible to the actual C surface, and fail conservatively when a declaration cannot be modeled correctly.

Requirements

  • Python 3.14+
  • a system libclang compatible with the libclang Python wheel
  • a Mojo (nightly) toolchain if you want to build or run the generated bindings

Installation

System dependencies

Install Clang and the shared libclang library first:

# Ubuntu / Debian
sudo apt update && sudo apt install -y clang libclang1

# Fedora
sudo dnf install -y clang llvm-libs

# macOS (Homebrew)
brew install llvm

If the shared library is not on the default loader path, set LIBCLANG_PATH to the directory containing libclang.so or libclang.dylib.
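
If you are unsure which directory to export, a short search over likely library locations can help. The following is a minimal sketch and not part of mojo-bindgen: the function name `find_libclang_dir` and the candidate directory list are assumptions for illustration, and real install locations vary by distribution and LLVM version.

```python
import glob
import os

# Hypothetical candidate directories; adjust these for your system.
CANDIDATE_DIRS = [
    "/usr/lib",
    "/usr/lib64",
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib/llvm-*/lib",
    "/opt/homebrew/opt/llvm/lib",
    "/usr/local/opt/llvm/lib",
]


def find_libclang_dir(dir_patterns=CANDIDATE_DIRS):
    """Return the first directory holding libclang.so* or libclang.dylib, else None."""
    for pattern in dir_patterns:
        # Expand wildcards such as llvm-* into concrete directories.
        for directory in sorted(glob.glob(pattern)):
            names = ("libclang.so*", "libclang.dylib")
            if any(glob.glob(os.path.join(directory, name)) for name in names):
                return directory
    return None


if __name__ == "__main__":
    found = find_libclang_dir()
    if found:
        print(f"export LIBCLANG_PATH={found}")
    else:
        print("libclang not found in the candidate directories")
```

The script only prints a suggested export line; it does not modify the environment.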

Install from PyPI

pip install mojo-bindgen

PyPI package: mojo-bindgen

Install from source

git clone https://github.com/MoSafi2/mojo_bindgen
cd mojo_bindgen
pip install -e .

For development setup, checks, and Pixi workflows, see CONTRIBUTING.md.

Quick start

Generate bindings from a primary header:

mojo-bindgen path/to/header.h -o bindings.mojo --link-mode external-call

Pass common Clang inputs with structured flags:

mojo-bindgen include/mylib.h \
  -I ./include \
  -D MYLIB_FEATURE=1 \
  --std c11 \
  -o mylib_bindings.mojo

Add sibling public headers with repeated --public-header. The primary header and public headers are included in an internal umbrella header. Transitive #include files are parsed by Clang and can still contribute declarations because source-file filtering is not applied yet:

mojo-bindgen include/mylib.h \
  --public-header include/mylib_extra.h \
  -I ./include \
  -o mylib_bindings.mojo

By default the parser uses -std=gnu11 when no C standard is provided. Pin a standard explicitly if your header requires one:

mojo-bindgen include/mylib.h --std c99 -o mylib_bindings.mojo

Use --clang-arg for raw Clang flags that do not have a structured option:

mojo-bindgen include/mylib.h --clang-arg=-fms-extensions -o mylib_bindings.mojo
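
When the generator runs from a build script, it can be convenient to assemble the command line programmatically. The sketch below uses only the flags shown in the commands above (`-I`, `-D`, `--std`, `--public-header`, `--clang-arg`, `-o`); the helper name `build_command` and its parameters are assumptions for illustration, not an API provided by mojo-bindgen.

```python
import shlex


def build_command(header, output, include_dirs=(), defines=(),
                  public_headers=(), std=None, clang_args=()):
    """Assemble a mojo-bindgen argv list from build settings."""
    cmd = ["mojo-bindgen", header]
    for path in public_headers:
        cmd += ["--public-header", path]
    for path in include_dirs:
        cmd += ["-I", path]
    for define in defines:
        cmd += ["-D", define]
    if std is not None:
        cmd += ["--std", std]
    for arg in clang_args:
        # The = form keeps raw flags that start with "-" attached to the option.
        cmd.append(f"--clang-arg={arg}")
    cmd += ["-o", output]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "include/mylib.h",
        "mylib_bindings.mojo",
        include_dirs=["./include"],
        defines=["MYLIB_FEATURE=1"],
        std="c11",
    )
    # Print the shell-quoted command; pass `cmd` to subprocess.run(cmd, check=True)
    # to actually invoke the generator (requires mojo-bindgen on PATH).
    print(shlex.join(cmd))
```

Building an argv list rather than a single string avoids shell quoting problems with defines that contain spaces or quotes.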

Linking modes

mojo-bindgen supports two output styles:

  • external_call Direct FFI wrappers. Use this when the target library is linked at Mojo build time.
  • owned_dl_handle Dynamic runtime symbol lookup via OwnedDLHandle for loading a shared library (.so, .dylib).

Examples:

# default
mojo-bindgen include/mylib.h --link-mode external-call -o mylib_bindings.mojo

# runtime-loaded shared library
mojo-bindgen include/mylib.h \
  --link-mode owned-dl-handle \
  --library-path /usr/lib/libmylib.so \
  -o mylib_bindings_dl.mojo

What works today?

mojo-bindgen is still alpha and evolves quickly, but it already supports a useful slice of real C headers and is practical today as a starting point for generating bindings.

Current support includes:

  • Parsing and mapping: real C parsing through libclang, repeatable --clang-arg support, and a structured IR pipeline rather than text-only generation.
  • Primitive types: scalar types, typedef chains, pointers with const-aware mutability, fixed arrays, incomplete-array decay cases, complex values, vector extension types, and representable atomics.
  • Mojo-native numeric mapping: vector types map to SIMD[...], complex values map to ComplexSIMD[...], and representable atomics map to Atomic[...].
  • Records: structs, anonymous members, mixed layouts that combine plain fields and bitfields, synthesized padding, and custom alignment emission where Mojo can represent the layout faithfully.
  • Bitfields: bitfields are emitted through explicit storage fields plus synthesized getter and setter methods.
  • Unions: eligible unions map to UnsafeUnion[...]; unions that cannot be represented safely fall back to opaque InlineArray[...] storage with diagnostics to preserve layout.
  • Opaque and difficult layouts: incomplete records, packed layouts, and alignment-sensitive record storages are preserved conservatively as opaque byte storage when a faithful typed layout is not possible.
  • Callbacks and function pointers: callback typedefs, function-pointer fields, and function-pointer parameters and returns are preserved in Mojo via emitted comptime callback declarations and synthesized aliases when needed.
  • Functions: thin wrappers are generated for non-variadic functions under both external_call and owned_dl_handle link modes.
  • Globals and constants: because Mojo does not currently expose native C globals directly, supported globals map through generated GlobalVar / GlobalConst helper structs with synthesized load() / store() methods; constants and supported object-like macros map to comptime declarations.
  • Macros: integer, float, string, and char literal macros, foldable macro chains, supported casts, and sizeof(type) expressions are emitted as Mojo code.
  • Debug IR output: the CLI has hidden maintainer flags for serialized parser and Mojo IR sidecars.
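
The bitfield strategy in the list above (an explicit storage field plus synthesized getters and setters) can be illustrated with a small model. This is a conceptual sketch in Python, not the Mojo code that mojo-bindgen emits: the class `PackedFlags`, its field names, and the assumption that the first bitfield occupies the lowest bits (as on common little-endian ABIs) are all chosen for illustration.

```python
class PackedFlags:
    """Models `struct { unsigned mode : 3; unsigned level : 5; }` as one storage word."""

    MODE_SHIFT, MODE_WIDTH = 0, 3
    LEVEL_SHIFT, LEVEL_WIDTH = 3, 5

    def __init__(self):
        self._storage = 0  # explicit storage field shared by all bitfields

    def _get(self, shift, width):
        mask = (1 << width) - 1
        return (self._storage >> shift) & mask

    def _set(self, shift, width, value):
        mask = (1 << width) - 1
        # Clear the field's bits, then write the truncated value in place.
        cleared = self._storage & ~(mask << shift)
        self._storage = cleared | ((value & mask) << shift)

    @property
    def mode(self):
        return self._get(self.MODE_SHIFT, self.MODE_WIDTH)

    @mode.setter
    def mode(self, value):
        self._set(self.MODE_SHIFT, self.MODE_WIDTH, value)

    @property
    def level(self):
        return self._get(self.LEVEL_SHIFT, self.LEVEL_WIDTH)

    @level.setter
    def level(self, value):
        self._set(self.LEVEL_SHIFT, self.LEVEL_WIDTH, value)


if __name__ == "__main__":
    flags = PackedFlags()
    flags.mode = 5
    flags.level = 17
    print(flags.mode, flags.level, bin(flags._storage))
```

Writing one field leaves its neighbors untouched, and values wider than the field are truncated, which matches how unsigned C bitfields behave.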

Current limitations

Known gaps you may still hit in generated code. For ABI-sensitive surfaces, verify emitted layouts and symbols against your target toolchain.

  • Macros: function-like macros, predefined macros, and more complex preprocessor behavior are preserved but usually emitted as comments for end-user review.
  • Variadics: variadic C functions are not wrapped as callable thin-FFI bindings yet and are emitted as comment stubs.
  • Non-prototype / K&R-style functions: older C declaration styles are only partially modeled and should be treated with caution.
  • Records with hostile layouts: some packed, ABI-sensitive, or otherwise difficult record layouts cannot be emitted as fully typed Mojo structs and fall back to opaque storage; layout-sensitive declarations may still require manual verification.
  • Anonymous members: anonymous struct and union members are preserved structurally, but they are not automatically promoted into a flattened parent record surface.
  • Atomics: atomic support is conservative. Representable atomic fields and pointer-based usage work, but atomic globals are still emitted as stubs and some surfaces require manual handling.
  • Linkage and compiler edge cases: inline, compiler-specific linkage hints, and other extension-heavy cases can still require manual review and may lead to symbol mismatches at runtime.
  • Public-header model: the primary header and any --public-header values are included in an internal umbrella header. Transitive #include files are also visible to Clang and may contribute declarations because source-file filtering is not applied yet.
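
For the layout verification recommended above, one independent reference point is Python's ctypes, which computes size, alignment, and field offsets using the platform's C rules. The sketch below mirrors a hypothetical C struct (`struct sample`, invented for illustration) and reports its layout so the numbers can be compared by hand with the generated Mojo struct. It is not a feature of mojo-bindgen.

```python
import ctypes


class Sample(ctypes.Structure):
    """Mirrors `struct sample { uint8_t tag; uint32_t value; uint16_t flags; };`."""

    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("value", ctypes.c_uint32),
        ("flags", ctypes.c_uint16),
    ]


def describe_layout(struct_type):
    """Return size, alignment, and per-field byte offsets for a ctypes structure."""
    return {
        "size": ctypes.sizeof(struct_type),
        "align": ctypes.alignment(struct_type),
        "offsets": {
            name: getattr(struct_type, name).offset
            for name, *_ in struct_type._fields_
        },
    }


if __name__ == "__main__":
    layout = describe_layout(Sample)
    print(f"size={layout['size']} align={layout['align']}")
    for name, offset in layout["offsets"].items():
        print(f"  {name}: offset {offset}")
```

On common ABIs where `uint32_t` is 4-byte aligned, this reports a size of 12 with offsets 0, 4, and 8, which makes the padding after `tag` and after `flags` visible.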

Real-world examples

The repository includes worked examples and smoke programs.

These examples do more than generate bindings: their generate.sh scripts also build smoke artifacts or run small functional tests to check the usability of the generated bindings. They pass --layout-tests explicitly when they need layout-test sidecars.

The test suite also has end-to-end runtime coverage for:

  • by-value records and enums
  • callbacks and function-pointer returns
  • globals and constants
  • vectors and complex values
  • atomic pointer-based APIs
  • opaque forward declarations
  • pointer-to-array and array-decay cases
  • both external_call and owned_dl_handle link modes

See tests/e2e/README.md for the current runtime case matrix.

Troubleshooting

The generated module is empty or missing declarations

mojo-bindgen parses the primary header plus any headers listed with --public-header through an internal umbrella header. If a thin wrapper only includes another header whose declarations you care about, pass that included header with --public-header or use it as the primary header directly. Normal transitive #include files are also visible to Clang and may appear in output.

Parsing fails on project headers

Most parser failures are missing include paths, target flags, or defines. Add the same flags your C build uses with -I / --include, -D / --define, -U / --undefine, --target, --sysroot, --std, or repeated --clang-arg.
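
If your C build produces a Clang JSON compilation database (`compile_commands.json`), the relevant flags can be recovered from it instead of being copied by hand. The following is a minimal sketch under stated assumptions: the helper name `extract_flags` is hypothetical, only `-I`, `-D`, `-U`, and `-std=` are handled, and relative include paths are left as written (they are relative to the entry's `directory` field).

```python
import json
import shlex


def extract_flags(entry):
    """Pull include, define, undefine, and -std flags from one database entry."""
    # An entry carries either an "arguments" list or a single "command" string.
    args = entry.get("arguments") or shlex.split(entry.get("command", ""))
    flags = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ("-I", "-D", "-U") and i + 1 < len(args):
            # Separated form: the value is the next argument.
            flags += [arg, args[i + 1]]
            i += 2
            continue
        if arg[:2] in ("-I", "-D", "-U") and len(arg) > 2:
            # Joined form such as -Iinclude or -DNAME=1.
            flags += [arg[:2], arg[2:]]
        elif arg.startswith("-std="):
            flags += ["--std", arg[len("-std="):]]
        i += 1
    return flags


if __name__ == "__main__":
    sample = json.loads(
        '{"directory": "/src/mylib",'
        ' "file": "src/mylib.c",'
        ' "command": "cc -Iinclude -DMYLIB_FEATURE=1 -std=c11 -c src/mylib.c"}'
    )
    print(shlex.join(["mojo-bindgen", "include/mylib.h", *extract_flags(sample)]))
```

Flags outside this small set, such as `--target` or `--sysroot`, would still need to be passed explicitly.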

Debugging parser failures

Print the normalized Clang arguments that mojo-bindgen will use:

mojo-bindgen include/mylib.h -I ./include --print-clang-args

Write diagnostics as JSON while still generating normal output:

mojo-bindgen include/mylib.h \
  --diagnostics json \
  --diagnostics-output diagnostics.json \
  -o mylib_bindings.mojo

Dump the preprocessed input that Clang sees:

mojo-bindgen include/mylib.h \
  -I ./include \
  --dump-preprocessed mylib.preprocessed.c \
  -o mylib_bindings.mojo

Build succeeds but symbols are missing at runtime

Double-check:

  • --library and --link-name
  • your Mojo link flags for external_call
  • your --library-path for owned_dl_handle
  • whether the original C declaration involves inline linkage or an exotic layout that needs manual review
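
To confirm that a shared library actually exports a symbol before debugging the bindings themselves, a quick lookup through Python's ctypes can serve as a sanity check. This is a minimal sketch for POSIX systems, independent of mojo-bindgen; the helper name `exports_symbol` is an assumption for illustration.

```python
import ctypes


def exports_symbol(library_path, symbol):
    """Return True if the shared library at library_path exports symbol.

    Passing None as library_path searches the symbols already loaded into
    the current process (POSIX only).
    """
    lib = ctypes.CDLL(library_path)  # raises OSError if the library cannot be loaded
    try:
        getattr(lib, symbol)  # performs a dlsym-style lookup by name
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    # strlen comes from the C library that the Python process already links.
    print(exports_symbol(None, "strlen"))
```

For a real library, call it as `exports_symbol("/usr/lib/libmylib.so", "mylib_init")`, where `mylib_init` stands in for whichever function name appears in your header. A False result for a function declared there often points to an inline definition, a macro alias, or a differing link name.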

License

Licensed under the MIT License. See LICENSE.


Contributing: CONTRIBUTING.md.
