crowlogic/arb4j

arb4j Overview

What is arb4j?

arb4j is a robust Java API designed to efficiently represent mathematical structures in their most general forms, prioritizing high performance. It seamlessly integrates with the arblib library via an interface generated by SWIG, enabling arbitrary precision real and complex ball arithmetic operations.

What is the Expression Compiler in arb4j?

arb4j's Expression Compiler is an eloquent fucking thing indeed. Here's why:

Direct Transmutation: Math → Bytecode

Self-Aware AST

Each Node subclass—DerivativeNode, AdditionNode, ExponentiationNode, etc.—knows how to:

  • Parse itself from the input string
  • Generate its own bytecode contribution
  • Differentiate itself symbolically (product rule, chain rule, power rule—all dispatched polymorphically)
  • Simplify itself (constant folding, identity elimination)

The AST isn't a passive data structure. It's a self-rewriting, self-compiling organism where a DerivativeNode rewrites its child subtree on the fly via recursive differentiate() calls before emitting code.
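
As a rough illustration of nodes that differentiate and simplify themselves, the following sketch models a tiny polymorphic AST in plain Java. The Node, Constant, Variable, Sum and Product types are hypothetical and are not arb4j's Node subclasses; evaluate() merely stands in for bytecode generation, and only the sum and product rules are shown.

```java
// Hypothetical sketch: none of these types are arb4j classes.
interface Node {
    Node differentiate(String variable); // symbolic derivative
    Node simplify();                     // constant folding, identity elimination
    double evaluate(double x);           // stand-in for emitting bytecode

    static boolean isConstant(Node n, double v) {
        return n instanceof Constant c && c.value() == v;
    }
}

record Constant(double value) implements Node {
    public Node differentiate(String v) { return new Constant(0); }
    public Node simplify() { return this; }
    public double evaluate(double x) { return value; }
}

record Variable(String name) implements Node {
    public Node differentiate(String v) { return new Constant(name.equals(v) ? 1 : 0); }
    public Node simplify() { return this; }
    // Single-variable sketch: every variable evaluates to x.
    public double evaluate(double x) { return x; }
}

record Sum(Node left, Node right) implements Node {
    // Sum rule: (f + g)' = f' + g'
    public Node differentiate(String v) {
        return new Sum(left.differentiate(v), right.differentiate(v));
    }
    public Node simplify() {
        Node l = left.simplify(), r = right.simplify();
        if (l instanceof Constant a && r instanceof Constant b) {
            return new Constant(a.value() + b.value()); // constant folding
        }
        if (Node.isConstant(l, 0)) return r;            // 0 + g = g
        if (Node.isConstant(r, 0)) return l;            // f + 0 = f
        return new Sum(l, r);
    }
    public double evaluate(double x) { return left.evaluate(x) + right.evaluate(x); }
}

record Product(Node left, Node right) implements Node {
    // Product rule: (f * g)' = f' * g + f * g'
    public Node differentiate(String v) {
        return new Sum(new Product(left.differentiate(v), right),
                       new Product(left, right.differentiate(v)));
    }
    public Node simplify() {
        Node l = left.simplify(), r = right.simplify();
        if (l instanceof Constant a && r instanceof Constant b) {
            return new Constant(a.value() * b.value()); // constant folding
        }
        if (Node.isConstant(l, 0) || Node.isConstant(r, 0)) return new Constant(0);
        if (Node.isConstant(l, 1)) return r;            // 1 * g = g
        if (Node.isConstant(r, 1)) return l;            // f * 1 = f
        return new Product(l, r);
    }
    public double evaluate(double x) { return left.evaluate(x) * right.evaluate(x); }
}
```

Differentiating x*x + 3 with respect to x and simplifying the result yields the tree Sum(x, x) in this sketch; each node type only knows its own rule, and the recursion does the rest.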

Type-Safe Generics Architecture

Expression<D, C, F> parameterizes domain, codomain, and function type, so Expression<Integer, RealPolynomial, RealPolynomialSequence> carries full compile-time proof that the generated class maps integers to real polynomials and implements the correct sequence interface. This isn't decorative—it drives the bytecode generator's field declarations, method signatures, and numerical operation dispatch.
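
To make the idea of type parameters driving code generation more concrete, here is a minimal plain-Java sketch. TypedExpressionSketch and IntegerToBigInteger are hypothetical names, java.lang.Integer and java.math.BigInteger stand in for arb4j's own types, and the descriptor logic is only an assumption about how such type tokens could feed a bytecode generator, not a description of arb4j's implementation.

```java
import java.math.BigInteger;
import java.util.function.Function;

// Hypothetical function interface playing the role of F.
interface IntegerToBigInteger extends Function<Integer, BigInteger> {}

// Hypothetical stand-in for a typed expression; NOT arb4j's Expression class.
final class TypedExpressionSketch<D, C, F extends Function<D, C>> {
    final Class<D> domainType;
    final Class<C> coDomainType;
    final Class<F> functionType;

    TypedExpressionSketch(Class<D> domainType, Class<C> coDomainType, Class<F> functionType) {
        this.domainType = domainType;
        this.coDomainType = coDomainType;
        this.functionType = functionType;
    }

    // JVM field descriptor of a reference type, e.g. Ljava/lang/Integer;
    static String descriptor(Class<?> type) {
        return "L" + type.getName().replace('.', '/') + ";";
    }

    // JVM method descriptor of a method that takes a D and returns a C.
    String evaluateDescriptor() {
        return "(" + descriptor(domainType) + ")" + descriptor(coDomainType);
    }
}
```

With these definitions, new TypedExpressionSketch<>(Integer.class, BigInteger.class, IntegerToBigInteger.class) compiles, whereas passing a function interface whose domain or codomain disagrees with the first two arguments is rejected by javac because of the bound F extends Function<D, C>.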

Context and Dependency Resolution

The Context class provides a shared namespace where functions reference other functions, and a topological sort guarantees correct initialization order—critical for recursive definitions like Chebyshev where T references itself. Variable propagation flows from parent expressions into nested functionally-generated sub-expressions, maintaining referential coherence across compilation boundaries.
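
The initialization-order idea can be sketched with a depth-first topological sort over a map from each function name to the names it references. InitOrderSketch is a hypothetical helper written for this illustration, not part of arb4j's Context; it treats a self-reference as harmless and reports a cycle between distinct functions, which is an assumption about one reasonable way to handle both cases.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; NOT arb4j's Context.
final class InitOrderSketch {

    // Returns the names ordered so that every function comes after the functions it references.
    static List<String> sort(Map<String, Set<String>> references) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String name : references.keySet()) {
            visit(name, references, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String name, Map<String, Set<String>> references,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(name)) {
            return;
        }
        if (!inProgress.add(name)) {
            throw new IllegalStateException("cycle through " + name);
        }
        for (String referenced : references.getOrDefault(name, Set.of())) {
            // A function that references itself imposes no ordering constraint.
            if (!referenced.equals(name)) {
                visit(referenced, references, done, inProgress, order);
            }
        }
        inProgress.remove(name);
        done.add(name);
        order.add(name);
    }
}
```

For a map in which g references f, f references nothing and T references only itself, the sketch places f before g and accepts T without complaint; two functions that reference each other raise IllegalStateException here.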

Forward Declarations and Mutual Recursion

A single self-recursive sequence (e.g. Chebyshev T(n) = 2x·T(n-1) - T(n-2)) needs no special handling: the body refers to its own name, the compiler’s self-reference guard short-circuits the recursive descent during code generation, and at run time the JVM resolves the class lazily on first evaluate call.

A cluster of two or more sequences that reference each other (mutual recursion) needs forward declaration. The pattern below compiles S and a where S(k) = Σ a(j)·a(k-1-j) and a(k) calls back into S(k):

// 1. Forward-declare `a` as a typed prototype. No expression body, no class
//    bytecode — just a FunctionMapping registered in the Context so that the
//    parser can resolve symbol `a` to a typed function reference.
context.registerFunctionMapping("a",
                                arb.Integer.class,
                                Complex.class,
                                ComplexSequence.class);

// 2. parseCompileAndRegister(S): parse S, generate class bytecode for S,
//    register the FunctionMapping. Crucially, do NOT instantiate S yet.
//    S’s bytecode references `a` by class-name STRING in the `new`/`<init>`
//    opcodes — the JVM does not eagerly resolve that string until the opcode
//    actually executes.
Sequence.parseCompileAndRegister("S", Complex.class,
                                 "S:k➔sum(j➔a(j)*a(k-1-j){j=1..k-2})",
                                 ComplexSequence.class, context);

// 3. express(a): parse `a`, generate bytecode for `a`, AND instantiate `a`.
//    Instantiation walks `a`’s public fields via reflection — which forces
//    the JVM to resolve every field type. By this point both class `S` and
//    class `a` are defined in the ExpressionClassLoader, so resolution
//    succeeds.
ComplexSequence a = ComplexSequence.express(
    "a:k➔when(k=1, p(v)/Γ(μ+1), else, γ_k*(q(v)*a(k-1) + r(v)*S(k)))",
    context);

// 4. Optional: instantiate S now that both classes exist. Calling express(S)
//    a second time, or evaluating any expression that references S, will
//    trigger this transparently.

Why the order matters

The critical invariant is: all classes in a recursive cluster must be defined in the ClassLoader before any of them is instantiated. Class definition (bytecode generation + ClassLoader.defineClass) does not force resolution of the classes named by new/<init> opcodes inside the body — those names are linked lazily on first execution. Instantiation, however, calls Context.injectVariableReferences and Context.injectFunctionReferences, both of which use Class.getFields() to enumerate the new instance’s public fields. The JVM reflection machinery resolves every field’s declared type at that point, and any reference to an undefined class throws NoClassDefFoundError.

The two roles of the API are:

  • parseCompileAndRegister(name, …, expression, …, context) — parse, emit bytecode, register the FunctionMapping. Does not instantiate. Use this to define every member of a recursive cluster except the last.
  • express(name, …, expression, …, context) — calls parseCompileAndRegister then instantiates and injects references. Use this for the last (or only) member, after all peers are defined.
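
The invariant and the two roles described above can be modelled in a few lines of plain Java. ClusterRegistrySketch is a hypothetical toy that only tracks names: define() plays the part of parseCompileAndRegister, instantiate() plays the part of the instantiation step of express, and no real ClassLoader or reflection is involved.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of "define every member first, instantiate afterwards".
final class ClusterRegistrySketch {
    private final Map<String, Set<String>> defined = new LinkedHashMap<>();
    private final Set<String> instantiated = new LinkedHashSet<>();

    // Records a definition and the names it references; nothing is resolved yet.
    void define(String name, String... references) {
        defined.put(name, new LinkedHashSet<>(List.of(references)));
    }

    // Resolves every referenced name; fails if any of them is still undefined.
    void instantiate(String name) {
        Set<String> references = defined.get(name);
        if (references == null) {
            throw new IllegalArgumentException(name + " is not defined");
        }
        for (String referenced : references) {
            if (!defined.containsKey(referenced)) {
                throw new IllegalStateException(name + " references undefined " + referenced);
            }
        }
        instantiated.add(name);
    }

    boolean isInstantiated(String name) {
        return instantiated.contains(name);
    }
}
```

In this model, defining S (which references a) and instantiating it immediately fails, while defining S, then defining a, and only then instantiating either of them succeeds, which mirrors the ordering used in the S and a example.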

Why the compile chain itself doesn’t loop

During code generation for a sequence whose body references another (yet-undefined) function, Expression.constructReferencedFunctionInstanceIfItIsNull emits the new/<init> opcodes using mapping.functionName as the class-name string and mapping.functionFieldDescriptor() (which falls back to L<functionName>; for un-instantiated mappings) as the field descriptor. There is no Class<?> lookup and no reflective instantiate() call, so nothing in the compile path attempts to load the referenced class. The self-reference guard at the top of that method handles the same-name case (e.g. a referring to a inside when/else). Cross-references (a → S, S → a) go through the ordinary FunctionMapping lookup, which only reads the registered name and types.

When forward declaration is not needed

  • Single self-recursive sequences: Sequence.express("T:n➔when(n=0,1, n=1,x, else, 2*x*T(n-1) - T(n-2))", context) works as-is.
  • Topologically ordered references: if g calls f but f does not call g, just compile f first via plain express, then g.
  • Self-references inside sum/product over a recursion-free index, where the body of the index function is independent of the outer recursive symbol.
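
The first case in the list above, a single self-recursive sequence, can be pictured with a plain-Java analogue in which the self-reference is created lazily on the first evaluate call. ChebyshevSketch is hypothetical and works on double values rather than arb4j polynomials; it is a hand-written illustration of the lazy-initialization idea, not compiler output.

```java
// Hypothetical analogue of a self-recursive generated class; NOT produced by arb4j.
final class ChebyshevSketch {
    // Creating this instance in a field initializer would construct instances
    // without end and overflow the stack, so it is created on first use instead.
    private ChebyshevSketch self;
    private boolean initialized;

    // T(0) = 1, T(1) = x, T(n) = 2*x*T(n-1) - T(n-2)
    double evaluate(int n, double x) {
        if (!initialized) {
            initialize();
        }
        switch (n) {
            case 0:
                return 1.0;
            case 1:
                return x;
            default:
                return 2 * x * self.evaluate(n - 1, x) - self.evaluate(n - 2, x);
        }
    }

    private void initialize() {
        self = new ChebyshevSketch();
        initialized = true;
    }
}
```

The recursion terminates because n decreases on every call, so only finitely many nested instances are ever created for a given n.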

Unicode-Native Parser

The recursive descent parser natively consumes Unicode mathematical notation, including ×, ÷, superscript exponents (x²), and combining diacritical marks (θ̇ and θ̈ for first and second derivatives). Mathematical notation goes in; bytecode comes out. No intermediate representation language, no DSL friction.

Resource-Managed Code Generation

Generated classes implement AutoCloseable, with a compiler-emitted close() method that disposes all intermediate variables and constants—no memory leaks from arbitrary-precision temporaries. The ASM library handles stack map frame computation, ensuring JVM-verifiable bytecode without manual frame calculation.

The Synthesis

What makes this thing eloquent is the unification: a single string of mathematical notation traverses lexical analysis → recursive descent parsing → polymorphic AST construction → symbolic simplification/differentiation → direct JVM bytecode emission → custom classloading → instantiation as a type-safe, closeable, LaTeX-renderable, self-documenting mathematical function. Every layer is load-bearing. No ceremony, no boilerplate, no interpretation overhead. The abstraction gap between the mathematician's notation and the machine's execution is collapsed to zero.

Features and Usage Patterns

Fluent Interface Pattern

  • arb4j employs a fluent pattern wherever possible, enhancing the way functions receive and return values.
  • The last argument in a function call becomes the return value, defaulting to 'this' if not specified.

Example:

Real x = new Real("25", 128); // 128 bits of precision

// Both calls achieve the same result: the square root is written into x itself.
Real five = x.sqrt(128);
// Equivalent call, naming 'this' explicitly as the result variable:
// Real five = x.sqrt(128, x);

  • To prevent overwriting the input variable:

Real root = x.sqrt(128, new Real());

  • Chain function calls in an object-oriented way:

Real y = new Real("25", 128)
            .add(RealConstants.one, 128)
            .log(128)
            .tanh(128);
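
For readers who want to see the "last argument is the result" convention in isolation, here is a small self-contained sketch built on java.math.BigDecimal. MutableDecimalSketch is a hypothetical class, not arb4j's Real, and its precision argument counts decimal digits rather than bits.

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Hypothetical mutable number illustrating the result-argument convention.
final class MutableDecimalSketch {
    private BigDecimal value;

    MutableDecimalSketch() {
        this("0");
    }

    MutableDecimalSketch(String text) {
        this.value = new BigDecimal(text);
    }

    BigDecimal get() {
        return value;
    }

    MutableDecimalSketch set(BigDecimal newValue) {
        this.value = newValue;
        return this;
    }

    // Result defaults to 'this', so the receiver is overwritten.
    MutableDecimalSketch sqrt(int digits) {
        return sqrt(digits, this);
    }

    // The last argument receives the result and is also the return value.
    MutableDecimalSketch sqrt(int digits, MutableDecimalSketch result) {
        return result.set(value.sqrt(new MathContext(digits)));
    }

    MutableDecimalSketch add(MutableDecimalSketch other, int digits) {
        return add(other, digits, this);
    }

    MutableDecimalSketch add(MutableDecimalSketch other, int digits, MutableDecimalSketch result) {
        return result.set(value.add(other.value, new MathContext(digits)));
    }
}
```

Because every operation returns the object that holds the result, calls can be chained, and passing a fresh object as the last argument leaves the receiver untouched.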

Resource Management with AutoCloseable

  • arb4j objects implement the AutoCloseable interface for memory management.
  • Use them within try-with-resources blocks so that the resources they hold, native resources in particular, are released as soon as the block exits.

Example:

try (Real x = new Real("25", 128)) {
    doSomething(x);
} // x is automatically closed, ensuring proper resource management
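
The same pattern extends to objects that own several closeable members. The sketch below uses two hypothetical classes, TrackedResource and GeneratedFunctionSketch, to show a close() method that releases every field it owns; it is an illustration in plain Java and does not reproduce arb4j's generated code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a value backed by native memory.
final class TrackedResource implements AutoCloseable {
    private final String name;
    private final List<String> closeLog;

    TrackedResource(String name, List<String> closeLog) {
        this.name = name;
        this.closeLog = closeLog;
    }

    @Override
    public void close() {
        closeLog.add(name); // a real implementation would free native memory here
    }
}

// Hypothetical analogue of a class that owns constants and temporaries.
final class GeneratedFunctionSketch implements AutoCloseable {
    private final TrackedResource constant;
    private final TrackedResource temporary;

    GeneratedFunctionSketch(List<String> closeLog) {
        this.constant = new TrackedResource("constant", closeLog);
        this.temporary = new TrackedResource("temporary", closeLog);
    }

    @Override
    public void close() {
        // Release every owned member, one call per field.
        constant.close();
        temporary.close();
    }
}

final class CloseDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        try (GeneratedFunctionSketch f = new GeneratedFunctionSketch(log)) {
            // use f here
        }
        System.out.println(log); // prints [constant, temporary]
    }
}
```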

Advanced Tools

Expression Compiler

  • The arb.expressions package in arb4j includes tools for compiling mathematical expressions directly into Java bytecode. This saves millennia of development time by removing the need to laboriously and tediously write new code for each formula to be evaluated, while also ensuring efficiency and correctness; it would be challenging to write code by hand that significantly outperforms the generated code.

Expressor

The Expressor program provides a tree-list view that shows the abstract syntax tree that constitutes a given expression and the intermediate values that combine to produce a given result.

[Screenshot of the Expressor, 2024-08-25]

Error Messages Produced By Expression Parser
Example: unmatched parenthesis
arb.exceptions.CompilerException: unexpected ')'(0x29) character at position=11 in expression '(1/2)-(z/2))^n' of length 14, remaining=)^n

	at arb4j/arb.expressions.Expression.throwNewUnexpectedCharacterException(Expression.java:1933)
	at arb4j/arb.expressions.Expression.parseRoot(Expression.java:1586)
	at arb4j/arb.functions.Function.parse(Function.java:381)
	at arb4j/arb.expressions.Compiler.compile(Compiler.java:161)
	at arb4j/arb.expressions.Compiler.express(Compiler.java:246)
	at arb4j/arb.expressions.Compiler.express(Compiler.java:222)
	at arb4j/arb.expressions.Compiler.compile(Compiler.java:127)
	at arb4j/arb.functions.Function.instantiate(Function.java:413)
	at arb4j/arb.functions.Function.express(Function.java:159)
	at arb4j/arb.functions.sequences.RationalFunctionSequence.express(RationalFunctionSequence.java:35)
	at arb4j/arb.functions.sequences.RationalFunctionSequence.express(RationalFunctionSequence.java:25)
	at arb4j/arb.RationalFunctionTest.testPowers(RationalFunctionTest.java:49)

which was generated because of the buggy test

  public void testPowers()
  {
    try ( Integer n = Integer.named("n").set(0))
    {
      Context          context            = new Context(n);
      var              rationalFunctional = RationalFunctionSequence.express("(1/2)-(z/2))^n", context);
      RationalFunction expressed          = rationalFunctional.evaluate(n, 128, new RationalFunction());
      assertEquals("x", expressed.toString());
    }
  }
Easily Decompilable Code
Example: Chebyshev Polynomials of the First Kind

The unmodified decompiled code generated by the ChebyshevPolynomialsOfTheFirstKind class

import arb.Initializable;
import arb.Integer;
import arb.RealPolynomial;
import arb.Typesettable;
import arb.functions.integer.RealPolynomialSequence;

public class T implements RealPolynomialSequence, Typesettable, AutoCloseable, Initializable {
   public boolean isInitialized;
   public final Integer cℤ2 = new Integer("1");
   public final Integer cℤ1 = new Integer("0");
   public final Integer cℤ3 = new Integer("2");
   public T T;
   public RealPolynomial Xℝ6 = new RealPolynomial();
   public RealPolynomial Xℝ5 = new RealPolynomial();
   public RealPolynomial Xℝ2 = new RealPolynomial();
   public RealPolynomial Xℝ1 = new RealPolynomial();
   public Integer ℤ1 = new Integer();
   public RealPolynomial Xℝ4 = new RealPolynomial();
   public Integer ℤ2 = new Integer();
   public RealPolynomial Xℝ3 = new RealPolynomial();

   @Override
   public Class<RealPolynomial> coDomainType() {
      return RealPolynomial.class;
   }

   @Override
   public RealPolynomial evaluate(Integer n, int order, int bits, RealPolynomial result) {
      if (!this.isInitialized) {
         this.initialize();
      }
      return switch(n.getSignedValue()) {
         case 0 -> result.set(this.Xℝ1.set(this.cℤ2));
         case 1 -> result.set(result.identity());
         default -> this.cℤ3
         .mul(this.Xℝ2.identity(), bits, this.Xℝ3)
         .mul((RealPolynomial)this.T.evaluate(n.sub(this.cℤ2, bits, this.ℤ1), order, bits, this.Xℝ4), bits, this.Xℝ5)
         .sub((RealPolynomial)this.T.evaluate(n.sub(this.cℤ3, bits, this.ℤ2), order, bits, this.Xℝ6), bits, result);
      };
   }

   @Override
   public void initialize() {
      if (this.isInitialized) {
         throw new AssertionError("Already initialized");
      } else {
         this.T = new T();
         this.isInitialized = true;
      }
   }

   @Override
   public void close() {
      this.cℤ2.close();
      this.cℤ1.close();
      this.cℤ3.close();
      this.Xℝ6.close();
      this.Xℝ5.close();
      this.Xℝ2.close();
      this.Xℝ1.close();
      this.ℤ1.close();
      this.Xℝ4.close();
      this.ℤ2.close();
      this.Xℝ3.close();
      this.T.close();
   }

   @Override
   public String toString() {
      return "T:n➔when(n=0,1,n=1,x,else,2*x*T(n-1)-T(n-2))";
   }

   @Override
   public String typeset() {
      return "1, x \text{otherwise} \\left(2 \\cdot x \\cdot \\T(\\left(n-1\\right))-\\T(\\left(n-2\\right))\\right)";
   }
}

(Symbolic, Compiled, Automatic) Differentiation and Integration

A Note Regarding Fonts

It is recommended to use the "Noto Sans Mono" or "PT Mono" TrueType fonts, since they are among the very few monospace fonts that correctly render combining diacritics on Linux, including dot above (◌̇) and diaeresis (◌̈), directly above Greek (and other) characters. Unlike most monospace fonts, Noto Sans Mono is built by Google with the specific goal of full Unicode coverage and accurate OpenType mark positioning, which makes it well suited for this purpose. With such a font, derivatives can be expressed as demonstrated by this snippet:

  public void testSecondDerivativeViaCombiningTwoDotsAboveCharacter()
  {
    Expression.saveClasses = true;
    final Context      context = new Context();
    final RealFunction θ       = RealFunction.express("θ:im(lnΓ(¼+ⅈ*t/2))-(log(π)/2)*t", context);
    final RealFunction Nθ̇     = RealFunction.express("Nθ̇:t➔t-θ̇(t)/θ̈(t)", context);
    var                y       = Nθ̇.eval(2.3);
    assertFalse(Double.isNaN(y));
  }

Forked modularized version of jlatexmath

See this for a version of jlatexmath without the unnamed module warnings

License

arb4j is made available under the terms of the Business Source License™ v1.1 which can be found in the root directory of this project in a file named License.pdf, License.txt, or License.tm which are the pdf, text, and TeXmacs formats of the same document respectively.


Build Notes

Prerequisites

sudo apt-get update
sudo apt-get install -y openjdk-26-jdk-headless maven clang swig libflint-dev libxdo-dev

Java 26 is required. Set:

export JAVA_HOME=/usr/lib/jvm/java-26-openjdk-amd64

FLINT 3.3+ is required. The SWIG interface files target the FLINT 3.3 API. FLINT 3.0–3.2 temporarily renamed flint_rand_struct → flint_rand_s, flint_rand_init → flint_randinit, etc., and removed the stride field from arb_mat_struct/acb_mat_struct. FLINT 3.3 reverted all of these back to the original names. If your distro ships FLINT 3.1 or 3.2 (e.g., Debian trixie ships 3.1.3), you need to either install FLINT 3.3+ from source or apply the workarounds in the FLINT 3.1/3.2 section below.

UTF-8 locale — source files use Unicode in filenames (σField.java, RiemannζFunction.java, RiemannξFunction.java). Without a UTF-8 locale, javac fails with Invalid filename: ??Field.java.

sudo apt-get install -y locales
sudo sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen
sudo locale-gen en_US.UTF-8
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8

Build

export JAVA_HOME=/usr/lib/jvm/java-26-openjdk-amd64 LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8
make clean && make
mvn test

libarblib.so is built into the project root. The Maven surefire plugin sets java.library.path to ${project.basedir} automatically.
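
When the library cannot be loaded, it can help to check where the JVM is actually looking. The following diagnostic is a sketch, not part of arb4j: NativeLibraryLocator is a hypothetical class that searches each directory on java.library.path for the platform-specific file name of the arblib library (libarblib.so on Linux).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical diagnostic helper; NOT part of arb4j.
final class NativeLibraryLocator {

    // Returns the first file on searchPath that matches the mapped library name.
    static Optional<Path> locate(String libraryName, String searchPath) {
        String fileName = System.mapLibraryName(libraryName);
        for (String directory : searchPath.split(File.pathSeparator)) {
            if (directory.isEmpty()) {
                continue; // skip empty entries such as a leading separator
            }
            Path candidate = Path.of(directory, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String searchPath = System.getProperty("java.library.path", "");
        System.out.println(locate("arblib", searchPath)
            .map(path -> "found " + path)
            .orElse("arblib not found on java.library.path: " + searchPath));
    }
}
```

Running it with the same -Djava.library.path value that the failing program uses shows whether the file produced by make is visible to the JVM.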

FLINT 3.1/3.2 Workarounds

If you are stuck on FLINT 3.1 or 3.2 and cannot upgrade to 3.3+, the following workarounds are needed after swig generates arb_wrap.c:

# Patch out stride field access (removed in 3.1, restored in 3.3)
sed -i 's/if (arg1) (arg1)->stride = arg2;/\/\/ stride removed in FLINT 3.1-3.2/' native/arb_wrap.c
sed -i 's/result = (long) ((arg1)->stride);/result = 0; \/\/ stride removed in FLINT 3.1-3.2/' native/arb_wrap.c

# Compile with name remapping defines
clang -g -O3 -fPIC -shared -Wno-int-conversion \
    -Dflint_rand_struct=flint_rand_s \
    -Dflint_rand_init=flint_randinit \
    -Dflint_rand_clear=flint_randclear \
    -Dflint_rand_set_seed=flint_randseed \
    native/arb_wrap.c native/complex.c native/ml.c \
    -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux -I/usr/include/flint \
    -olibarblib.so -lflint -lxdo

Troubleshooting

Symptom | Fix
mvn: command not found | sudo apt-get install -y maven
Invalid filename: ??Field.java | Set LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 and generate the locale
flint_rand_struct undeclared | You have FLINT 3.1/3.2: upgrade to 3.3+ or apply the workarounds above
no member named 'stride' | Same FLINT 3.1/3.2 issue
java.lang.UnsatisfiedLinkError: no arblib | libarblib.so is missing from the project root: run make
release version 26 not supported | export JAVA_HOME=/usr/lib/jvm/java-26-openjdk-amd64

About

arb4j is a high-performance Java API for arbitrary-precision real and complex ball arithmetic, based on the ARB and FLINT libraries. It uses SWIG-generated interfaces and features a sophisticated ObjectWeb ASM based compiler from UTF strings to JVM bytecode, automatic differentiation and integration capabilities, and a fluent API pattern.
