arb4j is a robust Java API designed to efficiently represent mathematical structures in their most general forms, prioritizing high performance. It seamlessly integrates with the arblib library via an interface generated by SWIG, enabling arbitrary precision real and complex ball arithmetic operations.
arb4j's Expression Compiler is an eloquent piece of engineering. Here's why:
Each Node subclass—DerivativeNode, AdditionNode, ExponentiationNode, etc.—knows how to:
- Parse itself from the input string
- Generate its own bytecode contribution
- Differentiate itself symbolically (product rule, chain rule, power rule—all dispatched polymorphically)
- Simplify itself (constant folding, identity elimination)
The AST isn't a passive data structure. It's a self-rewriting, self-compiling organism where a DerivativeNode rewrites its child subtree on the fly via recursive differentiate() calls before emitting code.
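The idea of nodes that differentiate and simplify themselves can be illustrated with a minimal, self-contained sketch. This is not arb4j's Node hierarchy: ExprNode, Constant, Variable, Sum and Product are hypothetical names, plain double stands in for ball arithmetic, and bytecode generation is left out entirely.

```java
// Hypothetical sketch: ExprNode, Constant, Variable, Sum and Product are
// illustrative names, not arb4j's Node classes.
import java.util.Map;

interface ExprNode {
    double eval(Map<String, Double> env);
    ExprNode differentiate(String variable);
    ExprNode simplify();
}

record Constant(double value) implements ExprNode {
    public double eval(Map<String, Double> env) { return value; }
    public ExprNode differentiate(String variable) { return new Constant(0); }
    public ExprNode simplify() { return this; }
}

record Variable(String name) implements ExprNode {
    public double eval(Map<String, Double> env) { return env.get(name); }
    public ExprNode differentiate(String variable) {
        return new Constant(name.equals(variable) ? 1 : 0);
    }
    public ExprNode simplify() { return this; }
}

record Sum(ExprNode left, ExprNode right) implements ExprNode {
    public double eval(Map<String, Double> env) { return left.eval(env) + right.eval(env); }
    // Sum rule: (f + g)' = f' + g'
    public ExprNode differentiate(String v) {
        return new Sum(left.differentiate(v), right.differentiate(v));
    }
    // Constant folding plus elimination of the additive identity 0
    public ExprNode simplify() {
        ExprNode l = left.simplify(), r = right.simplify();
        if (l instanceof Constant a && r instanceof Constant b) return new Constant(a.value() + b.value());
        if (l instanceof Constant a && a.value() == 0) return r;
        if (r instanceof Constant b && b.value() == 0) return l;
        return new Sum(l, r);
    }
}

record Product(ExprNode left, ExprNode right) implements ExprNode {
    public double eval(Map<String, Double> env) { return left.eval(env) * right.eval(env); }
    // Product rule: (f * g)' = f' * g + f * g'
    public ExprNode differentiate(String v) {
        return new Sum(new Product(left.differentiate(v), right),
                       new Product(left, right.differentiate(v)));
    }
    // Constant folding plus elimination of the factors 0 and 1
    public ExprNode simplify() {
        ExprNode l = left.simplify(), r = right.simplify();
        if (l instanceof Constant a && r instanceof Constant b) return new Constant(a.value() * b.value());
        if (l instanceof Constant a && a.value() == 0) return new Constant(0);
        if (r instanceof Constant b && b.value() == 0) return new Constant(0);
        if (l instanceof Constant a && a.value() == 1) return r;
        if (r instanceof Constant b && b.value() == 1) return l;
        return new Product(l, r);
    }
}
```

Differentiating x*x + 3 with respect to x and simplifying yields the tree Sum(x, x) in this sketch: the product rule produces 1*x + x*1, identity elimination strips the factors of 1, and the derivative of the constant 3 folds away.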
Expression<D, C, F> parameterizes domain, codomain, and function type, so Expression<Integer, RealPolynomial, RealPolynomialSequence> carries full compile-time proof that the generated class maps integers to real polynomials and implements the correct sequence interface. This isn't decorative—it drives the bytecode generator's field declarations, method signatures, and numerical operation dispatch.
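A rough sketch of how type parameters can carry domain and codomain information into code generation follows. It is only an illustration under stated assumptions: TypedExpression, IntSequence and methodDescriptor are hypothetical names, java.lang.Integer and String stand in for arb.Integer and RealPolynomial, and arb4j's actual Expression class is considerably richer.

```java
// Hypothetical sketch: TypedExpression and IntSequence are illustrative names,
// not arb4j classes. java.lang.Integer and String are stand-ins for
// arb.Integer and RealPolynomial.
import java.util.function.Function;

// A sequence is a function whose domain is fixed to Integer.
interface IntSequence<C> extends Function<Integer, C> {}

final class TypedExpression<D, C, F extends Function<D, C>> {
    private final Class<D> domainType;
    private final Class<C> coDomainType;
    private final F implementation;

    TypedExpression(Class<D> domainType, Class<C> coDomainType, F implementation) {
        this.domainType = domainType;
        this.coDomainType = coDomainType;
        this.implementation = implementation;
    }

    // The compiler rejects any call whose argument is not a D or whose
    // result is not used as a C.
    C evaluate(D input) {
        return implementation.apply(input);
    }

    // The same type tokens can drive generated signatures: this builds the
    // JVM method descriptor for a method taking a D and returning a C.
    String methodDescriptor() {
        return "(L" + domainType.getName().replace('.', '/') + ";)"
             + "L" + coDomainType.getName().replace('.', '/') + ";";
    }
}
```

With D = Integer and C = String, methodDescriptor() returns (Ljava/lang/Integer;)Ljava/lang/String; which is the form a bytecode generator would pass to a class writer when declaring the method.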
The Context class provides a shared namespace where functions reference other functions, and a topological sort guarantees correct initialization order—critical for recursive definitions like Chebyshev where T references itself. Variable propagation flows from parent expressions into nested functionally-generated sub-expressions, maintaining referential coherence across compilation boundaries.
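As an illustration of dependency-ordered initialization, here is a minimal depth-first topological sort. It is a sketch under assumptions, not the Context implementation: InitializationOrder and its dependency map are hypothetical, self-references are simply skipped, and a cycle between two different names is rejected with an exception.

```java
// Hypothetical sketch: InitializationOrder is an illustrative name and does
// not reproduce arb4j's Context.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class InitializationOrder {
    /**
     * @param dependencies function name -> names of the functions it references
     * @return names ordered so that every function appears after its dependencies
     */
    static List<String> sort(Map<String, Set<String>> dependencies) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        // TreeSet gives a deterministic visiting order
        for (String name : new TreeSet<>(dependencies.keySet())) {
            visit(name, dependencies, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String name, Map<String, Set<String>> dependencies,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(name)) return;
        if (!inProgress.add(name)) {
            throw new IllegalStateException("mutual recursion involving " + name);
        }
        for (String dep : new TreeSet<>(dependencies.getOrDefault(name, Set.of()))) {
            // A function referring to itself does not constrain the order
            if (!dep.equals(name)) visit(dep, dependencies, done, inProgress, order);
        }
        inProgress.remove(name);
        done.add(name);
        order.add(name);
    }
}
```

For f referencing g, g referencing itself, and h referencing f, the sketch returns the order g, f, h.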
A single self-recursive sequence (e.g. Chebyshev T(n) = 2x·T(n-1) - T(n-2)) needs no special handling: the body refers to its own name, the compiler’s self-reference guard short-circuits the recursive descent during code generation, and at run time the JVM resolves the class lazily on first evaluate call.
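To make the recurrence concrete, the following plain-Java sketch evaluates T(n) = 2x·T(n-1) - T(n-2) on coefficient arrays. It is not the code arb4j generates: ChebyshevSketch and coefficients are hypothetical names, long[] in ascending powers of x stands in for RealPolynomial, and the naive double recursion is kept only to mirror the definition.

```java
// Hypothetical sketch: ChebyshevSketch is an illustrative name; long[] holds
// polynomial coefficients in ascending powers of x.
final class ChebyshevSketch {
    /** Coefficients of T_n computed from T(n) = 2x*T(n-1) - T(n-2). */
    static long[] coefficients(int n) {
        if (n == 0) return new long[] {1};     // T_0 = 1
        if (n == 1) return new long[] {0, 1};  // T_1 = x
        long[] previous = coefficients(n - 1);
        long[] beforePrevious = coefficients(n - 2);
        long[] result = new long[n + 1];
        // 2x * T(n-1): shift every coefficient up one power and double it
        for (int i = 0; i < previous.length; i++) {
            result[i + 1] += 2 * previous[i];
        }
        // subtract T(n-2) term by term
        for (int i = 0; i < beforePrevious.length; i++) {
            result[i] -= beforePrevious[i];
        }
        return result;
    }
}
```

coefficients(3) returns {0, -3, 0, 4}, that is T_3 = 4x^3 - 3x, and coefficients(4) returns {1, 0, -8, 0, 8}, that is T_4 = 8x^4 - 8x^2 + 1.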
A cluster of two or more sequences that reference each other (mutual recursion) needs forward declaration. The pattern below compiles S and a where S(k) = Σ a(j)·a(k-1-j) and a(k) calls back into S(k):
// 1. Forward-declare `a` as a typed prototype. No expression body, no class
// bytecode — just a FunctionMapping registered in the Context so that the
// parser can resolve symbol `a` to a typed function reference.
context.registerFunctionMapping("a",
arb.Integer.class,
Complex.class,
ComplexSequence.class);
// 2. parseCompileAndRegister(S): parse S, generate class bytecode for S,
// register the FunctionMapping. Crucially, do NOT instantiate S yet.
// S’s bytecode references `a` by class-name STRING in the `new`/`<init>`
// opcodes — the JVM does not eagerly resolve that string until the opcode
// actually executes.
Sequence.parseCompileAndRegister("S", Complex.class,
"S:k➔sum(j➔a(j)*a(k-1-j){j=1..k-2})",
ComplexSequence.class, context);
// 3. express(a): parse `a`, generate bytecode for `a`, AND instantiate `a`.
// Instantiation walks `a`’s public fields via reflection — which forces
// the JVM to resolve every field type. By this point both class `S` and
// class `a` are defined in the ExpressionClassLoader, so resolution
// succeeds.
ComplexSequence a = ComplexSequence.express(
"a:k➔when(k=1, p(v)/Γ(μ+1), else, γ_k*(q(v)*a(k-1) + r(v)*S(k)))",
context);
// 4. Optional: instantiate S now that both classes exist. Calling express(S)
// a second time, or evaluating any expression that references S, will
// trigger this transparently.
The critical invariant is: all classes in a recursive cluster must be defined in the ClassLoader before any of them is instantiated. Class definition (bytecode generation + ClassLoader.defineClass) does not force resolution of the classes named by new/<init> opcodes inside the body; those names are linked lazily on first execution. Instantiation, however, calls Context.injectVariableReferences and Context.injectFunctionReferences, both of which use Class.getFields() to enumerate the new instance’s public fields. The JVM reflection machinery resolves every field’s declared type at that point, and any reference to an undefined class throws NoClassDefFoundError.
The two roles of the API are:
- parseCompileAndRegister(name, …, expression, …, context): parse, emit bytecode, register the FunctionMapping. Does not instantiate. Use this to define every member of a recursive cluster except the last.
- express(name, …, expression, …, context): calls parseCompileAndRegister, then instantiates and injects references. Use this for the last (or only) member, after all peers are defined.
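The define-before-instantiate invariant can be modelled with a toy registry. This is only a simulation under assumptions: ClusterRegistry, define and instantiate are hypothetical names that mimic the two phases, no bytecode or class loading is involved, and an IllegalStateException stands in for the NoClassDefFoundError the JVM would raise.

```java
// Hypothetical toy model: ClusterRegistry only imitates the two phases
// (define, then instantiate); it does not load or generate any classes.
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ClusterRegistry {
    private final Map<String, Set<String>> defined = new HashMap<>();
    private final Set<String> instantiated = new HashSet<>();

    /** Phase 1: record a definition. Referenced names are not checked here. */
    void define(String name, String... references) {
        defined.put(name, new HashSet<>(Arrays.asList(references)));
    }

    /** Phase 2: succeeds only if every referenced name has been defined. */
    void instantiate(String name) {
        Set<String> references = defined.get(name);
        if (references == null) {
            throw new IllegalStateException(name + " is not defined");
        }
        for (String reference : references) {
            if (!defined.containsKey(reference)) {
                throw new IllegalStateException(name + " references undefined " + reference);
            }
        }
        instantiated.add(name);
    }

    boolean isInstantiated(String name) {
        return instantiated.contains(name);
    }
}
```

In this model, define("S", "a") followed immediately by instantiate("S") fails because a has not been defined yet, whereas defining both S and a first lets either one be instantiated.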
During code generation for a sequence whose body references another (yet-undefined) function, Expression.constructReferencedFunctionInstanceIfItIsNull emits the new/<init> opcodes using mapping.functionName as the class-name string and mapping.functionFieldDescriptor() (which falls back to L<functionName>; for un-instantiated mappings) as the field descriptor. No Class<?> lookup, no reflective instantiate() call — nothing in the compile path attempts to load the referenced class. The self-reference guard at the top of that method handles the same-name case (e.g. a referring to a inside when/else). Cross-references (a → S, S → a) traverse ordinary FunctionMapping lookup which only reads the registered name and types.
The following cases need no forward declaration:
- Single self-recursive sequences: Sequence.express("T:n➔when(n=0,1, n=1,x, else, 2*x*T(n-1) - T(n-2))", context) works as-is.
- Topologically ordered references: if g calls f but f does not call g, just compile f first via plain express, then g.
- Self-references inside sum/product over a recursion-free index, where the body of the index function is independent of the outer recursive symbol.
The recursive descent parser natively consumes ×, ÷, ⁄, superscript exponents (x² → x^2), combining diacritical marks (θ̇, θ̈ for first/second derivatives), ∑, ∏, ∂, and ∫. Mathematical notation goes in; bytecode comes out. No intermediate representation language, no DSL friction.
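The correspondence between Unicode notation and its ASCII meaning can be sketched as a small translation step. This is an illustration only: arb4j's parser consumes these characters directly rather than preprocessing them, and NotationSketch, toAscii and derivativeOrder are hypothetical names. Unicode escapes are used so the source compiles regardless of file encoding.

```java
// Hypothetical sketch: NotationSketch is an illustrative name; arb4j parses
// these characters natively instead of rewriting the input string.
final class NotationSketch {
    // Superscript digits 0..9, so that the index in this string is the digit value
    private static final String SUPERSCRIPTS =
        "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079";

    /** Rewrites U+00D7, U+00F7, U+2044 and superscript digits into ASCII. */
    static String toAscii(String expression) {
        StringBuilder out = new StringBuilder();
        boolean inExponent = false;
        for (int i = 0; i < expression.length(); i++) {
            char c = expression.charAt(i);
            int digit = SUPERSCRIPTS.indexOf(c);
            if (digit >= 0) {
                // A run of superscript digits becomes a single ^ exponent
                if (!inExponent) out.append('^');
                out.append(digit);
                inExponent = true;
                continue;
            }
            inExponent = false;
            switch (c) {
                case '\u00D7' -> out.append('*');           // multiplication sign
                case '\u00F7', '\u2044' -> out.append('/'); // division sign, fraction slash
                default -> out.append(c);
            }
        }
        return out.toString();
    }

    /** Combining dot above (U+0307) counts 1, combining diaeresis (U+0308) counts 2. */
    static int derivativeOrder(String symbol) {
        int order = 0;
        for (char c : symbol.toCharArray()) {
            if (c == '\u0307') order += 1;
            else if (c == '\u0308') order += 2;
        }
        return order;
    }
}
```

Under these assumptions, toAscii maps x²×y÷3 to x^2*y/3, and derivativeOrder returns 1 for θ̇ and 2 for θ̈.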
Generated classes implement AutoCloseable, with a compiler-emitted close() method that disposes all intermediate variables and constants—no memory leaks from arbitrary-precision temporaries. The ASM library handles stack map frame computation, ensuring JVM-verifiable bytecode without manual frame calculation.
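The ownership pattern behind a generated close() can be shown with a small stand-in. The sketch assumes nothing about arb4j internals: Temporary and GeneratedFunctionSketch are hypothetical classes, and a static counter replaces the native memory that real arbitrary-precision values would hold.

```java
// Hypothetical sketch: Temporary stands in for a native-backed value and
// GeneratedFunctionSketch for a compiler-generated class that owns several.
final class Temporary implements AutoCloseable {
    static int live = 0;          // number of Temporary objects not yet closed
    private boolean closed = false;

    Temporary() { live++; }

    @Override
    public void close() {
        if (!closed) {            // closing twice is harmless
            closed = true;
            live--;
        }
    }
}

final class GeneratedFunctionSketch implements AutoCloseable {
    // Intermediate variables and constants are fields, so they can be reused
    // across evaluations and released together.
    private final Temporary intermediate1 = new Temporary();
    private final Temporary intermediate2 = new Temporary();
    private final Temporary constant = new Temporary();

    @Override
    public void close() {
        intermediate1.close();
        intermediate2.close();
        constant.close();
    }
}
```

Used in a try-with-resources block, a GeneratedFunctionSketch raises Temporary.live by three on construction and returns it to its previous value when the block exits.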
What makes this thing eloquent is the unification: a single string of mathematical notation traverses lexical analysis → recursive descent parsing → polymorphic AST construction → symbolic simplification/differentiation → direct JVM bytecode emission → custom classloading → instantiation as a type-safe, closeable, LaTeX-renderable, self-documenting mathematical function. Every layer is load-bearing. No ceremony, no boilerplate, no interpretation overhead. The abstraction gap between the mathematician's notation and the machine's execution is collapsed to zero.
- arb4j employs a fluent pattern wherever possible, enhancing the way functions receive and return values.
- The last argument in a function call becomes the return value, defaulting to 'this' if not specified.
Example:
Real x = new Real("25", 128); // 128 bits of precision
// Either of these lines achieves the same result (use one, not both):
Real five = x.sqrt(128);
Real five = x.sqrt(128, x); // Using 'this' as the result variable explicitly
- To prevent overwriting the input variable:
Real five = x.sqrt(128, new Real());
- Chain function calls in an object-oriented way:
Real y = new Real("25", 128)
.add(RealConstants.one, 128)
.log(128)
.tanh(128);
- The AutoCloseable interface is used for memory management.
- This implementation ensures that objects can and should be used within try-with-resources blocks for optimal resource handling, especially important for managing native resources.
Example:
try (Real x = new Real("25", 128)) {
doSomething(x);
} // x is automatically closed, ensuring proper resource management
- The arb.expressions package in arb4j includes tools for compiling mathematical expressions directly into Java bytecode. This saves substantial development time by removing the need to laboriously write new code for each formula to be evaluated, while also ensuring efficiency and correctness; it would be challenging to hand-write code that significantly outperforms the generated code.
The Expressor program provides a tree-list view that shows the abstract syntax tree that constitutes a given expression and the intermediate values that combine to produce a given result.
arb.exceptions.CompilerException: unexpected ')'(0x29) character at position=11 in expression '(1/2)-(z/2))^n' of length 14, remaining=)^n
at arb4j/arb.expressions.Expression.throwNewUnexpectedCharacterException(Expression.java:1933)
at arb4j/arb.expressions.Expression.parseRoot(Expression.java:1586)
at arb4j/arb.functions.Function.parse(Function.java:381)
at arb4j/arb.expressions.Compiler.compile(Compiler.java:161)
at arb4j/arb.expressions.Compiler.express(Compiler.java:246)
at arb4j/arb.expressions.Compiler.express(Compiler.java:222)
at arb4j/arb.expressions.Compiler.compile(Compiler.java:127)
at arb4j/arb.functions.Function.instantiate(Function.java:413)
at arb4j/arb.functions.Function.express(Function.java:159)
at arb4j/arb.functions.sequences.RationalFunctionSequence.express(RationalFunctionSequence.java:35)
at arb4j/arb.functions.sequences.RationalFunctionSequence.express(RationalFunctionSequence.java:25)
at arb4j/arb.RationalFunctionTest.testPowers(RationalFunctionTest.java:49)
This exception was generated by the following buggy test, whose expression contains an unbalanced closing parenthesis:
public void testPowers()
{
try ( Integer n = Integer.named("n").set(0))
{
Context context = new Context(n);
var rationalFunctional = RationalFunctionSequence.express("(1/2)-(z/2))^n", context);
RationalFunction expressed = rationalFunctional.evaluate(n, 128, new RationalFunction());
assertEquals("x", expressed.toString());
}
}
The unmodified decompiled code generated by the ChebyshevPolynomialsOfTheFirstKind class:
import arb.Initializable;
import arb.Integer;
import arb.RealPolynomial;
import arb.Typesettable;
import arb.functions.integer.RealPolynomialSequence;
public class T implements RealPolynomialSequence, Typesettable, AutoCloseable, Initializable {
public boolean isInitialized;
public final Integer cℤ2 = new Integer("1");
public final Integer cℤ1 = new Integer("0");
public final Integer cℤ3 = new Integer("2");
public T T;
public RealPolynomial Xℝ6 = new RealPolynomial();
public RealPolynomial Xℝ5 = new RealPolynomial();
public RealPolynomial Xℝ2 = new RealPolynomial();
public RealPolynomial Xℝ1 = new RealPolynomial();
public Integer ℤ1 = new Integer();
public RealPolynomial Xℝ4 = new RealPolynomial();
public Integer ℤ2 = new Integer();
public RealPolynomial Xℝ3 = new RealPolynomial();
@Override
public Class<RealPolynomial> coDomainType() {
return RealPolynomial.class;
}
@Override
public RealPolynomial evaluate(Integer n, int order, int bits, RealPolynomial result) {
if (!this.isInitialized) {
this.initialize();
}
return switch(n.getSignedValue()) {
case 0 -> result.set(this.Xℝ1.set(this.cℤ2));
case 1 -> result.set(result.identity());
default -> this.cℤ3
.mul(this.Xℝ2.identity(), bits, this.Xℝ3)
.mul((RealPolynomial)this.T.evaluate(n.sub(this.cℤ2, bits, this.ℤ1), order, bits, this.Xℝ4), bits, this.Xℝ5)
.sub((RealPolynomial)this.T.evaluate(n.sub(this.cℤ3, bits, this.ℤ2), order, bits, this.Xℝ6), bits, result);
};
}
@Override
public void initialize() {
if (this.isInitialized) {
throw new AssertionError("Already initialized");
} else {
this.T = new T();
this.isInitialized = true;
}
}
@Override
public void close() {
this.cℤ2.close();
this.cℤ1.close();
this.cℤ3.close();
this.Xℝ6.close();
this.Xℝ5.close();
this.Xℝ2.close();
this.Xℝ1.close();
this.ℤ1.close();
this.Xℝ4.close();
this.ℤ2.close();
this.Xℝ3.close();
this.T.close();
}
@Override
public String toString() {
return "T:n➔when(n=0,1,n=1,x,else,2*x*T(n-1)-T(n-2))";
}
@Override
public String typeset() {
return "1, x \text{otherwise} \\left(2 \\cdot x \\cdot \\T(\\left(n-1\\right))-\\T(\\left(n-2\\right))\\right)";
}
}
- Differentiation and integration progress can be tracked at: GitHub Issue #253.
It is recommended to use the "Noto Sans Mono" or "PT Mono" TrueType fonts, since they are among the very few monospace fonts that correctly render combining diacritics on Linux, including dot above (◌̇) and diaeresis (◌̈), directly above Greek (and other) characters. Unlike most monospace fonts, Noto Sans Mono is built by Google with the specific goal of full Unicode coverage and accurate OpenType mark positioning, which makes it well suited for this purpose. This matters because derivatives can be expressed with combining marks, as demonstrated by this snippet:
public void testSecondDerivativeViaCombiningTwoDotsAboveCharacter()
{
Expression.saveClasses = true;
final Context context = new Context();
final RealFunction θ = RealFunction.express("θ:im(lnΓ(¼+ⅈ*t/2))-(log(π)/2)*t", context);
final RealFunction Nθ̇ = RealFunction.express("Nθ̇:t➔t-θ̇(t)/θ̈(t)", context);
var y = Nθ̇.eval(2.3);
assertFalse(Double.isNaN(y));
}
See this for a version of jlatexmath without the unnamed module warnings.
arb4j is made available under the terms of the Business Source License™ v1.1 which can be found in the root directory of this project in a file named License.pdf, License.txt, or License.tm which are the pdf, text, and TeXmacs formats of the same document respectively.
sudo apt-get update
sudo apt-get install -y openjdk-26-jdk-headless maven clang swig libflint-dev libxdo-dev
Java 26 is required. Set:
export JAVA_HOME=/usr/lib/jvm/java-26-openjdk-amd64
FLINT 3.3+ is required. The SWIG interface files target the FLINT 3.3 API. FLINT 3.0–3.2 temporarily renamed flint_rand_struct → flint_rand_s, flint_rand_init → flint_randinit, etc., and removed the stride field from arb_mat_struct/acb_mat_struct. FLINT 3.3 reverted all of these back to the original names. If your distro ships FLINT 3.1 or 3.2 (e.g., Debian trixie ships 3.1.3), you need to either install FLINT 3.3+ from source or apply the workarounds in the FLINT 3.1/3.2 section below.
UTF-8 locale — source files use Unicode in filenames (σField.java, RiemannζFunction.java, RiemannξFunction.java). Without a UTF-8 locale, javac fails with Invalid filename: ??Field.java.
sudo apt-get install -y locales
sudo sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen
sudo locale-gen en_US.UTF-8
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
export JAVA_HOME=/usr/lib/jvm/java-26-openjdk-amd64 LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8
make clean && make
mvn test
libarblib.so is built into the project root. The Maven surefire plugin sets java.library.path to ${project.basedir} automatically.
If you are stuck on FLINT 3.1 or 3.2 and cannot upgrade to 3.3+, the following workarounds are needed after swig generates arb_wrap.c:
# Patch out stride field access (removed in 3.1, restored in 3.3)
sed -i 's/if (arg1) (arg1)->stride = arg2;/\/\/ stride removed in FLINT 3.1-3.2/' native/arb_wrap.c
sed -i 's/result = (long) ((arg1)->stride);/result = 0; \/\/ stride removed in FLINT 3.1-3.2/' native/arb_wrap.c
# Compile with name remapping defines
clang -g -O3 -fPIC -shared -Wno-int-conversion \
-Dflint_rand_struct=flint_rand_s \
-Dflint_rand_init=flint_randinit \
-Dflint_rand_clear=flint_randclear \
-Dflint_rand_set_seed=flint_randseed \
native/arb_wrap.c native/complex.c native/ml.c \
-I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux -I/usr/include/flint \
-olibarblib.so -lflint -lxdo

| Symptom | Fix |
|---|---|
| mvn: command not found | sudo apt-get install -y maven |
| Invalid filename: ??Field.java | Set LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 and generate the locale |
| flint_rand_struct undeclared | You have FLINT 3.1/3.2: upgrade to 3.3+ or apply the workarounds above |
| no member named 'stride' | Same as above (FLINT 3.1/3.2 issue) |
| java.lang.UnsatisfiedLinkError: no arblib | libarblib.so is missing from project root; run make |
| release version 26 not supported | export JAVA_HOME=/usr/lib/jvm/java-26-openjdk-amd64 |
