Releases: ccam80/cubie

v0.0.7

20 Jan 19:55
7c8b428

0.0.7 (2026-01-20)

Features

  • add runtime logging infrastructure for GPU kernels and memory transfers (#289) (431425d)
  • add unified save_variables and summarise_variables parameters to solver interface (#342) (c7d7531)
  • cellml-generated systems now cached (#510) (92b21e0)
  • enable driver interpolator profiling in all_in_one.py (#419) (1574ec9)
  • File-based caching implemented (#491) (1bfe68b)
  • MultipleInstanceCUDAFactory subclass (and matching config) now handle cases like newton_atol and krylov_atol when instantiating multiple of the same base class (d674bcd)
  • scaled norm function now available as a CUDAFactory for repeated use (d674bcd)
  • Scaled tolerance in Newton-Krylov solver (#473) (d674bcd)
  • Solve functions now save the final value on loop exit when no timing parameters are given. (d9fb64b)
  • Time-domain save settings now decoupled from summary metric settings (d9fb64b)

Bug Fixes

  • build_grid now takes None parameters/initial conditions in solver.py (be4bab9)
  • chunking fails when VRAM is limited due to stride incompatibility (#487) (b8a486e), closes #438
  • codegen hashing now session-independent (and so working... this time) (59bb488)
  • correct false circular dependency error in topological_sort (#422) (97c13be)
  • default neumann preconditioner order set to 2 (2434358)
  • DIRK codegen pipeline now decoupled from rosenbrock cache planning. (80171b5)
  • dummy-kernel based compile time logging removed (it doubled compile time) (431425d)
  • Internal code generation variables prefixed to avoid name clashes (#466) (90e8ca3), closes #373
  • load_cellml_model surfaced to toplevel import (54b05e0)
  • loop now exits on irrecoverable-error status codes (ba800a0)
  • map CellML time variable to standard 't' symbol (#425) (261c109)
  • Newton-Krylov solver no longer propagates Krylov non-convergence or max_backtrack errors if it recovers (1411135)
  • Parsed system definition now hashed properly so generated code is properly cached (655e54a)
  • patch event.query() bug in numba by swapping handle (d43ae8f)
  • Refactor BatchGridBuilder -> BatchInputHandler (#437) (ffb8478)
  • repeated CSE calls no longer raise warning about an already-used CSE symbol. (16785f3)
  • Rosenbrock step config's hash now deterministic for caching (105befb)
  • Runs with impossible time settings now raise sensible errors (#465) (d46a7b2), closes #440
  • skip codegen on cache hit in get_solver_helper (#512) (cfc6b0b)
  • state-aware derivative detection to avoid misinterpreting d-prefixed auxiliaries (#468) (fee6d70)
  • Stride incompatibility fixed when array is sliced to fit in VRAM (b8a486e)
  • timelogger printouts now include cache hit/miss messaging (#497) (6323746)
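
Several of the caching fixes above (59bb488, 655e54a, 105befb) depend on cache keys that stay the same from one Python session to the next. The sketch below illustrates the general idea only and is not cubie's implementation: Python's built-in hash() of a str is salted per process by default, whereas a hashlib digest of a canonical serialisation is stable. The name stable_key and the example settings are hypothetical.

```python
import hashlib
import json


def stable_key(settings: dict) -> str:
    """Return a digest that depends only on the content of `settings`.

    Hypothetical helper for illustration; not part of cubie.
    """
    # sort_keys makes the serialisation independent of insertion order
    payload = json.dumps(settings, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The same settings in a different order give the same key
key_a = stable_key({"order": 2, "precision": "float32"})
key_b = stable_key({"precision": "float32", "order": 2})

# A changed setting gives a different key, so stale cache entries are not reused
key_c = stable_key({"order": 3, "precision": "float32"})
```

Because the digest is computed from bytes rather than from object identity or a salted hash, it can safely be used as a file name for an on-disk cache.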

Performance Improvements

  • Convert whole-module imports to explicit imports in CUDAFactory files (#443) (9cebe45)

Miscellaneous Chores

v0.0.6

27 Dec 01:37
e67da0c

0.0.6 (2025-12-27)

Features

  • raw output type added to output device array copies with no processing (22cef9f)
  • add time logging to cellml import (#257) (6a220f8)
  • Additional summary output metrics added (#212) (daccbae)
  • Buffer indexing, sizing, and locating now consolidated into a BufferSettings object (480df1a)
  • Buffer memory locations on GPU now user-selectable (between local and shared) (480df1a)
  • build_grid() surfaced to user API (#338) (45cfa90)
  • CellML to Cubie adapter layer added (#221) (b8f448e)
  • CUDAFactory.update() now updates nested dicts and attrs classes (480df1a)
  • device buffers are now togglable (local, shared, local persistent) and managed centrally by buffer_registry (39107a9)
  • Global stores now have cache write-through hints, closes #291 (480df1a)
  • load_from_cellml updated to parse complicated models (#238) (50341f6)
  • newton and linear solvers now fully-functional CUDAFactory subclasses (39107a9)
  • optional configuration parameters are no longer explicit in inits, they are filtered and collected in kwargs (39107a9)
  • py39 compatibility removed due to a Numba update. (480df1a)
  • Solver API now skips extra memory and grid-building work when possible (#324) (5f225a0)
  • summary metrics combined (eg. extrema, [mean, std, rms]) to reduce buffer space (daccbae)
  • summary metrics now respect numerical precision (daccbae)
  • Sympy inputs from CellML now go Sympy->Sympy instead of through strings (#259) (23e201d)
  • time logging added to parsing, codegen, and CUDA compilation (#256) (5d8f75b)
  • Trajectories with errors (nonzero status codes) now return NaNs in solveresult (#333) (6068ebc)
  • update() methods now unpack settings dicts provided to them (#332) (f016e4d)
  • Warp-friendly FSAL caching implemented, redundant accumulation removed (#211) (96a9dd0)

Bug Fixes

  • _ensure_context() method added to avoid segfaults in CI (#265) (60acecc)
  • shift value in standard deviation calcs now updates after each save. (073d406)
  • Adaptive step controllers now sum errors correctly, closes #302 (480df1a)
  • add set_stride_order method for access from solver (e4b80d3)
  • All algorithms now exit when the next save time is > end time (480df1a)
  • buffer aliasing logic now child-location-agnostic (377ad10)
  • Compile-timing kernels now use device arrays (16bf1d6)
  • consolidate timing parameter sets for fewer fixture builds (#350) (4698e9e)
  • contiguous arrays now marked as such for the compiler to do its grim work (#276) (fe97291)
  • Controller-algorithm compatibility enforced (4ab0230)
  • correct FIRK algorithm implementation (#408) (39107a9)
  • correct fsal warp-vote implementation (60448f1)
  • Counters array added to ERK signature (99db833)
  • CUDA-breaking simsafe 'local' module adapter removed (d205979)
  • CUDAFactories now only precompile if timelogging is on (836da2a)
  • dead code and duplications pruned from codegenned device functions (#266) (de0dda8)
  • default precision types have been removed from all but entry points (39107a9)
  • fast path through batchgridbuilder for arrays provided verbatim added (080ab66)
  • FIRK no longer does an extra f(x) calculation after performing its nonlinear solve (39107a9)
  • Fixed steppers now use a slower but more sensible save time incrementing technique (480df1a)
  • FSAL warp test now doesn't break everything (2e68692)
  • gustafsson controller prev error ratio flipped, various test edits for GPU runs (3e04438)
  • host arrays now pinned to facilitate asynchronous transfers (480df1a)
  • infinite loops in adaptive steppers under dummy compile now finite (b0547fd)
  • lineinfo toggle for CUDA compilation now conditional on CUDASIM status (#280) (6d5bf57)
  • Loop iterators now 32-bit (480df1a)
  • loops now accumulate time in f64, no longer get stuck when dt < 1e-7 * time (#281) (480df1a), closes #272
  • make shape validator np.int compatible in BaseArrayManager.py (e77d800)
  • make stage_increment buffer persistent in rosenbrock (#411) (4b0b6bc)
  • Missing iteration_counters type added to device signature in BatchSolverKernel (6eefd67)
  • move stream sync function to after chunked queue (e4b80d3)
  • n_saves calculation now includes start time, closes #282 (480df1a)
  • numeric literals now wrapped with precision() or int32 in CUDA code generation (#258) ([21850f2](21850f293b6743e2fe677b4fc...
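
The f64 time-accumulation fix above (#281) addresses a general floating-point effect, which can be sketched without a GPU. In single precision, adding a step that is much smaller than the spacing between representable values near the current time leaves the time unchanged, so the loop never advances. The helpers to_f32, advance_f32 and advance_f64 are hypothetical names for this illustration and use only the standard library.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def advance_f32(t: float, dt: float, n_steps: int) -> float:
    """Accumulate time with the running total rounded to float32 each step."""
    for _ in range(n_steps):
        t = to_f32(t + dt)
    return t


def advance_f64(t: float, dt: float, n_steps: int) -> float:
    """Accumulate time in double precision."""
    for _ in range(n_steps):
        t = t + dt
    return t


t0 = 1000.0
dt = to_f32(1e-5)  # dt / t0 = 1e-8, below float32 resolution at t0

stuck = advance_f32(t0, dt, 1000)  # every addition rounds back to t0
moved = advance_f64(t0, dt, 1000)  # advances by roughly 0.01
```

Here the float32 accumulator stays at exactly 1000.0 after 1000 steps, while the float64 accumulator moves forward as expected.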

v0.0.5

04 Nov 07:36
fc42393

0.0.5 (2025-11-04)

Features

  • "Instrumented" device steps added for diagnostics (472f960)
  • additional ERK, DIRK, Rosenbrock tableaus added (ed866d7)
  • Algorithms now set their own step control defaults (#139) (2a0efed)
  • Array managers now support heterogeneous arrays within the same container (0905422)
  • Compile-settings updates now won't force a rebuild if the value hasn't changed. (5a9e281)
  • DIRK and DIRKTableaus added (472f960)
  • Explicit and Diagonally-Implicit Runge Kutta algorithms added (#151) (472f960)
  • Explicit RK and Tableau added, closes #83 (472f960)
  • Fully-Implicit Runge-Kutta (FIRK) Methods implemented (#162) (ed866d7)
  • Generic Butcher Tableau implemented (472f960)
  • Generic Rosenbrock-W methods added (#148) (6abfed2)
  • minimal FSAL caching added to DIRK, ERK, Rosenbrock (472f960)
  • N-stage flattened linear operators, preconditioners, nonlinear residual codegen added (ed866d7)
  • Parser now processes indexed arrays as variables (#152) (7fe800b)
  • rodasnp methods, dop853, tsit5, vern7 (87b53bc)
  • Rosenbrock methods now 100% more rosenbrock (#157) (35641e6)
  • status codes now aggregated by batchSolverKernel (0905422)
  • Steps and step controllers now have a unified argument-filtering factory (2a0efed)
  • Tableau libraries and tableau resolvers/getters added (472f960)
  • There are now auxiliary-cached jacobian functions for reusing some computational work. (6abfed2)
  • Third and fourth order SDIRK tableaus added (91ee1e9)
  • Time-derivative helpers added for symbolic functions and interpolated arrays (35641e6)
  • Very rough caching of jvp nodes implemented for rosenbrock solvers. (7fe800b)
  • working arrays and quantities in algorithms and solvers now draw from a memory "pool", allowing easier reuse (b0f5b6f)
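
As a rough illustration of what a generic Butcher tableau buys (472f960), the sketch below runs one explicit Runge-Kutta step driven entirely by the tableau coefficients, so switching method means switching data rather than code. It is a plain-Python, scalar-only sketch under stated assumptions; erk_step and its signature are hypothetical and do not reflect cubie's device functions.

```python
import math


def erk_step(f, t, y, h, A, b, c):
    """One explicit Runge-Kutta step for a scalar ODE y' = f(t, y).

    A, b, c are the Butcher tableau; A must be strictly lower triangular.
    """
    k = []
    for i in range(len(b)):
        # Each stage uses only the derivatives of earlier stages
        y_stage = y + h * sum(A[i][j] * k[j] for j in range(i))
        k.append(f(t + c[i] * h, y_stage))
    # Combine stage derivatives with the weights b
    return y + h * sum(b_i * k_i for b_i, k_i in zip(b, k))


# Classical fourth-order Runge-Kutta tableau
A = [
    [0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
b = [1 / 6, 1 / 3, 1 / 3, 1 / 6]
c = [0.0, 0.5, 0.5, 1.0]

# One step of y' = -y from y(0) = 1, compared with the exact solution
y1 = erk_step(lambda t, y: -y, 0.0, 1.0, 0.1, A, b, c)
err = abs(y1 - math.exp(-0.1))
```

Passing a different (A, b, c) triple, for example Heun's method, reuses the same stepping function unchanged.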

Bug Fixes

  • add error to sdirk 4, correct controllers for loop tests (47a6cb1)
  • (non-CI) testing for DIRK loops added. (7fe800b)
  • batchgridbuilder class now has static wrappers for batchgridbuilder helper functions. (74c08fa)
  • BatchsolverKernel's compile settings now feature the compile-critical settings they always should have had. (f383157)
  • Buffer footprints reduced by aliasing vectors with disjoint lifetimes (472f960)
  • correct off-by-one error and datatype discrepancy in cpu driver evaluator (472f960)
  • DIRK now accumulates rhs's and scales after stages, reducing round-off (7fe800b)
  • faulty implementation of ode23s removed, dop853 tableau amended (d5a784c)
  • matplotlib now spelled correctly in pyproject.toml (3773ff8)
  • Meaty loop tests confined to test_ode_loop.py (7fe5677)
  • numerous numerical errors amended after instrumenting steps (472f960)
  • precision type hints now correct; no more pesky yellow lines. (2a0efed)
  • Rosenbrock buffer footprint reduced by 2n (ed866d7)
  • settings passing from solver now less cramped and hopefully more robust (2a0efed)
  • Step controllers no longer mutate error vector (7fe800b)

Documentation

  • Add autodocs subpages for manual project structure docs (c2c0d7c)
  • add copilot instructions (#161) (2bfe6f2)
  • added "buffer map" comments to generic algorithms to demystify aliasing (472f960)
  • autodocs param lists now format one line per param (cb3d74b)
  • Batchsolving module source files now better documented in numpydocs format (97a0ab8)
  • de-computer some language in api reference, rejig indexes (b558994)
  • increase index depth, force one-param-per-line printing (b822f4c)
  • more docs organisation (5631989)
  • more docs organisation (cd98043)
  • refactor api structure; remove autosummary, implement manual docs (881ce4a)
  • Refs in "getting started" are now actually refs not highlighted garbage. (6abfed2)
  • top-level batchsolving docs added (6cfddfa)

Miscellaneous Chores

v0.0.4

05 Oct 21:24
1b917d4

0.0.4 (2025-10-05)

Features

  • Adaptive step-size controllers added: i (traditional), pi, pid, gustafsson acceleration (1d903f2)
  • Adaptive time-step controllers now have a programmable dead-band. (b0fafd9)
  • AGENTS.md extended and partially updated to summarise entire project for ai agents (1d903f2)
  • arbitrary drivers can now be looped or clamped to zero (smoothly). (b0fafd9)
  • Backwards Euler implicit fixed-step method added (with and without predictor-corrector mechanism), closes #114. (1d903f2)
  • Codegen for residual functions, jvps, and various solver helper functions created (1d903f2)
  • Crank-Nicolson trapezoidal adaptive-step algorithm implemented. (1d903f2)
  • cuda simulation patches consolidated for cuda-free environment tests. (1d903f2)
  • Forcing (driver) terms now adaptive-step friendly (#132) (b0fafd9)
  • matrix free solvers added (1dffd94)
  • Nonlinear Newton-Krylov iterative solver with preconditioning added for implicit methods, closes #101, #102, #111 (1d903f2)
  • plotting added to driver interpolator - keep an eye on what the machine is doing. (b0fafd9)
  • shared memory padding now closer to optimal (nothing to be done about 64-bit values), closes #86. (1d903f2)
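
For readers unfamiliar with the controllers listed above, the following is a minimal sketch of a textbook integral ("i") step-size controller with a dead-band, written in plain Python. It is not cubie's implementation: the function name i_controller, its parameters, the default gains, and the convention that a scaled error norm of at most 1 means "accept" are all assumptions made for illustration.

```python
def i_controller(dt, err_norm, order, safety=0.9,
                 min_gain=0.2, max_gain=5.0, deadband=(1.0, 1.0)):
    """Propose the next step size from a scaled error norm.

    Returns (new_dt, accepted). A step is accepted when err_norm <= 1.
    """
    err_norm = max(err_norm, 1e-16)  # guard against division by zero
    # Standard integral controller: gain shrinks as the error grows
    gain = safety * err_norm ** (-1.0 / (order + 1))
    # Clamp the gain so the step size cannot change too abruptly
    gain = min(max_gain, max(min_gain, gain))
    # Dead-band: leave dt alone when the proposed change is small
    low, high = deadband
    if low <= gain <= high:
        gain = 1.0
    return dt * gain, err_norm <= 1.0


# Error exactly at tolerance: step accepted, dt shrinks by the safety factor
dt_a, ok_a = i_controller(0.1, 1.0, order=4)

# Error 32x too large with a 4th-order method: step rejected, dt roughly halves
dt_b, ok_b = i_controller(0.1, 32.0, order=4)

# With a dead-band of (0.8, 1.2) the small change is suppressed
dt_c, ok_c = i_controller(0.1, 1.0, order=4, deadband=(0.8, 1.2))
```

A dead-band of this kind avoids changing dt on every step, which can matter when a change in step size forces other quantities to be recomputed.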

Bug Fixes

  • Array "chunking" logic now respects "unchunkable" arrays in allocation (53141df)
  • CPU test step controllers now raise dt_too_small errors (19af8ff)
  • Crank-Nicolson and adaptive controllers now 50% more idiot-error free. (b0fafd9)
  • CUDAFactory now updates underscored config variables as intended (1d903f2)
  • Many edits to precision settings and flow throughout system (making it work) (1d903f2)
  • Observables calculation now occurs in sync with state for adaptive-step loops (#130) (19af8ff)
  • pypi version tag trigger now using correct syntax (3b3caa1)
  • some sign confusion and bogus gains corrected in step controllers (19af8ff)

Documentation

  • Batch arrays re-docstringed in keeping with rest of library. (b0fafd9)
  • fix circular import for docs building (64d8933)
  • manual docs added for integrators, memory modules. submodules of systems, outputhandling documented. (1d903f2)
  • step controller comparison added to docs/examples (19af8ff)
  • top-level summaries of odesystems and outputhandling (a9cb4c0)
  • update docstrings in outputhandling and odesystems root directories (87bb133)

Miscellaneous Chores

v0.0.3

04 Sep 09:25
10842c1

0.0.3 (2025-09-04)

Features

  • Parser accepts and translates sympy and user-provided functions (#108) (25f3c5d)
  • Symbolic input parsing added (14577c1)
  • symbolic interface and analytical jacobian generation added (14577c1)

Bug Fixes

  • BaseArrayManager.py once again contains all of its methods, after half of the class was sausage-fingered clean off. (96ba1cb)
  • buggy regex removed from pyproject (bc5606a)
  • ignore generated python files (14577c1)
  • implement four-byte padding to reduce shared memory conflicts (944066e)
  • jacobian product codegen now does a simple dead-code removal sweep at the expression level (5a46559)
  • metric function compilation now deferred until fn requested. (fc42de7)
  • sympy piecewise printing patched in subclass (ca20f84)
  • SystemValues now interpreting sympy symbols correctly (0549bea)
  • SystemValues, BaseODE now have comprehensible reprs (0549bea)

Documentation

  • batchsolving module now has all docstrings in numpydocs-friendly format (8e6e8cf)
  • conf.py path reverted for Sphinx build (5383de6)
  • docs updated and thinned to match structure (9df6978)
  • first-pass narrative docs added (14577c1)
  • get sphinx-build working again and ReadTheDocs themed (6df9a82)
  • insert google verification tag, cross-link repo and docs (0d61cd2)
  • integrators section docstrings brought in line with numpydocs format (8b22790)
  • memory section docstrings brought into numpy format (d495551)
  • output_functions section docstrings brought into numpy format (4e7b2c0)
  • pypi version badge added (8b22790)
  • readme now has code coverage badge (8e6e8cf)
  • systemmodels section docstrings brought into numpy format (9df6978)

Miscellaneous Chores

v0.0.2

18 Aug 20:46
54402ed

0.0.2 (2025-08-18)

Features

  • BaseArrayManager class added to unify approach to allocating/deallocating device arrays through the memory manager (416e363)
  • BatchConfigurator now accepts extra user input types to match usage (a345679)
  • BatchInputArrays and BatchOutputArrays now subclass BaseArrayManager (808695e)
  • Memory Manager extended to queue and process requests from multiple objects (67e66bf)
  • MemoryManager implemented (3387142)
  • Solver interface now set up for real people to use (I think) (2b05c1e)
  • UserArrays now handles delivery from device arrays to end user for inspection (9d93a7a)

Bug Fixes

  • BatchSolverKernel now uses blocksize to calculate dynamic shared memory correctly, and reduces blocksize if it's > limit (5d8263f)
  • bug in previous commit: BatchSolverKernel now uses blocksize to calculate dynamic shared memory correctly, and reduces blocksize if it's > limit (afda7cb)
  • fix circular imports introduced in 636b5e3 (2865160)
  • force odd shared memory size per run to minimise clashes. (fc12c76)
  • forward declarations no longer cause circular imports (e3841ff), closes #73
  • Newly-initialised memory manager no longer breaks in CUDA sim (db99862)
  • output config flags now treated as derived quantities instead of attributes (9d93a7a)
  • pyproject.toml now points at correct license and readme files for building. (f7bd7f0)
  • pyproject.toml now points at correct license file (but for real this time) (93591ed)
  • SolverKernel now solves and summarises accurately. (879da10)
  • SolverKernel tests made CUDA-simulator-friendly (ad702bb)
  • UserArrays now SolveResult, and works with array managers to produce a sensible output (994cd7c)
  • UserArrays.as_numpy now returns copies rather than mappedarrays (01af17d)
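
The "force odd shared memory size per run" fix above (fc12c76) relies on a general property of banked shared memory, sketched here in plain Python. The model assumes 32 banks of 4-byte words, which is typical of current NVIDIA GPUs, and assumes that thread i of a warp reads the first word of run i. The function worst_bank_conflict is a hypothetical helper, not cubie code.

```python
from collections import Counter


def worst_bank_conflict(stride_words, n_threads=32, n_banks=32):
    """Largest number of threads in a warp that hit the same bank.

    Thread `tid` is assumed to access word `tid * stride_words`.
    A result of 1 means the access is conflict-free.
    """
    banks = Counter((tid * stride_words) % n_banks for tid in range(n_threads))
    return max(banks.values())


even_stride = worst_bank_conflict(32)  # every thread lands in bank 0
half_stride = worst_bank_conflict(16)  # threads split between two banks
odd_stride = worst_bank_conflict(33)   # one thread per bank
```

Any odd stride is coprime with 32, so the 32 threads map to 32 distinct banks; padding an even per-run size by one word is enough to reach that case.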

Documentation

  • Docs don't mention CuMC anymore, and we shall never speak of it again (fff2e00)
  • Docs updated to reflect cubie refactor (205e748)
  • Properties which expose lower-level attributes are now docstringed as such (a97d368)

Miscellaneous Chores

v0.0.1

01 Aug 02:24
8494714

0.0.1 (2025-08-01)

Features

  • BatchConfigurator implemented and passing tests. (0bc00f2)
  • BatchConfigurator implemented and passing tests. (f4859e4)
  • BatchConfigurator implemented and passing tests. (edf60fa)
  • first attempt at a batch configurator (4d723ed)
  • Initial dev version release. Not all features documented in changelog; some commit messages poorly named. Expect better changelogs in subsequent releases (1a7691e)

Bug Fixes

  • Add a nozeros toggle to array sizes (1300f3f)
  • ci: Fix release-please label permissions (fa5d77f)
  • complete GPU tests on release tag (f622a03)
  • correct import error in ODEData.py (cdd3144)
  • Corrected ancient typo in SystemValues that made all parameter setting invalid, confirmed test coverage (c05f532)
  • fix circular dependency, improve arraysizes interface (98ecf54)
  • git: close issue #41 (da83993)
  • git: remove local junk from repo #41 (a3cb122)
  • Implemented adapters for array size and allocation classes (7a4da6a)
  • Improve adapters and access to output_sizes objects through higher objects (649e8cf)
  • Output array indices now gated by boolean flags to avoid memory access errors (4f5628c)
  • OutputHandling: switched indexing order in 2d arrays to match intended striding. (86a1d2f)
  • Plumbing now works between lower-level modules (d79cfdb)
  • remove bug introduced in time-saving (ce96d23)
  • remove todo #19 (d3457bd)
  • Remove todos to close issues (f2e7638)
  • Set buffer height methods in output_functions to properties to follow the convention for the rest of the module (3a5c387)
  • SingleIntegratorRun and children now re-validate timing after an update (0baad64)
  • Swat rename bug in summary_metrics testing (cffd94d)

Miscellaneous Chores