
Anderson acceleration for fixed point iteration#217

Open
aidancrilly wants to merge 23 commits into patrick-kidger:main from aidancrilly:ac/AndersonFPI

Conversation

@aidancrilly

Continuation of closed #215 - apologies if this is not the way to re-open.

I have implemented a form of Anderson acceleration for fixed point problems, which should converge faster than standard fixed point iteration. Mentioned in #3

Here is a simple example for cos(x) = x:
[image: convergence of Anderson acceleration vs. standard fixed point iteration for cos(x) = x]
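For context, the general idea can be sketched in a few lines. This is not the PR's implementation - just a minimal NumPy sketch of Anderson acceleration, keeping a short history of iterates and extrapolating via a small least-squares problem on the residuals:

```python
import numpy as np

def anderson(g, x0, m=5, tol=1e-10, max_steps=100):
    """Anderson acceleration for the fixed point problem x = g(x).

    Keeps the last `m + 1` iterates and extrapolates through a
    least-squares fit on the residuals f_i = g(x_i) - x_i.
    """
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    xs, gs = [], []
    for k in range(max_steps):
        gx = g(x)
        f = gx - x  # fixed-point residual
        if np.linalg.norm(f) < tol:
            return x, k
        xs.append(x)
        gs.append(gx)
        xs, gs = xs[-(m + 1):], gs[-(m + 1):]
        if len(xs) == 1:
            x = gx  # plain fixed-point step on the first iterate
            continue
        F = np.stack([gi - xi for gi, xi in zip(gs, xs)], axis=1)
        G = np.stack(gs, axis=1)
        dF = F[:, 1:] - F[:, :-1]  # differences of residuals
        dG = G[:, 1:] - G[:, :-1]  # differences of g(x_i)
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = gx - dG @ gamma
    return x, max_steps
```

On cos(x) = x this converges in well under 20 steps, whereas plain iteration (contraction rate |sin(x*)| ≈ 0.67) needs over 50 for the same tolerance.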

A few queries:

  • I had to run the tests at very low tolerance for the finite-difference-JVP-based tests to pass on the new solver. Some separate investigation suggests this comes from the small epsilon used and the potentially non-smooth behaviour of the solver when larger tolerances are allowed. Happy to investigate further if this is of concern.
  • The default fixed point iteration solver appears to work for complex numbers but I doubt the Anderson acceleration would (not something I know about). I assume it is fine to mark it as not compatible with complex and skip these tests as I have done?
  • I needed a helper routine (_batched_tree_zeros_like) very similar to one in LBFGS, but with the shape reversed. I don't know if this is confusing, or whether more generic helpers would be of use.

aidancrilly and others added 21 commits February 7, 2026 09:31
* first pass at refactoring cauchy_point function, so far untested

* add a pretty drawing

* minor tweaks

* refactored cauchy point finding function

* add clarifying comment

* adding test cases

* bugfix: pick correct next intercept in the presence of infinite values

* limit step length to a full gradient step

* add expected results to cauchy point test cases

* add clarifying comment: Hessian operator is assumed to be positive definite.

* add explanations for test cases

---------

Co-authored-by: Johanna Haffner <johanna.haffner@bsse.ethz.ch>
* Drop support for Python 3.10

* Update readme.md and index.md

* CI: tests on 3.11 and 3.13
* Implement pytest-benchmark based setup for systematic performance evaluation of Optimistix' solvers.

* version bump for sif2jax requirements

* add semi-recent matplotlib version to specify a minimum

* no more monkeypatching

* set EQX_ON_ERROR with os.environ

* give a reason for skipping compilation tests

* Add L-BFGS solvers to benchmark suite

* clarify what --benchmark-autosave will do.

* remove strict dtype promotion rules - benchmarks are not tests, so we don't need them here. We would otherwise have to use context management for any comparison to Optax minimisers.

* state purpose of --scipy flag more clearly.

* improve contribution guidelines, inline decorator, specify pyright errata

* pyproject.toml from main

* add sif2jax

* move benchmark dependencies to tests group

* add benchmark-skip option

* add example to contributing guidelines, document OrderedDict workaround, adapt to sif2jax usage of properties.

---------

Co-authored-by: Johanna Haffner <johanna.haffner@bsse.ethz.ch>
This has been a subtle long-standing bug! It mainly manifested as BestSoFar seeming to be squiffy.

Fixes patrick-kidger#33.
Remarkably, this bug seems to have existed since Optimistix day 1: if a solver returns a non-successful value halfway through the solve then it does not cause termination. Instead we keep running and then check once the solve has completed.
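The failure mode can be illustrated with a generic pseudo-solver (not Optimistix's actual loop): the fix is to inspect the step's result inside the loop rather than only once all steps have run.

```python
def solve(step, state, terminate, max_steps=100):
    """Generic iterative solve: `step` returns (new_state, result)."""
    result = "successful"
    for _ in range(max_steps):
        state, result = step(state)
        if result != "successful":  # the fix: check mid-solve, instead of
            break                   # only after the loop has completed
        if terminate(state):
            break
    return state, result
```

Without the mid-loop check, a non-successful result halfway through would be silently overwritten by later steps and only noticed (or not) at the end.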
@aidancrilly
Author

Just a note that I now use this on a problem with ~1000 parameters, which required aggressive damping (damp = 0.95) and 100s of iterations with standard fixed point iteration. With Anderson acceleration, this problem runs undamped and converges in 10s of iterations!

Owner

@patrick-kidger patrick-kidger left a comment


Okay, this looks really clean to me! I have just some minor nits.

Can you also add this to the documentation?

# First iterate must be treated differently
def _first_iterate():
    new_y = jtu.tree_map(
        lambda y_old, y_next: self.damp * y_old + (1 - self.damp) * y_next,
Owner


Nit: if isinstance(self.damp, (int, float)) and self.damp == 0 – which is the fairly common default, and is something known at compile time – then we can skip this operation and probably save some effort.

This is the kind of thing that XLA will typically not compile out because in principle 0 * nan = nan, so it can't apply the 'mathematical' optimization that 0 * x = 0.
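The suggested shortcut can be seen in a plain-Python stand-in for the traced code (a sketch, not the PR's exact change). The final line shows why the compiler cannot simply elide the multiply:

```python
import math

def maybe_damped(y_old, y_next, damp):
    # When `damp` is a Python scalar equal to 0 (i.e. known at
    # trace/compile time), skip the blend entirely. XLA generally cannot
    # fold `0 * x` to `0`, because 0 * nan = nan (and 0 * inf = nan), so
    # the multiply would otherwise survive compilation.
    if isinstance(damp, (int, float)) and damp == 0:
        return y_next
    return damp * y_old + (1 - damp) * y_next

# IEEE 754: multiplying by zero does not absorb nan.
print(0.0 * math.nan)  # nan
```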

Author


Ah interesting! I did not know it might not be compiled out; I've made the change here (as well as in the damped fixed point iteration scheme).

@aidancrilly
Author

@patrick-kidger thanks - I believe all nits are resolved and the docs updated.


4 participants