
DoE objective evaluation in pytorch#525

Merged
jduerholt merged 28 commits into main from doe_pytorch_based_model_evaluation
Apr 7, 2025
Conversation

Osburg (Collaborator) commented Feb 16, 2025

This PR adds the possibility to do the model matrix evaluation in pytorch as well, using autograd. This has several advantages:

  • With minor changes the formulaic dependency can be dropped (I have not removed it everywhere yet, but will do so if requested).
  • The full jacobian evaluation can be done with a single gradient calculation using autograd. This speeds up the function/gradient evaluations by a lot: in the small example below, the time to find an optimal design drops from 3.2s to 0.3s on my machine.
# Imports assumed from the BoFire API modules; the exact import path of
# DOptimalityCriterion may differ between versions.
import bofire.strategies.api as strategies
from bofire.data_models.constraints.api import LinearEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.data_models.strategies.api import DOptimalityCriterion, DoEStrategy

domain = Domain(
    inputs=[
        ContinuousInput(key="x1", bounds=(0, 1)),
        ContinuousInput(key="x2", bounds=(0, 1)),
        ContinuousInput(key="x3", bounds=(0, 1)),
        ContinuousInput(key="x4", bounds=(0, 1)),
        ContinuousInput(key="x5", bounds=(0, 1)),
    ],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        LinearEqualityConstraint(features=[f"x{i}" for i in range(1, 6)], rhs=1, coefficients=[1 for i in range(5)]),
    ]
)

data_model = DoEStrategy(
    domain=domain,
    criterion=DOptimalityCriterion(formula="fully-quadratic"),
    ipopt_options={"disp": 3, "timing_statistics": "yes", "linear_solver":"ma57", "hsllib":"libcoinhsl.dylib"},
)
strategy = strategies.map(data_model=data_model)
candidates = strategy.ask(candidate_count=50)
  • These changes facilitate the use of Hessian information during optimization via torch.autograd.functional.hessian(). Since nonlinear constraints can also be defined via pytorch functions, all Hessians should by now be convenient to calculate.

  • Tests are missing; I first wanted to wait for your opinions @jduerholt, @dlinzner-bcs.

  • Also, I kept the option to do the objective evaluation the old way (but it should be removed eventually).

jduerholt (Contributor) commented

Hi @Osburg,

this looks really great. I love it. For me it would be fine to completely drop the formulaic support. It is also a really clever way of making the formulas torch-executable. One question: by explicitly setting the locals, is it safe against injections? Have a look also at this discussion (meta-pytorch/botorch#1279). But nevertheless, I would definitely go for the new route.

Btw, this would also make point two in this discussion here: #514 (comment) obsolete ;)

Two other comments:

  • Currently we are evaluating the objective three times: once for f, once for f', and once for f''. This is super inefficient, and we could get more speed here. In scipy minimize (which shares the same interface with minimize_ipopt), it is possible to set jac=True; in this case it assumes that fun returns both the function value and its gradient (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html). Botorch, for example, exploits this behavior. Based on some experimentation with botorch and ipopt (IPOPT via cyipopt meta-pytorch/botorch#2368), I think the same is possible for minimize_ipopt and could give us further improvements. If it is not possible (and I especially do not know if it is supported for Hessians), we could find out in which order the calls are executed (I assume f, f', then f''): the method which calculates the objective could compute all three parts directly, return only f, and write f' and f'' to class attributes that are returned when the jacobian and hessian are requested. This means we would cache the information. To keep it clean, we should then also put this in its own class.
  • We should use the fact that ipopt and scipy.minimize share the same interface and automatically switch to scipy.minimize if ipopt is not available. In this case we should also raise a warning telling the user that ipopt is the preferred optimizer. This makes installation easier for users and prevents things like this: Fix that BoFire is not working without cyipopt #522.
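For reference, the jac=True contract of scipy's minimize expects fun to return the pair (value, gradient); a minimal sketch with a hand-written quadratic (whether minimize_ipopt honors the same contract is exactly the open question above):

```python
def fun_and_grad(x):
    """Objective that returns (value, gradient) in one call, as
    scipy.optimize.minimize expects when jac=True.
    Toy example: f(x) = sum(x_i^2), so f'(x) = 2x."""
    value = sum(xi * xi for xi in x)
    grad = [2.0 * xi for xi in x]
    return value, grad

# With scipy installed this would be used as:
# from scipy.optimize import minimize
# res = minimize(fun_and_grad, x0=[1.0, -2.0], jac=True)
```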

Osburg (Collaborator, Author) commented Feb 17, 2025

Hey @jduerholt,

cool, thx for the feedback :)

  • I do not know much about the security stuff tbh, but since the user can pass any string as a formula, I guess it is probably not safe...
  • Ok, will remove everything related to formulaic.
  • Regarding multiple evaluations: currently we only evaluate twice, since we do not use Hessians, but yes, in principle it is up to three times. I know that at least ipopt supports evaluating all three functions at the same time using the C++ interface, but I guess this cannot be accessed via cyipopt. From a quick test and the cyipopt reference (https://cyipopt.readthedocs.io/en/stable/_modules/cyipopt/scipy_interface.html#minimize_ipopt) it looks like setting jac=True is not an option either. But implementing the caching ourselves sounds good... I am just not sure if it will actually lead to a speedup, because I think during the line search only the function (but no derivatives) is evaluated (possibly many times). In that case we would get an overhead from evaluating the derivatives in every cost function call by default, right? Maybe we just have to do some experiments to see which parts dominate the evaluation times... Regarding the order of function/derivative evaluation: we could solve this by having three identical implementations for f, f', and f'' that all evaluate and cache the function value and the derivatives, and only differ in which of the values they return, right?
  • Falling back to scipy.minimize in case cyipopt is not installed sounds good. Will also address this in this PR.
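The caching scheme discussed above could look roughly like this (a sketch with a toy objective f(x) = sum(x_i^2), not the PR's code; the real objective would be the torch-evaluated criterion):

```python
class CachedObjective:
    """Compute value and gradient once per point and serve them from a cache,
    so the optimizer's separate fun/jac callbacks do not trigger repeated
    evaluations of the same point."""

    def __init__(self):
        self._x = None
        self._f = None
        self._grad = None

    def _evaluate(self, x):
        x = tuple(x)
        if x != self._x:  # recompute only when the point changed
            self._x = x
            self._f = sum(xi * xi for xi in x)
            self._grad = [2.0 * xi for xi in x]

    def fun(self, x):
        self._evaluate(x)
        return self._f

    def jac(self, x):
        self._evaluate(x)
        return self._grad
```

A hessian callback would follow the same pattern; as noted in the discussion, whether this pays off depends on how often the line search calls fun alone.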

Cheers
Aaron :)

jduerholt (Contributor) commented

But then let us keep the caching stuff out of this PR, and include the fallback to scipy.minimize, as this makes the whole installation and maintenance process of the package easier ;)

Osburg (Collaborator, Author) commented Feb 17, 2025

Again regarding the removal of formulaic: we will need some basic parser from the user input to a string that we use for model evaluation. Implementing a parser for formulas in Wilkinson notation as general as the one from formulaic is not worth the effort, I guess (then we could just stick to formulaic and have less pain :D).
If we take the naive way and just split terms wherever a "+" occurs, this will lead to errors when someone wants to add a "non-standard" term like "log(x1+x2)". Also, not being able to use abbreviations for higher-order models, such as "x1*x2*x3*x4" instead of "1 + x1 + x2 + x3 + x4 + x1:x2 + x1:x3 + x2:x3 + x1:x4 + x2:x4 + x3:x4 + x1:x2:x3 + x1:x2:x4 + x1:x3:x4 + x2:x3:x4 + x1:x2:x3:x4", might be something that annoys users, right?
We could of course define our own system, but it would probably not fix the short-notation problem unless we are willing to write a lot of parsing code.
Therefore I am now reconsidering whether it actually is a smart idea to remove the formulaic dependency (without adding a different one like patsy).

If we stick to removing formulaic, I would suggest expecting the user to define the model term by term, i.e. they would have to write out the long expression above. The only special rule I would implement is that a constant term "1" is added unless a term "- 1" occurs in the formula.
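For reference, the * shorthand in Wilkinson notation expands to all interactions of every order; a naive stdlib expansion illustrating what the shorthand unfolds to (illustration only, not a proposed parser):

```python
from itertools import combinations

def expand_star(factors):
    """Expand Wilkinson 'x1*x2*...*xn' into '1' plus every interaction term
    of every order, e.g. x1*x2 -> '1 + x1 + x2 + x1:x2'."""
    terms = ["1"]
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))
    return " + ".join(terms)
```

For four factors this produces the 16-term expansion quoted above, which is why writing it out by hand gets tedious quickly.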

What do you think @jduerholt?

Best
Aaron :)

jduerholt (Contributor) commented

@Osburg: from my understanding, the main advantage of using pytorch for everything is much faster solving of the problems, right? And getting rid of formulaic is a nice extra, right? Is there a possibility of having both formulaic and the much faster designs?

cc @dlinzner-bcs @bertiqwerty: what is your opinion? Which more complex terms need to be implemented? Writing the full formula is fine for me, as I always have to look up this Wilkinson notation ...

Osburg (Collaborator, Author) commented Feb 18, 2025

So far formulaic had two purposes:

  • once in the beginning parse the formula to a list of terms
  • evaluate the model matrix in every cost function or gradient call (this is what slows everything down and does not allow for calculating the gradients fully in pytorch)

The idea of this PR is still to get rid of the second point. What I am not convinced of anymore is that we should also get rid of formulaic for the first point, because this step is done only once in the beginning and allows us not to painfully implement our own formula parser covering all kinds of edge cases.

Edit: So yes! There is a possibility of having both formulaic and the much faster designs: use formulaic for point 1, but not for point 2! :)

dlinzner-bcs (Contributor) commented

> @Osburg: from my understanding, the main advantage of using pytorch for everything is much faster solving of the problems, right? And getting rid of formulaic is a nice extra, right? Is there a possibility of having both formulaic and the much faster designs?
>
> cc @dlinzner-bcs @bertiqwerty: what is your opinion? Which more complex terms need to be implemented? Writing the full formula is fine for me, as I always have to look up this Wilkinson notation ...

I think keeping formulaic for the parsing is the sensible choice. So no implementation of function parsing needed from my side.

dlinzner-bcs (Contributor) commented

Thank you @Osburg very much for this cool addition! Do you think it will help make our tests more stable, as passing the jacobians will make the optimizer's job easier -> fewer constraint violations?

Osburg (Collaborator, Author) commented Feb 18, 2025

Hmm, I doubt that it will improve the designs by a lot, since I think we did not have accuracy problems before. It is just a lot faster. So the tests will run faster, but not "better". I think the reason for the instability is that sometimes we end up in a local minimum different from the global minimum; to improve the stability of the tests I guess we would have to add some more near-optimal designs which are acceptable for passing the tests.

I thought the constraint violation issues were fixed by not throwing an error when a constraint is slightly violated, but only a warning, right? At least in the cases I looked at there has never been a significant constraint violation, only, say, a violation of 1e-4 instead of the tolerated 1e-6 (both of which are practically fulfilled constraints from my understanding :D)

bertiqwerty (Contributor) commented

Thanks Aaron! Looks like a nice speed-up.

  • Formulaic does allow evaluation of arbitrary Python code, right? I think it is a security risk which was always present. Hence, it is the responsibility of the users of BoFire to make sure their applications are safe. If we keep formulaic, we do not change much. I would like to see it replaced by something safer, though.
  • The eval
    def _evaluate_tensor(self, D: Tensor) -> Tensor:
        """Evaluate the objective function on the design matrix as a tensor."""
        var_dict = {var: D[:, i] for i, var in enumerate(self.vars)}
        var_dict["torch"] = torch
        X = eval(str(self.model_terms_string_expression), {}, var_dict)
        return self._criterion(X)

probably also induces a risk because of the self.vars.

  • We should definitely not re-write a parser from scratch.

I am fine merging this variant, since security does not get worse but speed gets better.
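One cheap mitigation, sketched here with invented names, is to evaluate the term strings with an emptied builtins namespace so that only whitelisted variables (plus the math/torch module) are visible; this is not a complete fix, since attribute-chain escapes through the exposed objects remain possible:

```python
import math

def eval_terms(term_exprs, var_dict):
    """Evaluate each model-term expression with no builtins and only the
    whitelisted names visible. Uses math instead of torch so the sketch
    is self-contained; the real code would expose torch instead."""
    namespace = {"__builtins__": {}, "math": math}
    namespace.update(var_dict)
    return [eval(expr, namespace) for expr in term_exprs]
```

With empty builtins, an expression like `open('...')` fails with a NameError instead of touching the filesystem, while `math.log(x1 + x2)` still works.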

Osburg (Collaborator, Author) commented Feb 18, 2025

Ah, another quick thing. Off-topic, but can you @dlinzner-bcs or @bertiqwerty ask Rosona if my mails reach her? :D I sent her an answer some time ago and I am not sure if it landed in spam or she is just busy.

jduerholt (Contributor) commented

>> @Osburg: from my understanding, the main advantage of using pytorch for everything is much faster solving of the problems, right? And getting rid of formulaic is a nice extra, right? Is there a possibility of having both formulaic and the much faster designs?
>> cc @dlinzner-bcs @bertiqwerty: what is your opinion? Which more complex terms need to be implemented? Writing the full formula is fine for me, as I always have to look up this Wilkinson notation ...
>
> I think keeping formulaic for the parsing is the sensible choice. So no implementation of function parsing needed from my side.

I would then also opt for keeping formulaic as the parser but going for pytorch for differentiation. I think this is the best compromise; then we couple it with scipy.minimize as a fallback optimizer if ipopt is not available.

@Osburg Osburg changed the title Draft: DoE objective evaluation in pytorch DoE objective evaluation in pytorch Mar 12, 2025
Osburg (Collaborator, Author) commented Mar 30, 2025

Hi @jduerholt, I addressed the points you mentioned in your review. Also, I dropped accessing ipopt through the scipy interface and instead use the cyipopt.Problem class, which allows a sparse implementation for the constraint jacobian; this speeds up the optimization in large design spaces if constraints are present. Do you think it is ready to be merged?

@Osburg Osburg requested a review from jduerholt March 30, 2025 20:14
jduerholt (Contributor) commented

Hi @Osburg, I am on a business trip; I will have a final look on Wednesday at the latest, hope this is fine!

One question: how does removing the scipy interface relate to our idea to use a default scipy optimizer (LBFGS, SLSQP) when IPOPT is not installed?

Best,

Johannes

Osburg (Collaborator, Author) commented Apr 1, 2025

Sure, no hurry. In the last commit I added the fallback to scipy.minimize() if ipopt is not available. Cheers, Aaron :)
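The fallback can be sketched like this (illustration only with an invented function name, not the PR's actual code):

```python
import importlib.util
import warnings

def pick_optimizer() -> str:
    """Return which optimizer backend to use. IPOPT (via cyipopt) is
    preferred; otherwise warn and fall back to scipy.optimize.minimize,
    which shares the same call interface."""
    if importlib.util.find_spec("cyipopt") is not None:
        return "ipopt"
    warnings.warn(
        "cyipopt is not installed; falling back to scipy.optimize.minimize. "
        "IPOPT is the preferred optimizer for DoE."
    )
    return "scipy"
```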

jduerholt (Contributor) left a review

Looks good to me. Thanks. I only left some minor comments.

We also need to update the docs to say that it is no longer necessary to have ipopt installed. But we can do this in a separate PR.

Osburg (Collaborator, Author) commented Apr 4, 2025

Hey @jduerholt, I think I addressed all the points you mentioned in your review in the last commit. Cheers, Aaron :)

@Osburg Osburg requested a review from jduerholt April 4, 2025 10:49
jduerholt (Contributor) commented

Looks good to me. Can you merge main in? Then we can merge it.

Osburg (Collaborator, Author) commented Apr 4, 2025

An error occurs in test_functional_constraint() because the sparse jacobian makes use of the fact that IntraPointConstraints only operate on single experiments (this is the whole reason why the jacobian is sparse). However, the constraint defined by

  def calc_volume_content(A, B, T, W, W_T):
      A = A.detach() if not isinstance(A, pd.Series) else A
      B = B.detach() if not isinstance(B, pd.Series) else B
      T = T.detach() if not isinstance(T, pd.Series) else T
      W = W.detach() if not isinstance(W, pd.Series) else W
      W_T = W_T.detach() if not isinstance(W_T, pd.Series) else W_T
      volume_solid = (
          A * raw_materials_data["A"][0] / raw_materials_data["A"][1]
          + B * raw_materials_data["B"][0] / raw_materials_data["B"][1]
      )
      volume_total = volume_solid + (1 - calc_solid_content(A, B, T, W, W_T) / 1)
      return volume_solid / volume_total

seems to refer to multiple experiments. Also, it is not fully written as a torch function. To fix that, calc_solid_content should look something like this:

    def calc_solid_content(A, B, T, W, W_T):
        # Ensure same order as in the dictionary containing the material properties
        return torch.stack([A, B, T, W, W_T],0).T @ torch.tensor(df_raw_materials["sc"].values)

@dlinzner-bcs, is this your test? Could you please check whether it is actually implemented the way it should be?

Cheers
Aaron :)

Edit: I changed calc_solid_content() to use torch functions; this removes the inaccurate gradient calculation.
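The sparsity being exploited here can be illustrated as follows: a constraint evaluated per experiment has a block-diagonal jacobian over the flattened design, so only n_experiments * n_inputs entries are nonzero instead of n_experiments * (n_experiments * n_inputs). A sketch with invented names:

```python
def intrapoint_jacobian_structure(n_experiments, n_inputs):
    """Row/column indices of the nonzero jacobian entries for a constraint
    applied per experiment: constraint i depends only on the variables of
    experiment i (the design matrix is flattened row by row)."""
    rows, cols = [], []
    for i in range(n_experiments):
        for j in range(n_inputs):
            rows.append(i)
            cols.append(i * n_inputs + j)
    return rows, cols
```

A constraint coupling several experiments (like the test above appears to do) breaks this assumption, which is exactly why the sparse jacobian produced the error.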

TobyBoyne and others added 9 commits April 6, 2025 10:13
Osburg (Collaborator, Author) commented Apr 6, 2025

@jduerholt I addressed the points you mentioned and fixed the test test_functional_constraints(), which had the problem that the function was not fully implemented with torch functions. This led to worse constraint fulfillment on my machine; I fixed that now. If there are no new points that should be addressed first, I think the PR is ready to be merged. What do you think?
@jduerholt @bertiqwerty, speaking of the functional constraints: do you have any suggestions on how to make constraints defined by python functions serializable?
Cheers, Aaron :)

jduerholt (Contributor) left a review

Looks good to me. Thanks!

jduerholt (Contributor) commented

> @jduerholt @bertiqwerty, speaking of the functional constraints: do you have any suggestions on how to make constraints defined by python functions serializable?

Puuh, this is a good question. I think there is not a good solution to this.

@jduerholt jduerholt merged commit d700413 into main Apr 7, 2025
27 of 30 checks passed