Add register() functions for strategies and surrogates#736

Merged
jduerholt merged 14 commits into main from feature/register
Mar 12, 2026

@jduerholt
Contributor

Motivation

Provide public API to register custom strategy and surrogate mappings instead of requiring users to mutate internal dictionaries directly. For BotorchSurrogate subclasses, registration automatically rebuilds the AnyBotorchSurrogate union and BotorchSurrogates Pydantic model so custom surrogates work in botorch-based strategies out of the box. It addresses the discussion in this PR: #731

@TobyBoyne @bertiqwerty @LukasHebing @KislayaRavi what do you think?

This is a draft that should serve as a basis for discussion; the code was built with Claude Code.

Have you read the Contributing Guidelines on pull requests?

Yes.

Have you updated CHANGELOG.md?

Not yet.

Test Plan

Unit tests.

Provide public API to register custom strategy and surrogate mappings
instead of requiring users to mutate internal dictionaries directly.
For BotorchSurrogate subclasses, registration automatically rebuilds
the AnyBotorchSurrogate union and BotorchSurrogates Pydantic model
so custom surrogates work in botorch-based strategies out of the box.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@TobyBoyne
Collaborator

I think this looks correct to me! A few minor thoughts:

  1. I would quite like a decorator syntax, so that I can write something like the code below. That way, I can easily see that the new surrogate is registered right next to where it is defined.

     @surrogates_api.register(CustomBotorchSurrogateDataModel)
     class CustomBotorchSurrogate(Surrogate):
         def __init__(self, data_model, **kwargs):
             ...

  2. Could we initialize SURROGATE_MAP to be empty, and register all of the BoFire models using this new API? This would mean, for example, that we would no longer need to maintain the default values in _BOTORCH_SURROGATE_TYPES.
  3. I think the data_model_transform should be a separate thing to register. For example, if I want to have a minimal installation of BoFire (without installing botorch), I would not be able to register a custom transform. Maybe that is okay, since I think it is only ever needed in combination with mapping to an actual strategy, but I still think something that takes purely data models as input and output should be separate from the mapper onto real strategies. I don't feel strongly about this, though, so feel free to ignore it.

I think this PR also breaks using AnyBotorchSurrogate in static typing (I'm getting a few new Pylance errors), but it isn't used as a type hint anywhere in the codebase so not a big deal.
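
For illustration, here is a minimal sketch of how a single register() function can support both the decorator form from point 1 and a direct call. It is not the code from this PR: SURROGATE_MAP is a plain stand-in dictionary and the two classes are hypothetical.

```python
from typing import Callable, Optional

# Stand-in for the mapper dictionary (data model class -> functional class).
SURROGATE_MAP: dict = {}


def register(data_model_cls: type, surrogate_cls: Optional[type] = None):
    """Register a mapping; usable as a decorator or as a direct call."""

    def _do_register(target: type) -> type:
        SURROGATE_MAP[data_model_cls] = target
        return target  # return the class unchanged so decoration is transparent

    if surrogate_cls is None:
        # Decorator form: @register(DataModel)
        return _do_register
    # Direct-call form: register(DataModel, Surrogate)
    return _do_register(surrogate_cls)


class CustomDataModel:  # hypothetical data model
    pass


@register(CustomDataModel)
class CustomSurrogate:  # hypothetical functional class
    pass


class OtherDataModel:  # hypothetical data model
    pass


class OtherSurrogate:  # hypothetical functional class
    pass


register(OtherDataModel, OtherSurrogate)
```

Both forms end up writing the same dictionary entry, so the choice between them is purely a matter of where the registration should be visible in the code.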

jduerholt and others added 3 commits February 24, 2026 09:27
…mic Pydantic union rebuilding

Extend the register() pattern to kernels, priors, and engineered features
mappers so custom types can be registered with both the mapper dictionaries
and Pydantic unions. Registration dynamically patches model annotations and
triggers model_rebuild() in the correct cascade order (priors → kernels →
aggregation kernels → surrogates → BotorchSurrogates) so custom types pass
Pydantic validation when used in surrogate model fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
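
The patch-and-rebuild cascade described in the commit message above can be sketched with plain Pydantic v2 models. This is only an illustration under stated assumptions: all class names and register_kernel below are hypothetical stand-ins, not BoFire's actual implementation, and the sketch relies on model_rebuild(force=True) picking up a mutated field annotation.

```python
from typing import Literal, Union

from pydantic import BaseModel, ValidationError


class RBF(BaseModel):  # hypothetical built-in kernel
    type: Literal["RBF"] = "RBF"


class Matern(BaseModel):  # hypothetical built-in kernel
    type: Literal["Matern"] = "Matern"


_KERNEL_TYPES: list = [RBF, Matern]


class GP(BaseModel):  # hypothetical surrogate with a kernel field
    kernel: Union[RBF, Matern]


class Surrogates(BaseModel):  # hypothetical container of surrogates
    surrogates: list[GP]


def register_kernel(kernel_cls: type) -> None:
    """Add kernel_cls to the union, then rebuild dependants inner-first."""
    if kernel_cls not in _KERNEL_TYPES:
        _KERNEL_TYPES.append(kernel_cls)
    # Patch the stored annotation, then regenerate the validators.
    GP.model_fields["kernel"].annotation = Union[tuple(_KERNEL_TYPES)]
    GP.model_rebuild(force=True)          # inner model first
    Surrogates.model_rebuild(force=True)  # then the model that embeds it


class Periodic(BaseModel):  # hypothetical custom kernel
    type: Literal["Periodic"] = "Periodic"


payload = {"surrogates": [{"kernel": {"type": "Periodic"}}]}

try:
    Surrogates.model_validate(payload)
    accepted_before = True
except ValidationError:
    accepted_before = False

register_kernel(Periodic)
parsed = Surrogates.model_validate(payload)
```

The order matters because the container's validator embeds the schema of the inner model at build time; rebuilding the container before the inner model would bake in the old union again.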
…features

Replace the manually maintained KERNEL_MAP, PRIOR_MAP, and AGGREGATE_MAP
dictionaries with @register() decorators on each map function. The dicts
are now initialized empty and populated at import time via the same
register() function that external users call for custom types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jduerholt
Contributor Author

@TobyBoyne: I just pushed this further. I still need to do an in-depth review here, but is this what you were looking for?

@TobyBoyne
Collaborator

Yes that syntax looks very nice, and I like the decorator being included where the kernels/priors are defined.

I was also wondering why we use, for example, AnyBotorchSurrogate instead of just BotorchSurrogate. What is the benefit of the union type over simply type hinting with the parent type? It would avoid needing all of this model rebuilding.

Contributor

Do we also need _rebuild_dependent_strategy similar to rebuild_dependent_models if we want full dynamic Pydantic acceptance of new strategy data-model types?

Contributor Author

Yes, correct. Should be fixed now. My Agent was overlooking this.

Contributor

@bertiqwerty bertiqwerty Mar 3, 2026

> Yes, correct. Should be fixed now. My Agent was overlooking this. (last week)

Haha. Nice try blaming the agent. :D

@KislayaRavi
Contributor

I just had one comment on the code.
Additionally, I think we should also add a Quarto doc tutorial with an example demonstrating the steps to register a kernel or surrogate externally.

@LukasHebing
Contributor

> Yes that syntax looks very nice, and I like the decorator being included where the kernels/priors are defined.
>
> I was also wondering why we use, for example, AnyBotorchSurrogate instead of just BotorchSurrogate. What is the benefit of the union type over simply type hinting with the parent type? It would avoid needing all of this model rebuilding.

I think this is needed for de-serialisation of objects from dicts / json. Having a pydantic class with a field surrogate: AnyBotorchSurrogate makes the de-serialiser go through all classes in AnyBotorchSurrogate, until the fieldnames match.
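
A small sketch of the difference, using hypothetical stand-in models rather than BoFire's real classes: with a union-typed field, Pydantic validates the input against the union members and keeps the one that fits, whereas a field hinted with the parent class validates into the parent and drops the subclass-specific fields (assuming the default behaviour of ignoring extra keys).

```python
from typing import Literal, Union

from pydantic import BaseModel


class Surrogate(BaseModel):  # hypothetical parent data model
    type: str


class GP(Surrogate):  # hypothetical subclass
    type: Literal["GP"] = "GP"
    noise: float = 0.1


class Forest(Surrogate):  # hypothetical subclass
    type: Literal["Forest"] = "Forest"
    n_trees: int = 100


AnySurrogate = Union[GP, Forest]


class WithUnion(BaseModel):
    surrogate: AnySurrogate  # union of the concrete classes


class WithParent(BaseModel):
    surrogate: Surrogate  # parent class only


payload = {"surrogate": {"type": "Forest", "n_trees": 10}}

from_union = WithUnion.model_validate(payload)
from_parent = WithParent.model_validate(payload)

# The union recovers the concrete subclass and its fields.
print(type(from_union.surrogate).__name__, from_union.surrogate.n_trees)
# The parent hint yields a plain Surrogate without n_trees.
print(type(from_parent.surrogate).__name__)
```

This is why a newly registered class has to be added to the union itself: a class that is not a member is never considered during de-serialisation.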

…strategy rebuild

- Extract _patch_field to shared bofire/data_models/_register_utils.py with
  patch_field() and append_to_union_field() helpers, reused across kernels,
  priors, and strategies
- Make AnyContinuousKernel and AnyCategoricalKernel list-backed and dynamic;
  register_kernel auto-detects sub-category from ContinuousKernel /
  CategoricalKernel base class and updates MixedSingleTaskGPSurrogate
- Replace lazy imports with data_models.register_kernel() and
  data_models.register_prior() in mapper register() functions
- Remove meta parameter from strategy register (only ACTUAL_MAP)
- Add register_strategy() to data models that rebuilds ActualStrategy
  union, Step, and StepwiseStrategy so custom strategies pass validation
- Add introspection tests that verify _rebuild_dependent_models covers
  all AnyPrior/AnyPriorConstraint/AnyKernel fields in the codebase

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor Author

@jduerholt jduerholt left a comment

Some comments from my side.

jduerholt and others added 2 commits February 24, 2026 16:05
ConditionalEmbeddingKernel.base_kernel and WedgeKernel.base_kernel had
hardcoded inline Union types that were not patched by
_rebuild_dependent_models, so newly registered kernel types were not
accepted. Both parent and child need explicit patching because Pydantic
gives each class its own copy of model_fields.

Also enhances the introspection test to detect inline kernel Union
fields on Kernel container classes, preventing this class of regression.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolves conflict in bofire/surrogates/engineered_features.py by
combining the register() decorator pattern from feature/register with
main's refactored private functions (_weighted_features,
_map_weighted_feature, _map_molecular_weighted_feature) and partial()
bindings for sum/mean variants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jduerholt
Contributor Author

jduerholt commented Feb 24, 2026

OK, I did more work here, and this is now almost final. @bertiqwerty, can you have a look? What do you think? I will also add a tutorial etc., but it would be nice to get your assessment, and of course also the assessment of the others!

@bertiqwerty
Contributor

> OK, I did more work here, and this is now almost final. @bertiqwerty, can you have a look? What do you think? I will also add a tutorial etc., but it would be nice to get your assessment, and of course also the assessment of the others!

Sure. I will look into it.

The register() infrastructure uses Union[tuple(list)] to build unions
at runtime.  This is valid Python but not statically analysable, so
tell ty to ignore the diagnostic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
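
As a standalone illustration of the Union[tuple(list)] idiom mentioned in the commit message above, the following sketch rebuilds a union from a mutable list at runtime. The prior classes and register_prior are hypothetical stand-ins, not the code from this PR.

```python
from typing import Union, get_args


class NormalPrior:  # hypothetical built-in type
    pass


class GammaPrior:  # hypothetical built-in type
    pass


_PRIOR_TYPES: list = [NormalPrior, GammaPrior]

# Subscripting Union with a tuple is equivalent to Union[NormalPrior, GammaPrior],
# but a static type checker cannot see through the runtime tuple.
AnyPrior = Union[tuple(_PRIOR_TYPES)]


def register_prior(prior_cls: type) -> None:
    """Append prior_cls and rebind the module-level union."""
    global AnyPrior
    if prior_cls not in _PRIOR_TYPES:
        _PRIOR_TYPES.append(prior_cls)
    AnyPrior = Union[tuple(_PRIOR_TYPES)]


class CustomPrior:  # hypothetical user-defined type
    pass


old_union = AnyPrior
register_prior(CustomPrior)

print(get_args(old_union))  # the previously created union object is unchanged
print(get_args(AnyPrior))   # the rebound name now includes CustomPrior
```

Because the old union object is not mutated, any annotation that captured it earlier still refers to the old members, which is why the registration code also has to patch field annotations and rebuild models.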
@jduerholt jduerholt marked this pull request as ready for review February 26, 2026 21:21
jduerholt and others added 3 commits February 27, 2026 09:51
Documents how to use the register() API for strategies, surrogates,
kernels, and priors with practical code examples covering both
decorator and direct-call forms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The `model_fields` attribute is a Pydantic BaseModel class attribute that
ty cannot resolve on the generic `type` parameter. Add inline ignore
comments on all 5 access sites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ruff format splits long lines, moving the ignore comment away from the
`.model_fields` access that triggers the error. Place the comment on
the first line of each expression so ty sees it on the correct line.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@KislayaRavi
Contributor

I ran into an issue while running a SoboStrategy campaign with my custom-defined surrogate. The following code snippet reproduces it.

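from typing import Literal

import bofire.data_models.surrogates.botorch_surrogates as bs
import bofire.surrogates.api as surrogates
from bofire.data_models.surrogates.surrogate import Surrogate as SurrogateDataModel
from bofire.data_models.surrogates.trainable import TrainableSurrogate as TrainableDataModel

# Functional base classes; these module paths are assumed, they were not part of the original snippet.
from bofire.surrogates.surrogate import Surrogate
from bofire.surrogates.trainable import TrainableSurrogate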

class MySurrogateDataModel(SurrogateDataModel, TrainableDataModel):
    type: Literal["MySurrogate"] = "MySurrogate"

class MySurrogate(Surrogate, TrainableSurrogate):
    def __init__(self, data_model: MySurrogateDataModel, **kwargs):
        super().__init__(data_model=data_model, **kwargs)

surrogates.register(MySurrogateDataModel, MySurrogate)

print(bs.AnyBotorchSurrogate)  # MySurrogate is NOT in the union

Workaround
As a temporary workaround, I explicitly call register_botorch_surrogate:

import bofire.surrogates.api as surrogates
from bofire.data_models.surrogates.botorch_surrogates import register_botorch_surrogate

surrogates.register(MySurrogateDataModel, MySurrogate)
register_botorch_surrogate(MySurrogateDataModel)

But we should fix it.

Probable culprit

# bofire/surrogates/mapper.py
if issubclass(data_model_cls, data_models.BotorchSurrogate):
    register_botorch_surrogate(data_model_cls)  # ← never reached for plain SurrogateDataModel subclasses

@jduerholt
Contributor Author

Hmm, but is it actually a bug? The SoboStrategy only works with botorch-based surrogates, so your data model should inherit from BotorchSurrogate, shouldn't it?

@KislayaRavi
Contributor

One can use non-Botorch surrogate models with SoboStrategy, provided the model attribute of the custom surrogate has a posterior function.
The point of being able to define and register a custom surrogate was to let users include non-standard surrogates. That's why the line if issubclass(data_model_cls, data_models.BotorchSurrogate): may not be necessary.
I was able to run a SoboStrategy with my custom surrogate using the register() feature added in this PR. I defined my own posterior function. The register() feature is working. The only issue was having to use the register() feature twice, once for the custom data model and once for the corresponding custom surrogate. I feel that this is redundant.

@jduerholt
Contributor Author

Hmm, but why not just have the data model inherit from BotorchSurrogate? There is no problem with that, is there?

@bertiqwerty bertiqwerty self-requested a review March 3, 2026 12:08
Contributor

@bertiqwerty bertiqwerty left a comment

Thanks, Johannes.

from typing import Union
import typing
from collections.abc import Sequence
from typing import List, Type, Union
Contributor

Use list instead of List. Don't import List.

AnyEngineeredFeature = Union[tuple(_ENGINEERED_FEATURE_TYPES)]


def register_engineered_feature(data_model_cls: Type[EngineeredFeature]) -> None:
Contributor

Maybe don't clutter the api.py with implementations? Create a separate file and import the function here?

_ENGINEERED_FEATURE_TYPES.append(data_model_cls)
AnyEngineeredFeature = Union[tuple(_ENGINEERED_FEATURE_TYPES)]

# Lazy import to avoid circular dependencies
Contributor

See above. If you had a separate file for the implementation, would the import still be circular?

# Patch the Sequence[Union[...]] annotation on EngineeredFeatures.features
old = EngineeredFeatures.model_fields["features"].annotation
inner_args = typing.get_args(typing.get_args(old)[0])
if data_model_cls not in inner_args:
Contributor

This piece of code looks fragile, but currently I don't have a better idea. Hmm... at least if this breaks at some point, only the registration functionality is affected, not the existing modules.

@@ -1,5 +1,5 @@
from functools import partial
from typing import Union
from typing import List, Type, Union
Contributor

see above

AnyPriorConstraint = Union[tuple(_PRIOR_CONSTRAINT_TYPES)]


def _rebuild_dependent_models() -> None:
Contributor

I prefer a separate file to keep the api.py as clean as possible.

@@ -1,4 +1,4 @@
from typing import Union
from typing import List, Type, Union
Contributor

see above

ActualStrategy = Union[tuple(_ACTUAL_STRATEGY_TYPES)]


def register_strategy(data_model_cls: Type[Strategy]) -> None:
Contributor

Again, don't clutter the api.py.

Can be used as a decorator or as a direct function call::

# Decorator form
@register(MyKernelDataModel)
Contributor

@bertiqwerty bertiqwerty Mar 3, 2026

Doesn't adding the decorator to existing functions trigger unnecessary rebuild-model calls at import time? Our import time is already pretty long. Further, I find decorators less transparent. If someone wants to register something, they could simply use the direct-call form. For the existing stuff, I liked the old and stupid dict. It is also somewhat inconsistent with the data models, where we kept the lists of types and extended them. I would prefer to keep the dicts and extend those.

Contributor Author

@bertiqwerty: But the decorator in general would be fine? So, for registering stuff outside of the library? Or do you want to remove the decorator support in general?

Contributor

@bertiqwerty bertiqwerty Mar 3, 2026

You mean as an option for people adding new code outside of BoFire? There is no way to remove it, is there? The register function can always be used as a decorator or with a simple function call, right?

Contributor Author

True, so you would like to remove the decorator use from the inner workings of BoFire, correct?

Contributor

Yes. And I want the dict back.

Contributor Author

Ok, I will add the dict back in the internal mappings.

@KislayaRavi
Contributor

Thanks Johannes, it works.
I was inheriting from SurrogateDataModel and not TrainableBotorchSurrogate.

@jduerholt
Contributor Author

Hi @bertiqwerty,

I implemented the changes that you suggested. Can you have a look again?

Best,

Johannes

@jduerholt jduerholt requested a review from bertiqwerty March 9, 2026 13:33
@bertiqwerty
Contributor

> Hi @bertiqwerty,
>
> I implemented the changes that you suggested. Can you have a look again?
>
> Best,
>
> Johannes

My reviews are usually YOLO. Just kidding. Thanks for the changes. I will have a look.

Contributor

@bertiqwerty bertiqwerty left a comment

Thanks, Johannes.

@LukasHebing
Contributor

Thanks a lot @jduerholt,
This functionality was really missing in bofire. I think this is really valuable for new users!

@jduerholt
Contributor Author

Thanks @LukasHebing: I will merge it tomorrow, I think.

jduerholt and others added 2 commits March 12, 2026 19:29
Since _register.py modules use lazy imports inside function bodies,
they can be safely imported at the top of api.py files. Also import
register_strategy directly in strategies/api.py instead of re-exporting
through actual_strategy_type.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace manual Sequence[Union[...]] patching with the shared
append_to_union_field utility from _register_utils, consistent
with the other _register.py modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
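
The commit above refers to a shared append_to_union_field utility whose implementation is not shown in this conversation. As a rough, hypothetical illustration of the underlying idea, the helper below (named append_to_union here, an assumption) adds a type to a typing.Union that may be wrapped in a single-parameter generic such as Sequence[...]. It does not handle the X | Y syntax or multi-parameter generics.

```python
import typing
from collections.abc import Sequence
from typing import Union


def append_to_union(annotation, new_type):
    """Return annotation with new_type added to its (possibly wrapped) Union."""
    origin = typing.get_origin(annotation)
    args = typing.get_args(annotation)
    if origin is Union:
        # Already a union: extend it unless the type is present.
        return annotation if new_type in args else Union[args + (new_type,)]
    if origin is not None and len(args) == 1:
        # Single-parameter wrapper, e.g. Sequence[Union[...]]: patch the inner type.
        return origin[append_to_union(args[0], new_type)]
    if origin is None:
        # Bare class: promote it to a two-member union.
        return annotation if annotation is new_type else Union[annotation, new_type]
    raise TypeError(f"Unsupported annotation: {annotation!r}")


old = Sequence[Union[int, str]]
patched = append_to_union(old, float)
inner = typing.get_args(patched)[0]
print(typing.get_args(inner))
```

Returning a new annotation instead of mutating the old one keeps the helper free of side effects; the caller decides which model field to assign it to and when to rebuild.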
@jduerholt jduerholt merged commit a61514b into main Mar 12, 2026
12 checks passed
@jduerholt jduerholt deleted the feature/register branch March 12, 2026 22:07