Skip to content

Add flepimop2.axis module for realized axes#208

Open
TimothyWillard wants to merge 1 commit intomainfrom
axis-module
Open

Add flepimop2.axis module for realized axes#208
TimothyWillard wants to merge 1 commit intomainfrom
axis-module

Conversation

@TimothyWillard
Copy link
Copy Markdown
Collaborator

Added the axis module for realizing and manipulating axes from configuration. Includes:

  • ResolvedShape as a container for metadata about the shape of an object based on its axes.
  • Axis as a data class for a realized axis that objects which need axes can use to do their calculations.
  • AxisCollection as a mapping of Axis objects that simplify access when multiple axes are needed.
  • Dropped deprecation warning from using the axes attribute of ConfigurationModel.

Follow up to #147, #164.

Copy link
Copy Markdown
Contributor

@jc-macdonald jc-macdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One docstring nit below.

Comment thread src/flepimop2/axis.py
... domain=(1e-9, 1e-3),
... spacing="log",
... ).points()
... )
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Notes block claims points are computed via np.linspace(…, endpoint=False) / np.geomspace(…, endpoint=False), but the implementation computes bin centroids from size + 1 edges — arithmetic midpoints for linear (0.5 * (edges[:-1] + edges[1:])), geometric means for log (np.sqrt(edges[:-1] * edges[1:])).

These produce different values. E.g. for domain=(0, 8), size=4:

  • linspace(0, 8, 4, endpoint=False)(0.0, 2.0, 4.0, 6.0)
  • Bin midpoints (actual) → (1.0, 3.0, 5.0, 7.0)

The implementation is the better choice (cell-centered points are standard for FVM-style discretization and align naturally with bins()), but the docstring should describe what the code does. Something like:

Points are the centroids of the bins returned by bins(): arithmetic midpoints for linear spacing, geometric means for log spacing.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch, I had originally used linspace/geomspace but realized the behavior was not quite right and forgot to update the docstring accordingly. Fixed in https://github.com/ACCIDDA/flepimop2/compare/6bfe49d13e7d0f0a5bfe683775b5fb382226f867..64e0a2b2f99b46aff9ff73c953d236edd1ddaba8

Copy link
Copy Markdown
Member

@pearsonca pearsonca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may be fine for now, but copying my slack thoughts here:

  • we have top level config specification axes - that basically says here are other dimensions that models must somehow implement (Categorical and Continuous), and that parameters may vary over. These are pydantic types for easy deserialization
  • my understanding of this PR is that we are doing the first pass at expressing how systems & parameters might meet those specifications - so the high level axes would not use these realizations, but systems, parameters, etc would as part of their specification
  • these don't seem to be pydantic types; what's the plan for how things that need realizations create them? If I'm a system or parameter module provider, how do I have my config approach "just work"?
  • In terms of design: my mind jumps to thinking that the concrete axes for implementors would be subclasses of their overarching types - not insisting on that design, but am curious if y'all though thru that angle and hit some dead end? As in, might have class ManualBinAxis(ContinuousAxis) vs LinearBinAxis, LogBinAxis, DensityBinned, etc. (Categorical is probably less interesting, but even there could have all categories distinct, some grouped, etc).
  • Mostly thinking about that from two angles - the concrete axes also need to do abstract axes tasks (e.g. report domain) along with ones that specification-axes don't (e. g. how is that domain divided). Second, if they work that way, then checking if concrete axes match specification axes becomes easy (e.g. is domain covered)
  • I am somewhat ambivalent about not modularizing axis representation. This approach avoids modularizing, which differs from how we approach other features - but that's probably the only downside I see. I think that's fine as modularization seems complicated (like what's the API?), and i'm not sure what other implementations might be even vaguely interesting (so limited value to make it modular). Also seems we can do it later if need be.

As I said, fine for now, but might want to directionally shift in response to those concerns

Comment thread src/flepimop2/axis.py
Comment on lines +90 to +103
def _continuous_domain(self) -> tuple[float, float]:
"""
Validate Continuous Axis Metadata.

Ensure this axis can support continuous point and bin helpers.

Returns:
The continuous axis domain as `(lower, upper)`.

Raises:
TypeError: If this is not a continuous axis.
ValueError: If the continuous axis is missing domain or spacing
metadata.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like this should not raise a value error - rather, an error should occur at construction time?

Added the `axis` module for realizing and manipulating axes from
configuration. Includes:

- `ResolvedShape` as a container for metadata about the shape of an
  object based on its axes.
- `Axis` as a data class for a realized axis that objects which need
  axes can use to do their calculations.
- `AxisCollection` as a mapping of `Axis` objects that simplify access
  when multiple axes are needed.
- Dropped deprecation warning from using the `axes` attribute of
  `ConfigurationModel`.

Follow up to #147, #164.
@TimothyWillard
Copy link
Copy Markdown
Collaborator Author

I think this may be fine for now, but copying my slack thoughts here:

  • we have top level config specification axes - that basically says here are other dimensions that models must somehow implement (Categorical and Continuous), and that parameters may vary over. These are pydantic types for easy deserialization
  • my understanding of this PR is that we are doing the first pass at expressing how systems & parameters might meet those specifications - so the high level axes would not use these realizations, but systems, parameters, etc would as part of their specification
  • these don't seem to be pydantic types; what's the plan for how things that need realizations create them? If I'm a system or parameter module provider, how do I have my config approach "just work"?
  • In terms of design: my mind jumps to thinking that the concrete axes for implementors would be subclasses of their overarching types - not insisting on that design, but am curious if y'all though thru that angle and hit some dead end? As in, might have class ManualBinAxis(ContinuousAxis) vs LinearBinAxis, LogBinAxis, DensityBinned, etc. (Categorical is probably less interesting, but even there could have all categories distinct, some grouped, etc).
  • Mostly thinking about that from two angles - the concrete axes also need to do abstract axes tasks (e.g. report domain) along with ones that specification-axes don't (e. g. how is that domain divided). Second, if they work that way, then checking if concrete axes match specification axes becomes easy (e.g. is domain covered)
  • I am somewhat ambivalent about not modularizing axis representation. This approach avoids modularizing, which differs from how we approach other features - but that's probably the only downside I see. I think that's fine as modularization seems complicated (like what's the API?), and i'm not sure what other implementations might be even vaguely interesting (so limited value to make it modular). Also seems we can do it later if need be.

As I said, fine for now, but might want to directionally shift in response to those concerns

So I think there are two lines of thinking in this comment:

  1. You are correct to point out that these are not pydantic objects. We have so far made the decision to not ask module developers to use pydantic in their own development, see Deprecate non-pydantic approach to implementing modules #156. I think this configuration vs representation element could be greatly simplified if we did. In fact I suppose since external module providers have to import flepimop2 it's not as burdensome to ask them to consume a pydantic object vs implement one so perhaps there's a good argument for consolidating this into the representation itself. But in terms of how to convert the configuration object to one of these objects right now there are the from_model/from_config class methods.
  2. Had not thought that through very throughly. My understanding coming out of Add axes top level key to configuration format #164 was that a large number of axis types was not ideal and we wanted to be limited to continuous vs non-continuous. If we want to support multiple axis types I think my recommendation would be to move towards modularization, perhaps can then think through how to approach continuous vs non-continuous (if that's the right framing) axes as a part of that. But I agree, the API for that is very open ended and would have to figure out how to indicate what modules implement which properties in a way that is not tedious for consumers to deal with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants