@dlindhol dlindhol commented Jan 9, 2026

The primary goal here is to make projection operations more robust. I added a validate method that encapsulates all the reasons a projection might be incompatible with a dataset's model. One breaking change: a projection that names an undefined variable is no longer allowed. This is consistent with relational algebra and SQL. I also cleaned up and clarified a few things.
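For readers skimming the thread, here is a minimal sketch of the validation idea, written in Python purely for illustration. The `Projection` class, the shape of `validate`, and the error strings are assumptions, not this PR's actual code. Only the undefined-variable check comes from the description above; the duplicate check is an invented example of a second reason.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Projection:
    """Hypothetical stand-in for a projection: the variable ids to keep."""
    ids: tuple

    def validate(self, model_ids):
        """Collect every reason this projection does not fit the model.

        Returns an empty list when the projection is compatible.
        """
        errors = []
        seen = set()
        for vid in self.ids:
            # From the PR description: undefined variables are rejected.
            if vid not in model_ids:
                errors.append(f"undefined variable: {vid}")
            # Assumed extra rule, included only to show several reasons
            # accumulating in one place.
            if vid in seen:
                errors.append(f"duplicate variable: {vid}")
            seen.add(vid)
        return errors


model = {"time", "wavelength", "flux"}
ok = Projection(("time", "flux")).validate(model)
bad = Projection(("time", "foo")).validate(model)
```

Returning a list of reasons rather than failing on the first one lets a caller report everything wrong with a projection at once.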

We might want to use this validation pattern for all Operations, calling validate whenever one is applied to a dataset. Alternatively, I've thought about constructing the operations with the model so that they are safe by construction.
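The two options can be contrasted with a rough sketch, again in Python and under stated assumptions: `Dataset`, `Select`, `with_operation`, and `for_model` are hypothetical names chosen for illustration and do not reflect the project's API.

```python
class Dataset:
    """Hypothetical dataset that tracks only the ids in its model."""

    def __init__(self, model_ids):
        self.model_ids = frozenset(model_ids)

    def with_operation(self, op):
        # Option 1: validate each operation at the moment it is applied.
        errors = op.validate(self.model_ids)
        if errors:
            raise ValueError("; ".join(errors))
        return op.apply(self)


class Select:
    """Hypothetical operation that keeps a subset of variables."""

    def __init__(self, ids):
        self.ids = tuple(ids)

    def validate(self, model_ids):
        return [f"undefined variable: {i}" for i in self.ids if i not in model_ids]

    def apply(self, dataset):
        return Dataset(self.ids)

    @classmethod
    def for_model(cls, ids, model_ids):
        # Option 2: safe by construction. An invalid operation is never built.
        op = cls(ids)
        errors = op.validate(model_ids)
        if errors:
            raise ValueError("; ".join(errors))
        return op


ds = Dataset(["time", "flux"])
projected = ds.with_operation(Select(["flux"]))
safe_op = Select.for_model(["time"], ds.model_ids)
```

Option 1 keeps operations independent of any particular model, at the cost of deferring errors until application. Option 2 surfaces errors earlier but ties each operation to the model it was built against.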

I also paid off some debt: Tuple construction now disallows elements with duplicate ids, and a new nonIndexScalars method on DataTypes replaces all the places where filter was being used. I also fixed a deprecation.

This is important in the context of Samples, which have no Data for an Index variable. Using filter all over seemed messy and error-prone.
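A small sketch of both cleanups, in Python and with invented types: `Scalar`, `Index`, and this `Tuple` are simplified stand-ins, and `non_index_scalars` mirrors the nonIndexScalars method in name only. The real implementation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    """Hypothetical scalar variable identified by an id."""
    id: str


@dataclass(frozen=True)
class Index(Scalar):
    """Hypothetical index variable: it has no data in a sample."""


class Tuple:
    """Hypothetical tuple type that rejects duplicate element ids."""

    def __init__(self, *elements):
        ids = [e.id for e in elements]
        dupes = sorted({i for i in ids if ids.count(i) > 1})
        if dupes:
            raise ValueError(f"duplicate ids: {dupes}")
        self.elements = elements

    def non_index_scalars(self):
        # One named method instead of repeating this filter at every call site.
        return [e for e in self.elements if not isinstance(e, Index)]


t = Tuple(Index("i"), Scalar("a"), Scalar("b"))
data_ids = [s.id for s in t.non_index_scalars()]
```

Centralizing the filter means code that pairs variables with sample data can rely on one definition of which scalars actually carry data.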