-
Notifications
You must be signed in to change notification settings - Fork 1
Middleware no longer saves datamodels to cache; left to pipeline. #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
is this assuming that the calibration is already in the cache folder? I think the expected interface is I give the calibration store either a fits file or a data model and the store handles copying or saving the file into the right folder automatically. Also, if given a fits file the store should read it, add proper metadata before saving it into the cache. |
|
I think the appropriate order is:
I'm not totally opposed to the Store calling If given only a filename, I would require the FITS file already has all of the necessary metadata populated (at least for HISPEC_DRP) and is correctly formatted otherwise, in which case a file copy is fine. |
|
@astrobc1 I added a simple test, please check if it makes sense. what I am worried about is people sharing their local db with paths that only work on their machines. |
|
Let me know if I'm misunderstanding - The DB itself shouldn't contain local paths (only remote/archival paths). Sharing an entire local cache folder will work since the full path to a calibration file is specified by the environment variable There is still a relative path within the cache that needs formed (see here), so there is an assumption regarding how the cache folder is structured (see here). |
674433e to
16fb1ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@astrobc1 do you think this test shows a reasonable way of using the calibration store? can you make it more realistic? I think it would help me understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think test_store.py and test_koa_middleware_integration.py are both great tests and test similar functionality as HISPEC's corresponding test.
Looking closer at how test_store.py calls query_by_id, should it instead return first() instead of all()? I think it's safe to assume "id" is always the primary key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, let's look at test_store.py.
So the user has created their own flat named mock_cal, they want to use it for many pipelines in the future, so they want to save it into the calibration db so it is automatically retrieved given the proper query parameters.
This is what I would expect:
-
I create an object that has both the data array of my flat and metadata
I guess this should be CalibrationTestORM, I see the metadata, but where is the data array stored?
-
I pass this to some register calibration method which adds a row to my calibration db and creates a FITS file of my flat inside the cache so in the future it is retrieved from there
This should be
store.register_local_calibration, but now you want to remove auto-saving the file? Am I supposed to save the file somewhere myself? and then you trust me to give you the right path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are the two possible patterns/interfaces as I see it:
-
The DRP is responsible for saving the calibration FITS file (data + metadata) to the cache by calling
model.save(). The DRP is also responsible for constructing the ORM object by callingmodel.to_orm(), and passing only that ORM object tostore.register_local_calibration()which will add the metadata (ORM object) to the SQLite DB. -
The DRP passes the data model directly to
store.register_local_calibration(). The store callsmodel.save()to save the file to the cache, and then the store callsmodel.to_orm()and adds that ORM object to the local SQLite DB.
I agree option 2 is the simpler pattern, and if we can find a solution closer to that, I'm for it. The reason I like option 1 is because of dependencies/expectations of the middleware.
In option 1, the middleware doesn't care or need to know what a datamodel is.
In option 2, the middleware at minimum needs to know that a data model has model.save() and model.to_orm(). I could also be convinced the best solution is simply to enforce that as an interface requirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative approach is to consider how an instrument-specific middleware (e.g. liger_middleware) could solve this (I know this was the original plan). For example, a less stringent requirement would be that liger_middleware understands Liger data models.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like pattern 1, the cache is part of the middleware, the user should not mess with that.
I would go with 2 and an interface requirement, a save and a to_orm methods seem easy to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I'm fine with closing the PR and keeping main as is.
@zonca This is only a minor change. Currently the CalibrationStore is responsible for calling
model.save(), but I would rather have the DRP be responsible for this. This way, the DRP only passes pythondict's and ORM objects, both of which it understands. In the HISPEC DRP, this would mean adding something likemodel.save()to RegisterCalibrationPrimitive between lines 25-26.This would technically break HISPEC_DRP until I make the corresponding change there. How do you recommend I handle cases like this?