Separate optimization methods from CDIModel into separate class Reconstructors#47
Conversation
… class documentation
yoshikisd
left a comment
There was a problem hiding this comment.
I've addressed some of the reviews requested in PR #17.
Comments #17 (comment) and #17 (comment) were addressed in 82ce845.
…ctions that preserves the old behavior when the old pattern is used
… at object creation, and move the dataloader creation logic to the base optimize() function as it was reused in all subclasses
|
Note that I am aware this breaks the checkpointing system, but we will let it break as it is still undocumented and I'd like to update it in any case. see #48 for more info. |
… use in the examples
…tions in the model classes
allevitan
left a comment
There was a problem hiding this comment.
This is awesome, I really like the changes and I want to pull this in ASAP. There were a few specific issues which I wanted to change, the biggest being:
- Handling circular imports in a better way
- Returning to the old behavior when the model._optimize functions are called. I think it will be simpler and more maintainable going forward to do it this way.
- Changing a few details in the instantiation of the classes that I think simplify the structure a bit
- updating the docs and examples
- Making the tests work again
I submitted these changes as a PR to your forked branch. If you have a moment to review those and pull them in to your branch, then I will approve the PR here and we are good to go.
|
Thanks for making the changes Abe! I'm happy that we've gotten rid of the circular import, and I like the changes you've made. I've made a couple comments regarding hinting/documentation in the code; if you can make those changes I'll merge those changes into this branch/PR. |
Co-authored-by: Dayne Yoshiki Sasaki <37006268+yoshikisd@users.noreply.github.com>
Various suggested changes for the reconstructors branch
allevitan
left a comment
There was a problem hiding this comment.
I've rechecked the tests, everything is passing. Ready to merge.
PR Summary
This PR separates ptychography optimization/reconstruction methods from
CDIModelto a new class calledReconstructor.The motivation of this change was to make the CDTools more compatible with other PyTorch-related packages which expect libraries/scripts to follow specific code patterns. Specifically, problems arise if optimization is a method of
torch.nn.module/cdtools.models.CDIModel, which was the case as ofv0.3.0.pypi1. Additionally,Reconstructor's implementation allows us to re-use an optimizer rather than create a brand new one each time a new reconstruction loop is called (as was done with calls tomodel.{Adam, LBFGS, SGD}_optimize)This PR serves to split PR #17, a development PR that ended up creating both
Reconstructorsas well as multi-GPU capability. This PR was created by checking out allReconstructors-related stuff from PR #17 and omits any multi-GPU related code.@allevitan If you can add your reviews from #17, that would be appreciated.
Technical details
Implementation:
cdtools.reconstructorsis largely based on methods fromcdtools.models.CDIModel. For the base classcdtools.reconstructors.Reconstructor, the method(s):_run_epochandoptimizeare based oncdtools.models.CDIModel.AD_Optimize.setup_dataloaderis based on similar blocks of data-loading code present incdtools.models.CDIModel.{Adam, LBFGS, SGD}_optimizeadjust_optimizeris not implemented and must be defined within the subclasses to adjust the optimizer hyperparameters.For the subclasses
cdtools.reconstructors.{Adam, LBFGS, SGD}, the method(s):__init__initializes anoptimizerattribute as atorch.optim.{Adam, LBFGS, SGD}using the model parameters.adjust_optimizerchanges the relevant optimizer hyperparametersoptimizeare based on `cdtools.models.CDIModel.{Adam, LBFGS, SGD}The main change to the code behavior is that the hyperparameters are updated between different reconstruction loops without creating a new
torch.optim.{Adam, LBFGS, SGD}. In some cases, this can result in an improvement in reconstruction quality as discussed in (#17 (comment)).Changes in
cdtools.models.CDIModeland backwards compatibilityThe following changes have been made in
cdtools.models.CDIModelmethods:AD_Optimizehas been removed.{Adam, LBFGS, SGD}_Optimizenow initialize areconstructorattribute as acdtools.reconstructors.{Adam, LBFGS, SGD}and make a call to theiroptimizemethod.The latter change allows scripts written using
v0.3.0.pypi1to be backwards compatible with this update.However, this comes at the cost of creating a circular dependency between
cdtools.models.CDIModelandcdtools.reconstructors. From a reconstruction standpoint it doesn’t seem to cause any issues. However, it’s not clear if this may cause issues in the long term in terms of maintainability. Thus, we should consider deprecating{Adam, LBFGS, SGD}_Optimizein favor of the following code pattern:PyTest
Three tests have been created in test_reconstructors.py which function similar to the tests for the models. These three tests run reconstruction loops using both the traditional method (model.XXX_Optimize) and new method (recon.optimize), but use different optimizers (Adam, LBFGS, SGD), datasets (gold ball dataset for Adam and SGD, siemens star optical data for LBFGS), and models (FancyPtycho for Adam and SGD, RPI for LBFGS). These three tests check the following:
Reconstructor.adjust_optimizerare updating the hyperparametersReconstructorpoints to the original model it was provided as an initialization parameterAdditionally, the first test
test_Adam_gold_ballschecks if data loading methods specific to single-GPU operations are working as intended.