Conversation
Aren't we going to take at least some amount of performance hit here on kernels you don't lazily evaluate when making predictions, because we don't get to do indexing before calling the kernel?
gpytorch/kernels/kernel.py
```python
if len(named_parameters):
    param_names, params = zip(*named_parameters)
    param_batch_shapes = [self.batch_shape] * len(params)
```
So being able to use self.batch_shape here relies on the fact that all parameters have a fully explicit shape (rather than broadcasting over them)?
Yes. I think this is generally true for all kernels in GPyTorch. We can add something to the documentation to be more explicit about this.
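A minimal illustration (not code from this PR) of what "fully explicit" means here: a GPyTorch kernel constructed with an explicit batch_shape registers parameters whose leading dimensions equal self.batch_shape, which is what the snippet above relies on.

```python
import torch
from gpytorch.kernels import RBFKernel

kernel = RBFKernel(batch_shape=torch.Size([3]))
print(kernel.batch_shape)            # torch.Size([3])
# The raw lengthscale's leading dimensions match self.batch_shape exactly,
# rather than relying on broadcasting:
print(kernel.raw_lengthscale.shape)  # torch.Size([3, 1, 1])
```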
```python
else:
    res = KernelLinearOperator(
        x1_, x2_, covar_func=self.forward, num_outputs_per_input=num_outputs_per_input, **kwargs
    )
```
We're pretty deep in the indentation here; it might make sense to pull some of the above out into helper functions, to make it clearer what's going on and easier to read the code.
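For example (purely illustrative; the helper name and argument list below are made up), the KernelLinearOperator construction quoted above could live in a small private helper:

```python
from linear_operator.operators import KernelLinearOperator

def _make_kernel_linear_operator(kernel, x1, x2, num_outputs_per_input, **kwargs):
    # Hypothetical helper: keeps the KernelLinearOperator construction out of the
    # deeply nested branch in Kernel.__call__ and gives the branch a readable name.
    return KernelLinearOperator(
        x1, x2, covar_func=kernel.forward, num_outputs_per_input=num_outputs_per_input, **kwargs
    )
```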
gpytorch/kernels/scale_kernel.py
```diff
  self.base_kernel = base_kernel
- outputscale = torch.zeros(*self.batch_shape) if len(self.batch_shape) else torch.tensor(0.0)
+ outputscale = torch.zeros(*self.batch_shape, 1, 1) if len(self.batch_shape) else torch.zeros(1, 1)
```
Changing the parameter shapes for widely used kernels like this will likely result in a bunch of downstream backward compatibility issues. I'm not necessarily suggesting not to do this, but those changes will need to be properly documented / announced.
btw, if we're doing this, we should probably combine it with updating the shapes of the priors (cf. #1317) to make all of that consistent
I'm going back and forth about changing parameter shapes. On one hand, it could be quite the breaking change. On the other hand, it would create more consistency, and we should be able to dramatically simplify many kernels as well.
@Balandat thoughts? I imagine this would have the biggest impact on BoTorch.
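For what it's worth, here is a rough sketch (not code from the PR) of the simplification that the trailing singleton dimensions would buy: the outputscale could broadcast directly against a `batch_shape x n x n` covariance instead of being reshaped first.

```python
import torch

batch_shape, n = torch.Size([3]), 5
covar = torch.randn(*batch_shape, n, n)

# current convention: outputscale has shape `batch_shape`, so kernels have to
# reshape it before it can scale the covariance matrix
outputscale_old = torch.rand(*batch_shape)
scaled_old = outputscale_old.view(*batch_shape, 1, 1) * covar

# proposed convention: outputscale has shape `batch_shape x 1 x 1`
# and broadcasts against the covariance with no reshaping
outputscale_new = torch.rand(*batch_shape, 1, 1)
scaled_new = outputscale_new * covar

assert scaled_old.shape == scaled_new.shape == (*batch_shape, n, n)
```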
Actually, what probably makes the most sense is leaving kernel parameter shapes the way they are currently, and potentially creating consistent parameter shapes and addressing #1317 in a separate PR.
Makes sense. I think fixing the parameter shape inconsistencies and making things consistent across the board would be good, and hopefully this would also get rid of some long-standing bugs/issues. This would impact botorch, but if we coordinate on the releases we can make sure that the effect of this is minimized. It would likely generate some short-term pain for some power users (of both gpytorch and botorch) with custom setups, but I think that short-term pain is probably worth the long-term gains from consistency.
@Balandat That sounds like a good plan. Here's what I'm thinking as a course of action:
- Make a PR that makes kernel parameter sizes consistent (and adds appropriate warnings/deprecations/etc.; a rough sketch of such a shim is below)
- Address [Bug] Sampling from priors doesn't match shape of hyperparameters #1317
- Merge this PR
- (+ other breaking changes that we've wanted to address for a while)
After all of that, we make a major release, timed with a BoTorch release.
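A rough sketch of what such a deprecation shim could look like (the function name and warning message are hypothetical, not from the PR): accept the old parameter shape, warn, and reshape to the new convention.

```python
import warnings
import torch

def _canonicalize_outputscale(value, batch_shape):
    # Hypothetical shim: accept the old `batch_shape`-shaped outputscale,
    # emit a deprecation warning, and reshape to the new `batch_shape x 1 x 1` convention.
    if value.shape == batch_shape:
        warnings.warn(
            "outputscale of shape `batch_shape` is deprecated; expected `batch_shape x 1 x 1`.",
            DeprecationWarning,
        )
        value = value.view(*batch_shape, 1, 1)
    return value

batch_shape = torch.Size([3])
print(_canonicalize_outputscale(torch.rand(*batch_shape), batch_shape).shape)  # torch.Size([3, 1, 1])
```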
@jacobrgardner I don't think it'll lead to a loss in performance. This PR mostly merges LazyEvaluatedKernelTensor and KeOpsLinearOperator, since they share most of the same functionality.
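A quick sketch of what that means in practice (illustrative only; it assumes the lazy kernel object resolves indexing against its stored inputs before any kernel evaluation takes place):

```python
import torch
from gpytorch.kernels import RBFKernel

x = torch.randn(100, 3)
covar = RBFKernel()(x)        # lazy: no 100 x 100 kernel matrix is formed here

# indexing is resolved against the stored inputs, so only a 10 x 10 block
# should ever need to be computed
sub_covar = covar[..., :10, :10]
print(sub_covar.to_dense().shape)  # torch.Size([10, 10])
```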
Force-pushed from 5fba78d to 6d58b66
Force-pushed from 6d58b66 to 963c84c
**Note**: This is not a breaking change; "legacy" grids were deprecated pre v1.0.
…Kernel
- The functionality of both kernels has not disappeared, but both kernels cannot work without the last_dim_is_batch option.
- The examples/00_Basic_Usage/kernels_with_additive_or_product_structure.ipynb notebook describes how to replicate the functionality of both kernels without last_dim_is_batch.
- The functionality of this kernel has not disappeared, but this kernel cannot work without the last_dim_is_batch option.
- The examples/00_Basic_Usage/kernels_with_additive_or_product_structure.ipynb notebook describes how to replicate the functionality of this kernel using the gpytorch.utils.sum_interaction_terms utility.
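As a rough sketch of the kind of replacement the notebook describes (this is an assumption about the pattern, not the notebook's exact code): the per-dimension covariances that last_dim_is_batch used to produce can be recovered by treating each input dimension as a batch of one-dimensional inputs, and additive structure then comes from summing over that batch dimension.

```python
import torch
from gpytorch.kernels import RBFKernel

n, d = 10, 3
x = torch.randn(n, d)

# one 1-D kernel per input dimension (batch dimension of size d)
kernel = RBFKernel(batch_shape=torch.Size([d]))

x_batched = x.mT.unsqueeze(-1)      # d x n x 1: each batch slice holds one input dimension
per_dim_covar = kernel(x_batched)   # d x n x n, analogous to what last_dim_is_batch=True produced
additive_covar = per_dim_covar.to_dense().sum(dim=-3)   # n x n additive kernel
print(additive_covar.shape)         # torch.Size([10, 10])
```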
Force-pushed from 963c84c to daf43a3
With `KernelLinearOperator` defined in the LinearOperator package, `LazyEvaluatedKernelTensor` becomes kind of redundant. This PR attempts to improve the lazy evaluation of kernels in GPyTorch by doing the following:
- Removing `LazyEvaluatedKernelTensor`, instead using `KernelLinearOperator` (which is also used by KeOps kernels)
- Removing `evaluate_kernel` from GPyTorch (which will be redundant with `KernelLinearOperator`), and also removing it from the LinearOperator package.

cc/ @Balandat
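For reference, here is a minimal sketch of KernelLinearOperator usage (the rbf_covar_func below is a made-up stand-in for a kernel's forward method; the call mirrors the kernel.py snippet quoted above):

```python
import torch
from linear_operator.operators import KernelLinearOperator

def rbf_covar_func(x1, x2, **kwargs):
    # stand-in covariance function: plain-torch RBF with unit lengthscale
    return torch.exp(-0.5 * torch.cdist(x1, x2).pow(2))

x1 = torch.randn(5, 3)
x2 = torch.randn(4, 3)

# nothing is evaluated at construction time; the operator just stores the
# inputs and the covariance function
covar = KernelLinearOperator(x1, x2, covar_func=rbf_covar_func)
print(covar.shape)              # torch.Size([5, 4])
print(covar.to_dense().shape)   # torch.Size([5, 4]), materialized on demand
```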