[Docs] Stochastic Variational Deep Kernel Learning
foram-si opened this issue · 0 comments
I apologize if this question has been asked before. I am trying to get a better understanding of the batch shape in the GP layer of this example: https://docs.gpytorch.ai/en/latest/examples/06_PyTorch_NN_Integration_DKL/Deep_Kernel_Learning_DenseNet_CIFAR_Tutorial.html
In this example, GaussianProcessLayer() takes the features from the NN layer as input:
```python
features = features.transpose(-1, -2).unsqueeze(-1)
res = self.gp_layer(features)
```
The feature tensor has shape torch.Size([132, 256, 1]), where 132 is the number of NN features. If I understand correctly, the SVDKL method will learn a GP for each of these 132 features, so the GP layer returns a MultitaskMultivariateNormal (mean shape: torch.Size([256, 132])).
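To make the shapes concrete, here is a small NumPy sketch of what (I believe) the transpose/unsqueeze is doing; the batch size of 256 and the 132 features are taken from the tutorial, and I'm using NumPy stand-ins for the PyTorch ops:

```python
import numpy as np

batch_size, num_features = 256, 132

# Output of the NN feature extractor: one 132-dim feature vector per image.
features = np.zeros((batch_size, num_features))  # shape (256, 132)

# transpose(-1, -2): move the feature dimension to the front -> (132, 256)
features_t = np.swapaxes(features, -1, -2)

# unsqueeze(-1): append a trailing input dimension of size 1 -> (132, 256, 1),
# i.e. 132 independent "batches" of 256 one-dimensional GP inputs.
gp_input = features_t[..., np.newaxis]

print(gp_input.shape)  # (132, 256, 1)
```

So each of the 132 leading batch entries looks like its own 1-D regression input to the GP layer, if I'm reading it right.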
In the definition of GaussianProcessLayer(), however, there is no batch_shape in mean_module() and covar_module(). batch_shape is only used in variational_distribution.
I looked through the tutorials but could not find an explanation of how the batch shape is handled in this case, or of how a MultitaskMultivariateNormal (mean shape: torch.Size([256, 132])) is produced from an input tensor of shape torch.Size([132, 256, 1]). Can anyone help me out?
Thank you!