[Docs] Stochastic Variational Deep Kernel Learning
foram-si opened this issue · 0 comments
I apologize if this question has been asked before. I am trying to get a better understanding of the batch shape in the GP layer of this example: https://docs.gpytorch.ai/en/latest/examples/06_PyTorch_NN_Integration_DKL/Deep_Kernel_Learning_DenseNet_CIFAR_Tutorial.html
In this example, GaussianProcessLayer() takes the features from the NN layer as input:
```python
features = features.transpose(-1, -2).unsqueeze(-1)
res = self.gp_layer(features)
```
The feature tensor has shape torch.Size([132, 256, 1]), where 132 is the number of NN features. If I understand correctly, the SVDKL method will learn a GP for each of these 132 features, so the GP layer returns a MultitaskMultivariateNormal (mean shape: torch.Size([256, 132])).
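To make the shapes concrete, here is a small NumPy sketch of what (I believe) the transpose/unsqueeze is doing; the batch size of 256 and the 132 features are taken from the tutorial, and I'm using NumPy stand-ins for the PyTorch ops:

```python
import numpy as np

batch_size, num_features = 256, 132

# Output of the NN feature extractor: one 132-dim feature vector per image.
features = np.zeros((batch_size, num_features))  # shape (256, 132)

# transpose(-1, -2): move the feature dimension to the front -> (132, 256)
features_t = np.swapaxes(features, -1, -2)

# unsqueeze(-1): append a trailing input dimension of size 1 -> (132, 256, 1),
# i.e. 132 independent "batches" of 256 one-dimensional GP inputs.
gp_input = features_t[..., np.newaxis]

print(gp_input.shape)  # (132, 256, 1)
```

So each of the 132 leading batch entries looks like its own 1-D regression input to the GP layer, if I'm reading it right.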
In the definition of GaussianProcessLayer(), however, there is no batch_shape in mean_module() and covar_module(). batch_shape is only used in variational_distribution.
I looked through the tutorials but could not find an explanation of how the batch shape is handled in this case, or of how a MultitaskMultivariateNormal (mean shape: torch.Size([256, 132])) is produced from an input tensor of shape torch.Size([132, 256, 1]). Can anyone help me out?
Thank you!