Entropy model of hyper-latents
Hi,
What should I do if I want to use a zero-mean Gaussian entropy model for the hyper-latents (like for `y`)?
To do that, can I use `self.gaussian_conditional` as it is?
For example, with the `ScaleHyperprior` model, I would use the code below.
(It is all the same as the existing code; I only added the `scale_z` part.)
```python
import torch
import torch.nn as nn

from compressai.entropy_models import GaussianConditional
from compressai.layers import GDN
from compressai.models import CompressionModel
from compressai.models.utils import conv, deconv


class ScaleHyperprior(CompressionModel):
    def __init__(self, N, M, **kwargs):
        super().__init__(entropy_bottleneck_channels=N, **kwargs)

        self.g_a = nn.Sequential(
            conv(3, N),
            GDN(N),
            conv(N, N),
            GDN(N),
            conv(N, N),
            GDN(N),
            conv(N, M),
        )

        self.g_s = nn.Sequential(
            deconv(M, N),
            GDN(N, inverse=True),
            deconv(N, N),
            GDN(N, inverse=True),
            deconv(N, N),
            GDN(N, inverse=True),
            deconv(N, 3),
        )

        self.h_a = nn.Sequential(
            conv(M, N, stride=1, kernel_size=3),
            nn.ReLU(inplace=True),
            conv(N, N),
            nn.ReLU(inplace=True),
            conv(N, N),
        )

        self.h_s = nn.Sequential(
            deconv(N, N),
            nn.ReLU(inplace=True),
            deconv(N, N),
            nn.ReLU(inplace=True),
            conv(N, M, stride=1, kernel_size=3),
            nn.ReLU(inplace=True),
        )

        self.gaussian_conditional = GaussianConditional(None)
        self.N = int(N)
        self.M = int(M)

    @property
    def downsampling_factor(self) -> int:
        return 2 ** (4 + 2)

    def forward(self, x):
        y = self.g_a(x)
        z = self.h_a(torch.abs(y))

        # Instead of the entropy bottleneck, code z with a zero-mean
        # Gaussian at a fixed unit scale:
        # z_hat, z_likelihoods = self.entropy_bottleneck(z)
        scale_z = torch.ones_like(z)  # same shape and device as z
        z_hat, z_likelihoods = self.gaussian_conditional(z, scale_z)

        scales_hat = self.h_s(z_hat)
        y_hat, y_likelihoods = self.gaussian_conditional(y, scales_hat)
        x_hat = self.g_s(y_hat)
        return {
            "x_hat": x_hat,
            "likelihoods": {"y": y_likelihoods, "z": z_likelihoods},
        }
```
If this is possible, `forward`, `compress`, etc. would also need to be revised accordingly; a rough sketch of how `compress` might change is shown below.
I think that if this works, `z` can be handled in the same way as `y`...
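For concreteness, here is a minimal sketch of how `compress` might look under this scheme. It is an untested assumption modeled on the stock `ScaleHyperprior.compress`; `build_indexes`, `compress`, and `decompress` are the usual `GaussianConditional` methods, and the fixed `scale_z` mirrors the `forward` above.

```python
# Hypothetical sketch: compress z with the zero-mean GaussianConditional
# at a fixed unit scale instead of the entropy bottleneck.
def compress(self, x):
    y = self.g_a(x)
    z = self.h_a(torch.abs(y))

    # Fixed unit scale for z, matching forward().
    scale_z = torch.ones_like(z)
    indexes_z = self.gaussian_conditional.build_indexes(scale_z)
    z_strings = self.gaussian_conditional.compress(z, indexes_z)
    z_hat = self.gaussian_conditional.decompress(z_strings, indexes_z)

    scales_hat = self.h_s(z_hat)
    indexes_y = self.gaussian_conditional.build_indexes(scales_hat)
    y_strings = self.gaussian_conditional.compress(y, indexes_y)
    return {"strings": [y_strings, z_strings], "shape": z.size()[-2:]}
```

Note that, as with the stock model, `self.gaussian_conditional.update_scale_table(...)` (normally triggered via `model.update()`) would have to be called before actual compression, since `GaussianConditional(None)` starts without a scale table.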
Is this approach valid? (When I use the code above for training, it runs without errors.)
Or, I would appreciate it if you could point me to a reference for this.
I'm sorry if this is a naive question that's hard to answer.
Thanks.
hi @hyeseojy,
You are proposing a different architecture than the paper we refer to here. Have you trained it and compared the results?
I think this is more of a "discussion" post than an issue. To be honest, I think you should modify your entropy bottleneck instead of using `gaussian_conditional`, since you don't actually condition your entropy model on another prior, as in `gaussian_conditional(y, scales_hat)`. Please let us know in Discussions whether your model works and outperforms this one. If it's published, we could add it. Thanks!
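To illustrate the distinction being made here (an editorial sketch, not part of the original reply): with a fixed zero-mean Gaussian, there is no conditioning signal at all, and the training-time likelihood of each element of `z` reduces to a CDF difference over its quantization bin. The helper below is hypothetical and only approximates what `GaussianConditional` computes internally.

```python
import torch

def zero_mean_gaussian_likelihoods(z: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
    # Hypothetical helper: likelihood of z under a fixed zero-mean Gaussian,
    # using additive uniform noise as the usual differentiable stand-in
    # for rounding during training.
    z_tilde = z + torch.empty_like(z).uniform_(-0.5, 0.5)
    normal = torch.distributions.Normal(0.0, scale)
    # Probability mass of the unit-width quantization bin around z_tilde.
    return normal.cdf(z_tilde + 0.5) - normal.cdf(z_tilde - 0.5)
```

Since nothing here depends on a prior computed from other latents, this behaves like an unconditional entropy model, which is the point of the reply above.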