PyTorch implementation
Hi,
I am trying to implement your code in PyTorch.
I believe I implemented the VAT loss accurately, but I cannot get the same performance, probably because I used a different ConvNet. When I try to replicate your ConvNet, namely "conv-large", the network does not work at all. I am copying my conv-large code in PyTorch below; I would appreciate any feedback on what might be wrong.
Also, in the paper you refer to "Temporal Ensembling for Semi-Supervised Learning" for the network used in the experiments, but they add Gaussian noise to the input in the first layer, while I could not find such noise in your implementation.
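For reference, the noise in that paper is additive Gaussian noise applied to the input at training time only; a minimal sketch of such a layer (assuming the std of 0.15 reported there) could look like:

```python
import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Additive Gaussian input noise, active only during training."""
    def __init__(self, std=0.15):
        super(GaussianNoise, self).__init__()
        self.std = std

    def forward(self, x):
        if self.training:
            return x + torch.randn_like(x) * self.std
        return x
```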
```python
import torch.nn as nn
import torch.nn.functional as F

class conv_large(nn.Module):
    def __init__(self):
        super(conv_large, self).__init__()
        self.lr = nn.LeakyReLU(0.1)
        self.mp2_2 = nn.MaxPool2d(2, stride=2, padding=0)
        self.drop = nn.Dropout(p=0.5)
        self.bn128 = nn.BatchNorm2d(128, affine=True)
        self.bn256 = nn.BatchNorm2d(256, affine=True)
        self.bn512 = nn.BatchNorm2d(512, affine=True)
        self.conv3_128_3_1 = nn.Conv2d(3, 128, kernel_size=3, stride=1, padding=1)
        self.conv128_128_3_1 = nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1)
        self.conv128_256_3_1 = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.conv256_256_3_1 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.conv256_512_3_1 = nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=0)
        self.conv512_256_1_1 = nn.Conv2d(512, 256, kernel_size=1, stride=1, padding=0)
        self.conv256_128_1_1 = nn.Conv2d(256, 128, kernel_size=1, stride=1, padding=0)
        self.avg = nn.AvgPool2d(6, ceil_mode=True)  # global average pooling over the 6x6 map
        self.fc = nn.Linear(128, 10)

    def forward(self, x):
        # block 1: three 3x3 convs with 128 channels, then 2x2 max pool + dropout
        x = self.lr(self.bn128(self.conv3_128_3_1(x)))
        x = self.lr(self.bn128(self.conv128_128_3_1(x)))
        x = self.lr(self.bn128(self.conv128_128_3_1(x)))
        x = self.drop(self.mp2_2(x))
        # block 2: three 3x3 convs with 256 channels, then 2x2 max pool + dropout
        x = self.lr(self.bn256(self.conv128_256_3_1(x)))
        x = self.lr(self.bn256(self.conv256_256_3_1(x)))
        x = self.lr(self.bn256(self.conv256_256_3_1(x)))
        x = self.drop(self.mp2_2(x))
        # block 3: one valid 3x3 conv to 512 channels, then two 1x1 (NiN) convs
        x = self.lr(self.bn512(self.conv256_512_3_1(x)))
        x = self.lr(self.bn256(self.conv512_256_1_1(x)))
        x = self.lr(self.bn128(self.conv256_128_1_1(x)))
        # global average pooling and linear classifier
        x = self.avg(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x
```
Hi,
You use the same instance of nn.BatchNorm2d within different layers. I am not very familiar with the PyTorch implementation of BN, but I think you should use a separate instance of BN for each layer.
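A minimal sketch of that fix, pairing each convolution with its own BatchNorm2d instance (using nn.Sequential is just one way to keep the pairs together; only the first block is shown):

```python
import torch.nn as nn

# Each conv gets its own BatchNorm2d, so every layer keeps its own
# running statistics and affine parameters.
block1 = nn.Sequential(
    nn.Conv2d(3, 128, kernel_size=3, stride=1, padding=1),
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.1),
    nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
    nn.BatchNorm2d(128),  # a separate instance, not the one above
    nn.LeakyReLU(0.1),
    nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.1),
    nn.MaxPool2d(2, stride=2),
    nn.Dropout(0.5),
)
```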
That seems to be the mistake. Thanks for your reply.
@reachablesa Hi reachablesa, I want to implement this code in PyTorch as well, but when I compute r_vadv it is always 0. Could you share your PyTorch code? Thanks!
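For reference, a minimal sketch of the power-iteration step that produces r_vadv, following the VAT paper (function names and hyperparameter values here are illustrative). One common reason r_vadv comes out as all zeros is that the random direction d is never marked with requires_grad=True, so the backward pass leaves d.grad empty:

```python
import torch
import torch.nn.functional as F

def _l2_normalize(d):
    # Scale each sample's perturbation to unit L2 norm.
    d_flat = d.view(d.size(0), -1)
    return d / (d_flat.norm(dim=1).view(-1, *([1] * (d.dim() - 1))) + 1e-8)

def compute_r_vadv(model, x, xi=1e-6, eps=8.0, n_power=1):
    # Target distribution p(y|x), detached so gradients flow only through d.
    with torch.no_grad():
        pred = F.softmax(model(x), dim=1)
    d = _l2_normalize(torch.randn_like(x))  # random initial direction
    for _ in range(n_power):
        d.requires_grad_()  # without this, d.grad stays None and r_vadv is zero
        pred_hat = F.log_softmax(model(x + xi * d), dim=1)
        dist = F.kl_div(pred_hat, pred, reduction='batchmean')
        dist.backward()
        d = _l2_normalize(d.grad)  # gradient direction becomes the new d
        model.zero_grad()          # discard gradients accumulated in the model
    return eps * d
```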