danxuhk/StructuredAttentionDepthEstimation

The difference between the training model and the testing model

junfengcao opened this issue · 10 comments

I found that the "atten_f{}_mf{}" layers have 1 output channel in the training model, but in the testing model those layers have 512 output channels. What I want to ask is how I should deal with the Tile layer in the training model.

Have you used 'python gen_deploy_prototxt.py' to generate the testing prototxt, as mentioned in the README?

Yes, I have generated the testing prototxt, but why are there differences in the CRF part between the testing prototxt and the training prototxt? For example, the kernel dimension of the atten_f1_mf1 layer is (1,1024,3,3) in the training model, but in the testing model the kernel dimension of this layer is (512,1024,3,3). So I can't load my trained parameters into the testing prototxt. Thank you!

I tested the code on my side, and I did not observe this situation. The network from the testing prototxt has the same filter sizes as the training one; however, the batch size in testing is set to 1. If you compare gen_deploy_prototxt.py and gen_train_prototxt.py, you will find the kernel sizes are the same. The difference is the input batch size.

In the "MeanFieldUpdate" function of gen_train_prototxt.py:
n[atten_f] = L.Convolution(n[concat_f], num_output=1, kernel_size=3, stride=1, pad=1, param=[dict(name='atten_f{}_w'.format(feat_ind), lr_mult=1, decay_mult=1), dict(name='atten_f{}_b'.format(feat_ind), lr_mult=2, decay_mult=0)])
n[norm_atten_f] = L.Sigmoid(n[atten_f])
n[norm_atten_f_tile] = L.Tile(n[norm_atten_f], tile_param=dict(axis=1, tiles=feat_num))

In the gen_deploy_prototxt.py:
n[atten_f] = L.Convolution(n[concat_f], num_output=feat_num, kernel_size=3, stride=1, pad=1, param=[dict(name='atten_f{}_w'.format(feat_ind), lr_mult=1, decay_mult=1), dict(name='atten_f{}_b'.format(feat_ind), lr_mult=2, decay_mult=0)])

From what I see on GitHub, I can't understand this difference between gen_train_prototxt.py and gen_deploy_prototxt.py.
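For what it is worth, the structural intent of the training snippet seems to be: the attention convolution predicts a single-channel map, the Sigmoid squashes it to [0, 1], and the Tile layer broadcasts that one channel across the feat_num feature channels so it can gate them element-wise. Below is a minimal numpy sketch of that channel bookkeeping; the shapes and the element-wise gating are assumptions for illustration, not taken from the repository.

import numpy as np

feat_num = 512                           # channels of the features being gated
att = np.random.rand(2, 1, 32, 32)       # Sigmoid(atten_f): one attention channel per pixel

# L.Tile(axis=1, tiles=feat_num) repeats the single channel feat_num times
att_tiled = np.tile(att, (1, feat_num, 1, 1))      # -> (2, 512, 32, 32)

feats = np.random.rand(2, feat_num, 32, 32)        # message-passing features
gated = feats * att_tiled                          # element-wise gating with the broadcast map
print(att_tiled.shape, gated.shape)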

The difference is that in the testing prototxt we removed the intermediate predictions and their corresponding losses. The input data layer is also different: in training it is a Python data layer which supports data augmentation, while in testing it is a simple image data layer. For the network structure itself, everything is the same, so it should not be possible to hit a kernel-size mismatch.

(1,1024,3,3) and (512,1024,3,3) differ only in the first dimension, and the first dimension is related to the batch size, not the kernel size. And I do not know why the batch size would be 512 in training...

Both of the tuples I mentioned, (1,1024,3,3) and (512,1024,3,3), are the dimensions of the convolution kernels of the atten_f1_mf1 layer.
The convolution weight blob has the shape [out_channels, in_channels, kernel_size, kernel_size], so 512 is this layer's number of output channels (feat_num in your code), which equals the number of convolution kernels, not the batch size.
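A quick way to check this from pycaffe, if it helps (a minimal sketch; the file names are placeholders, and loading the training prototxt also needs its Python data layer to be importable):

import caffe

# Placeholder paths -- use the prototxt you generated and the snapshot you trained.
net = caffe.Net('train.prototxt', 'snapshot.caffemodel', caffe.TEST)

w, b = net.params['atten_f1_mf1']
# Caffe stores convolution weights as (out_channels, in_channels, kernel_h, kernel_w);
# the batch size never appears in this blob.
print(w.data.shape)   # (1, 1024, 3, 3) with the training generator quoted above
print(b.data.shape)   # (1,)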
So I have a bold request: could you send me the prototxt files you generated? My email address is
junfeng_cao@bupt.edu.cn
Thank you very much!

@junfengcao I have sent the deploy prototxt to your email address. Please let me know if it works for you...

@danxuhk I found that the layers named norm_atten_f_tile{}_mf{} only exist in the training model, and the number of outputs of the atten_f{}_mf{} layers is different (num_output = 1 in the training model and num_output = 512 in the testing model). Could you tell me why? Thank you!

@junfengcao I think the training prototxt generator was an old one. I have replaced gen_train_prototxt.py; now it should be OK.
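For anyone who already has a snapshot trained with the old generator (1-channel attention convolution followed by Tile), one possible workaround is to replicate the trained attention kernel across the output channels expected by the deploy prototxt: every channel then computes the same value that the Tile layer used to broadcast. This is only a sketch under those assumptions; the file names are placeholders, and it presumes the deploy graph still applies the Sigmoid to the attention convolution output.

import caffe
import numpy as np

# Placeholder paths -- substitute the prototxts you generated and your own snapshot.
train_net  = caffe.Net('train.prototxt',  'old_snapshot.caffemodel', caffe.TEST)
deploy_net = caffe.Net('deploy.prototxt', caffe.TEST)

for name, blobs in deploy_net.params.items():
    if name not in train_net.params:
        continue
    src = train_net.params[name]
    for i in range(min(len(src), len(blobs))):
        if src[i].data.shape == blobs[i].data.shape:
            blobs[i].data[...] = src[i].data            # shapes agree: plain copy
        elif src[i].data.shape[0] == 1:
            # atten_f{}_mf{} layers: repeat the single trained kernel/bias over
            # all output channels, mimicking what Sigmoid + Tile produced.
            blobs[i].data[...] = np.repeat(src[i].data, blobs[i].data.shape[0], axis=0)

deploy_net.save('converted.caffemodel')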

OK, thank you very much! @danxuhk