HongtaoYang/DAC-tensorflow

About parameter sensitivity

Opened this issue · 1 comment

@HongtaoYang, I am very grateful for your source code! However, I have found that your implementation is very sensitive to the network's parameters, for example:

  • In the batch_normalization layer, trainable must be set to False; with trainable=True, the results drop a lot.

  • In the conv2d layer, padding must be set to "valid"; the results also drop when padding is set to "same". (See the sketch after this list.)
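
For concreteness, here is a minimal sketch of the two settings I mean, written against the TF 1.x `tf.layers` API (the kernel size and filter count are illustrative, not your actual architecture):

```python
import tensorflow as tf

def conv_block(x, filters, training):
    # The two settings the results seem to hinge on:
    # padding='valid' (not 'same') and a frozen batch norm.
    x = tf.layers.conv2d(x, filters, kernel_size=3,
                         padding='valid',  # switching to 'same' hurts results
                         activation=tf.nn.relu)
    x = tf.layers.batch_normalization(x,
                                      trainable=False,  # trainable=True hurts results
                                      training=training)
    return x
```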

This phenomenon puzzles me: since the key to this algorithm is not the design of the network structure, it should not be this sensitive to the network's parameters. Could you explain it in detail? Thanks for your kindness!

@flexibility2, you are correct on both points. First, the code is sensitive to hyper-parameters; that is why I failed to implement this for so long, until the authors released their version, which provided a guideline. It is also correct that the core of the algorithm is not the network design. However, as with many unsupervised learning algorithms, the network is sensitive to its initial conditions. The core of DAC rests on the assumption that objects of the same class have more similar representations. If the network is initialised in a way that violates this assumption, the self-reinforcing training process diverges rapidly.

That is my personal understanding; I guess you could test whether the authors' code is also sensitive to these conditions.
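
To make the self-reinforcing part concrete, here is a simplified sketch of DAC's pair-selection loss (the threshold values and the hand-rolled cross-entropy are illustrative, not my exact code):

```python
import tensorflow as tf

def dac_pair_loss(label_features, upper=0.95, lower=0.455):
    # label_features: softmax outputs of the network, shape (batch, n_clusters).
    # L2-normalise so the dot product is a cosine similarity in [0, 1].
    feats = tf.nn.l2_normalize(label_features, axis=1)
    sim = tf.matmul(feats, feats, transpose_b=True)  # pairwise similarities

    # Self-reinforcing selection: only confident pairs become pseudo-labels.
    pos = tf.cast(sim >= upper, tf.float32)  # treated as "same cluster"
    neg = tf.cast(sim <= lower, tf.float32)  # treated as "different cluster"
    mask = pos + neg                         # ambiguous pairs are ignored

    # Binary cross-entropy between the pseudo-labels and the similarities.
    eps = 1e-8
    bce = -(pos * tf.log(sim + eps) + neg * tf.log(1.0 - sim + eps))
    return tf.reduce_sum(mask * bce) / tf.maximum(tf.reduce_sum(mask), 1.0)
```

If the initial representations already violate the assumption, the "confident" pairs selected here are wrong, and minimising the loss reinforces those mistakes instead of correcting them.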