taki0112/Densenet-Tensorflow

A bug in transition layer

NatalieZou opened this issue · 29 comments

The number of filters in the original transition_layer is equal to growth_k, which is too small, so the results are poor and the network is hard to converge. Referring to another implementation, I changed it as shown below, and the results are now normal and much better.
    def transition_layer(self, x, scope):
        with tf.name_scope(scope):
            x = Batch_Normalization(x, training=self.training, scope=scope+'_batch1')
            x = Relu(x)
            # Read the static channel count as a plain Python int.
            shape = x.get_shape().as_list()
            in_channel = shape[3]
            # Original line, which used growth_k filters:
            # x = conv_layer(x, filter=self.filters, kernel=[1,1], layer_name=scope+'_conv1')
            # Halve the channels instead; int() keeps the filter count an integer.
            x = conv_layer(x, filter=int(in_channel*0.5), kernel=[1,1], layer_name=scope+'_conv1')
            x = Drop_out(x, rate=dropout_rate, training=self.training)
            x = Average_pooling(x, pool_size=[2,2], stride=2)
            return x

Hi NatalieZou, I didn't notice that. I will use it and see the results. Could we open a discussion about this code here?

NatalieZou could you please post your training loss curve here after fixing that bug?

Thank you for the information. I will change the parameter and try again.

@Abdelpakey OK, of course you can. These are my training accuracy and loss curves:
[Image: densenet_acc]

[Image: densenet_loss]

@NatalieZou Thank you for your post. Can you tell me the values of K, L, the dropout rate, and so on that produced your loss curve?

@GyuminDev The original paper has a hyperparameter called the compression factor. It reduces the number of feature maps passed from a dense block to the following transition layer. In my opinion, you can try the K and L values mentioned in the paper; the problem in this issue is not closely related to those two values. Good luck, and I hope your model works well.
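
As a rough illustration of the compression factor mentioned above: in the DenseNet paper, a transition layer that receives m feature maps outputs floor(theta * m) of them, with 0 < theta <= 1. The helper name `transition_out_channels` is hypothetical and not part of this repository, and the inputs 72 and 120 are only example channel counts.

```python
import math


def transition_out_channels(m, theta=0.5):
    """Number of feature maps a transition layer keeps: floor(theta * m).

    m     -- feature maps coming out of the dense block
    theta -- compression factor, 0 < theta <= 1 (0.5 halves the channels)
    """
    if not 0 < theta <= 1:
        raise ValueError("theta must be in (0, 1]")
    return math.floor(theta * m)


for m in (72, 120):
    print(m, "->", transition_out_channels(m))
# 72 -> 36
# 120 -> 60
```

With theta = 0.5 this corresponds to the in_channel*0.5 change proposed in this issue, whereas the original code used a fixed growth_k filters regardless of the input width.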

Yes, I ran into this problem too. Thank you.

How many densenet blocks do you use? How about the depth and growth rate?

@yuffon Everything is the same as the original code. The only part I changed is "x = conv_layer(x, filter=in_channel*0.5, kernel=[1,1], layer_name=scope+'_conv1')".

Why can't I get the same result?
I use TensorFlow 1.10, CUDA 9.0, and a 1080 Ti.
What optimizer do you use?

I used MomentumOptimizer; it is in the code as a comment.

Thank you very much.
How about the lr decay?
I have been using Adam as the optimizer with an initial lr of 1e-4. This is a bad configuration.

I used MomentumOptimizer, and init_learning_rate is 1e-1.

Could you please send me a copy of your training script?
I really don't know why.

I see that your training acc can reach 90% in less than 10 epochs. The test acc also reaches 85% in 10 epochs. This is far beyond the results on my computer.

Could you give me your email? The .py file can't be sent on GitHub.

my email is yuffonzhang@163.com
Thanks a lot.
The acc really sucks on my computer.

@NatalieZou As you said, I fixed the code.
@yuffon Please check.

Thank you.
I have tried a deep DenseNet on my computer.
It reaches 93.78%.
But when I use densenet40-12, the accuracy is not good.
Have you tried a shallower DenseNet (such as densenet40-12)?

Thank you very much.
I checked the data processing code.
The standardization is performed over the whole data set.
np.mean(x_train[:, :, :, 0]) computes the mean of the first channel over the whole data set.
I have tried per-image standardization using the tf.data API, and the accuracy cannot reach 91%.
Why?
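
For readers comparing the two schemes mentioned above, here is a minimal pure-Python sketch of how data-set-level (per-channel) standardization differs from per-image standardization. It does not explain the accuracy gap; it only shows that the two transforms are not equivalent. The function names and the toy pixel values are made up for illustration, and each image is represented as a flat list of (R, G, B) tuples rather than a real array.

```python
def channel_stats(images, c):
    """Mean and standard deviation of channel c over every pixel of every image."""
    values = [pixel[c] for image in images for pixel in image]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return mean, std


def standardize_by_dataset(images, num_channels=3):
    """Subtract the data-set mean and divide by the data-set std, channel by channel."""
    stats = [channel_stats(images, c) for c in range(num_channels)]
    return [
        [tuple((pixel[c] - stats[c][0]) / stats[c][1] for c in range(num_channels))
         for pixel in image]
        for image in images
    ]


def standardize_per_image(image):
    """Give one image zero mean and unit std, using only its own pixels."""
    values = [v for pixel in image for v in pixel]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    # Lower bound on the std, in the style of tf.image.per_image_standardization,
    # so that a constant image does not cause a division by zero.
    std = max(std, 1.0 / len(values) ** 0.5)
    return [tuple((v - mean) / std for v in pixel) for pixel in image]


# Toy data (made up): a dark image and a bright image with the same pattern.
dark = [(0, 0, 0), (2, 2, 2)]
bright = [(100, 100, 100), (102, 102, 102)]

by_dataset = standardize_by_dataset([dark, bright])
# Data-set statistics keep the brightness gap between the two images.
print(by_dataset[0][0][0] < 0 < by_dataset[1][0][0])
# Per-image statistics remove it: both images become identical.
print(standardize_per_image(dark) == standardize_per_image(bright))
```

Per-image standardization discards each image's overall brightness and contrast, while data-set-level standardization preserves differences between images, so a model trained with one scheme sees differently distributed inputs than with the other.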

@NatalieZou @taki0112
When I use this code, the program reports the following error:

    Traceback (most recent call last):
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 184, in <module>
        logits = DenseNet(x=batch_images, nb_blocks=nb_block, filters=growth_k, training=training_flag).model
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 84, in __init__
        self.model = self.Dense_net(x)
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 146, in Dense_net
        x = self.transition_layer(x, scope='trans_'+str(i))
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 113, in transition_layer
        x = conv_layer(x, filter=in_channel*0.5, kernel=[1,1], layer_name=scope+'_conv1')
    TypeError: unsupported operand type(s) for *: 'Dimension' and 'float'

I think this may be because in_channel is no longer an integer after *0.5.
When I change the line to

    x = conv_layer(x, filter=in_channel*1, kernel=[1,1], layer_name=scope+'_conv1')

the network can run.

Then I use

    print(x)
    print(in_channel)

to check their sizes.
I found that x and in_channel each have two values:

    Tensor("trans_0/Relu:0", shape=(?, 6, 6, 72), dtype=float32)
    72
    Tensor("trans_1/Relu:0", shape=(?, 3, 3, 120), dtype=float32)
    120

From the above, it seems that *0.5 should also work, and shouldn't x and in_channel each have only one value?
I want to know why this problem happens.

Yes, I also have this problem. When I run Densenet_Cifar10.py, the program reports the following:

    Traceback (most recent call last):
      File "Densenet_Cifar10.py", line 221, in <module>
        logits = DenseNet(x=input_x, nb_blocks=nb_block, filters=growth_k, training=training_flag).model
      File "Densenet_Cifar10.py", line 111, in __init__
        self.model = self.Dense_net(x)
      File "Densenet_Cifar10.py", line 180, in Dense_net
        x = self.transition_layer(x, scope='trans_1')
      File "Densenet_Cifar10.py", line 140, in transition_layer
        x = conv_layer(x, filter=in_channel*0.5, kernel=[1,1], layer_name=scope+'_conv1')
    TypeError: unsupported operand type(s) for *: 'Dimension' and 'float'

Have you solved the problem?

Hello,
I also encountered this problem. I changed this line of code so that it runs:

    in_channel = x.shape[-1]

changed to

    in_channel = x.get_shape().as_list()[-1]

I also changed the code in the same place to make it run, but with a different change:

    in_channel = x.shape[-1]

changed to

    in_channel = int(x.shape[-1])
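
Both fixes above turn the channel count into a plain Python int before it is multiplied by 0.5. The sketch below imitates that without TensorFlow. `FakeDimension` is a hypothetical stand-in, not the real TensorFlow class: it converts to int but rejects multiplication by a float, which is the behaviour the traceback in this thread points to. `to_filter_count` is likewise an illustrative helper name.

```python
class FakeDimension:
    """Hypothetical stand-in for the object returned by x.shape[-1] in TF 1.x."""

    def __init__(self, value):
        self._value = value

    def __int__(self):
        return self._value

    def __mul__(self, other):
        # Only integer multiplication is supported; anything else is rejected,
        # which makes Python raise TypeError for `dimension * 0.5`.
        if not isinstance(other, int):
            return NotImplemented
        return FakeDimension(self._value * other)


def to_filter_count(in_channel, compression=0.5):
    """Convert to int first, scale, then convert back to an integer filter count."""
    return int(int(in_channel) * compression)


try:
    FakeDimension(72) * 0.5
except TypeError as err:
    print("fails:", err)

print(to_filter_count(FakeDimension(72)))  # 36
print(to_filter_count(120))                # 60
```

The outer int() matters as well: even when in_channel is already an int, in_channel * 0.5 is a float, and a filter count is expected to be an integer.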

Hello, I also ran into similar trouble: the validation accuracy is not high. I am using the Adam optimizer. Could you send the training code to my mailbox? Wlshi111@aliyun.com Thank you!

I also used * 0.5, but the code didn't work. You can try filter = in_channel/2. The effect is the same, and the outputs of print(x) and print(in_channel) are the same.

Maybe x and in_channel print two values because there are two transition layers (nb_block = 2).
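
To make that explanation concrete, here is a small sketch of the channel bookkeeping. The traceback earlier in the thread shows transition_layer being called as scope='trans_'+str(i), that is, once per block. The specific numbers below (24 channels after the first convolution, growth_k = 12, 4 layers per dense block, no compression) are assumptions chosen because they reproduce the 72 and 120 printed in this thread; check them against your own script. The function name is hypothetical.

```python
def trace_transition_inputs(first_channels, growth_k, layers_per_block,
                            nb_blocks, theta):
    """List the channel count each transition layer receives.

    Every loop iteration stands for one dense block followed by one
    transition layer, so nb_blocks iterations give nb_blocks entries.
    """
    channels = first_channels
    seen = []
    for i in range(nb_blocks):
        # A dense block concatenates growth_k new feature maps per layer.
        channels += layers_per_block * growth_k
        seen.append(("trans_%d" % i, channels))
        # The transition layer keeps a fraction theta of the channels.
        channels = int(channels * theta)
    return seen


print(trace_transition_inputs(24, 12, 4, nb_blocks=2, theta=1.0))
# [('trans_0', 72), ('trans_1', 120)]
```

A print placed inside transition_layer runs once per call while the graph is built, which would explain seeing two tensors and two channel counts rather than one.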

Sir, I ran this example and found what may be an error. The failing code is "x = conv_layer(x, filter=in_channel * 0.5, kernel=[1, 1], layer_name=scope + '_conv1')", which raises "TypeError: unsupported operand type(s) for *: 'Dimension' and 'float'". I can't solve it, so I would appreciate some ideas.