Gradient of the mask with respect to importance map
Hello,
I'm trying to implement equation (7) in the paper, but I'm having some trouble.
- What is the shape of the gradient?
- Can you post a pseudo-code of how that equation should be implemented?
Thanks!
The gradient of the importance mask has shape n × 64 × (h/8) × (w/8), where n is the batch size and h and w are the height and width of the input image. The gradient of the importance map has shape n × 1 × (h/8) × (w/8). Equation (7) is used to map the gradient of the importance mask back to the importance map. Let the importance map value be x; then [x × 16] is the quantized importance value, where [.] means float2int. The gradient of x is obtained by adding up the gradients of the importance mask over the channels from [x × 16] × 4 - 4 to [x × 16] × 4 + 4, a window of size 8.
Thanks for the reply.
You wrote that "the gradient of the importance mask has shape n × 64 × (h/8) × (w/8)" and then "the gradient of the importance map has shape n × 1 × (h/8) × (w/8)". Can you please clarify?
As for my understanding (please correct me if I'm wrong):
- The quantized importance map has shape [batch_size, 1, h/8, w/8] and its values can be one out of L=16 quantized values.
- The mask has shape [batch_size, n=64, h/8, w/8] and its values are either 0 or 1.
- So, I assume that the gradient wrt the importance map has the same shape as the importance map, that is [batch_size, 1, h/8, w/8].
In your final example, "The gradient of x is obtained by adding up the gradients of the importance mask over the channels from [x × 16] × 4 - 4 to [x × 16] × 4 + 4, a window of size 8", I didn't understand what the final value of the gradient wrt x is (i.e., p_ij in the equation). In the equation you use L as the final value, so does that mean you assign 16 as the gradient wrt p_ij?
Thanks again!
To begin with, 16 is not the gradient. It is a constant which always acts to reduce the importance map value p_ij towards 0.
Let the importance mask w.r.t. p_ij be the vector [m_0ij, ..., m_63ij], and let the gradient of the importance mask be [f_0ij, ..., f_63ij]. q_ij = Q(p_ij × 16) is the quantized importance value. Then the gradient of p_ij is g(q_ij) = f_(q_ij-4)ij + ... + f_(q_ij+3)ij. According to equation (7), you just need to multiply the gradient by the constant L, that is g(q_ij) × L.
OK thanks! So, the gradient wrt p_ij is:
g(q_ij) = L * ( f(q_ij-4) + ... + f(q_ij+3) )
Correct?
And, for the case in which L = n (i.e., the importance map is quantized to the same number of levels as the number of channels of the encoder's output), the gradient wrt p_ij is:
g(q_ij) = n * ( f(q_ij-1) + f(q_ij) + f(q_ij+1) )
Correct?
Yes
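For concreteness, here is a minimal sketch of how the confirmed formula could be written (PyTorch is used only for illustration; the function name importance_map_grad is mine, the window indices q_ij - 4 ... q_ij + 3 are taken literally from the exchange above, and the clamping of the window at the channel borders is my own assumption, not something stated in the thread):

```python
import torch

def importance_map_grad(mask_grad, q, L=16):
    """Map the gradient of the importance mask back to the importance map.

    mask_grad: gradient w.r.t. the importance mask, shape (n, 64, h/8, w/8)
    q:         quantized importance values,         shape (n, 1, h/8, w/8)
    Returns the gradient w.r.t. the importance map, shape (n, 1, h/8, w/8),
    computed as g(q_ij) = L * (f_{q_ij-4, ij} + ... + f_{q_ij+3, ij}).
    """
    n_channels = mask_grad.shape[1]
    offsets = torch.arange(-4, 4, device=mask_grad.device)            # 8 channel offsets
    idx = (q.long() + offsets.view(1, -1, 1, 1)).clamp(0, n_channels - 1)
    window = torch.gather(mask_grad, 1, idx)                          # (n, 8, h/8, w/8)
    return L * window.sum(dim=1, keepdim=True)                        # (n, 1, h/8, w/8)
```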
I'm still pretty confused by the above description.
(1) Where did the function f(q_ij) come from? What does it represent?
(2) The claim is that the derivative of m_kij with respect to p_ij is:
g(q_ij) = n * ( f(q_ij-1) + f(q_ij) + f(q_ij+1) )
What does this line mean? Is q_ij - 1 the (i,j)-th element of q, minus 1? Let's say q_ij = 7. Is the (i,j)-th element of the gradient ∂m/∂p equal to n * ( f(6) + f(7) + f(8) )?
As I mentioned above, q_ij is the quantized value of p_ij in the importance map. It is an integer and serves as an index. We use it to set the importance mask m at position (i,j): the channels m_{k,i,j} are set to 0 if k >= q_ij. f(k) represents the gradient with respect to m_{k,i,j}.
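As a minimal sketch (again PyTorch for illustration; the function name is mine), the forward construction of the mask described here could look like the following. It follows the comment literally (m = 0 for k >= q_ij), which matches the L = n case; with 64 channels and L = 16 levels, the threshold may additionally need the channels-per-level factor mentioned in the earlier comment.

```python
import torch

def build_importance_mask(q, n_channels=64):
    """Build the binary importance mask from the quantized importance map q.

    q: quantized importance values, shape (batch, 1, h/8, w/8), integer indices.
    Returns a mask of shape (batch, n_channels, h/8, w/8) with
    m[k, i, j] = 0 if k >= q_ij, and 1 otherwise, as described above.
    """
    k = torch.arange(n_channels, device=q.device).view(1, -1, 1, 1)  # channel indices
    return (k < q).float()
```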
Thank you for the above reply. To check that I understand you, can you verify an example for me? Let's say we have an importance map, p, with shape [14,14,1], a quantized map q(p) with shape [14,14,1], and an importance mask M(q(p)) with shape [14,14,8].
Let some arbitrary p_ij = 0.50
Then, q(p)_ij = 0.50 * L = 4 (with L = 8 in this example)
and M(q(p))_ij = [1, 1, 1, 1, 0, 0, 0, 0]
and the gradients of M with respect to q would be:
M'(q(p))_ij = [0, 0, grad * L, grad * L, 0, 0, 0, 0]
You are right.
How do you apply quantization, given that it is non-differentiable and makes the gradient zero?
As indicated in the paper, we employ the quantization operation in forward-propagation and the identity function f(x) = x in back-propagation.
I am not sure which quantization you are asking about: the one for the importance map or the one for the codes. By the way, could you report the errors in detail? Maybe I could help to figure out the problem.
I see the problem. I guess you are using a framework with automatic differentiation. If so, you should define your own quantization operation and write your own backward function. In the backward function, just copy the gradient from the next layer and pass it back to the previous layer.
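For illustration, here is a minimal sketch of such a custom operation in PyTorch (the thread does not say which framework is actually used, and the class name QuantizeSTE is mine). The forward pass quantizes as q_ij = round(p_ij × L) and the backward pass simply copies the gradient through, as described above.

```python
import torch

class QuantizeSTE(torch.autograd.Function):
    """Quantization with a straight-through backward pass.

    Forward: round the importance map p (values in [0, 1]) to one of L levels.
    Backward: copy the incoming gradient unchanged to the previous layer, i.e.
    treat the quantizer as the identity function during back-propagation.
    """

    @staticmethod
    def forward(ctx, p, L=16):
        return torch.round(p * L)

    @staticmethod
    def backward(ctx, grad_output):
        # Copy the gradient from the next layer straight to the previous layer;
        # the second return value is the (non-existent) gradient for L.
        return grad_output, None

# Usage sketch
p = torch.rand(1, 1, 16, 16, requires_grad=True)  # importance map
q = QuantizeSTE.apply(p, 16)                      # quantized importance values
q.sum().backward()
print(p.grad)  # all ones: the gradient passed through the quantizer unchanged
```

In TensorFlow/Keras the same idea can be expressed with tf.custom_gradient, returning the identity as the gradient of the quantizer.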
It seems Keras is built on TensorFlow. Please check https://github.com/tensorflow/compression for reference. I am not quite familiar with Keras, but I hope this project is helpful.
Respected Sir, can you please make this link active again? Earlier it was active, but now it is not working: http://www2.comp.polyu.edu.hk/~15903062r/content-weighted-image-compression.html I will be grateful to you.
Sorry that the link is inactive. As far as I know, our department server is currently being upgraded. Considering these days are holidays in Hong Kong, it may take a few days before the server resumes.
Actually, I have no idea about the details of the upgrade schedule, so it is hard to say how much time it will take.