mhauskn/dqn-hfo

Understanding Gradient Inversion

Opened this issue · 6 comments

Sorry if this is a stupid question. I am trying to implement gradient inversion in PyTorch based on the paper, but I would like to ask for some clarification. Is the inversion applied to all the layers, or just to the last layer? If it's the former, we would have to keep the output of each layer.

Thanks a lot for your help in advance

The Inverted Gradients technique is applied to the back-propagation gradients output by the critic before they are applied to the actor. The relevant code is in dqn.cpp, lines 922-965 (starting at `DLOG(INFO) << " [Backwards] " << critic_net_->name();`).

I'm implementing a similar algorithm in Keras. I'll be interested in seeing your PyTorch implementation when you have it up and running.
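For anyone porting this, the per-parameter inverting-gradients rule from the paper can be sketched in plain NumPy. This is a minimal sketch, not the repository's code; the function name and bound arguments are illustrative. A gradient that would push a parameter up is scaled by the remaining headroom below the upper bound, and one that would push it down is scaled by the remaining room above the lower bound, both normalized by the range:

```python
import numpy as np

def invert_gradients(grads, params, p_min, p_max):
    """Hedged sketch of the inverting-gradients rule (Hausknecht & Stone).

    grads  : gradients w.r.t. the bounded action parameters
    params : current values of those parameters
    p_min, p_max : scalar bounds of the parameter range
    """
    grads = np.asarray(grads, dtype=float)
    params = np.asarray(params, dtype=float)
    rng = p_max - p_min
    # Scale factor when the gradient suggests increasing the parameter:
    up = (p_max - params) / rng
    # Scale factor when the gradient suggests decreasing the parameter:
    down = (params - p_min) / rng
    return np.where(grads > 0, grads * up, grads * down)
```

As the parameter approaches a bound, the gradient pushing it toward that bound shrinks to zero (and flips sign past the bound), which keeps the actor's outputs inside the valid range without hard clipping.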


Hi,
can you share your implementation?
If not, could you share your experiment results?

This code is very raw, and there are some problems with the learning, which is slow and unstable. Specifically, the 4 discrete action values (used to probabilistically select which of the four actions will be executed) eventually all move close to 1.0 and fluctuate slightly, making it too easy for the agent to select the wrong action.

The code is available in this repository:
https://github.com/wbwatkinson/ddpg-hfo-python

You should be looking at lines 488-507 (https://github.com/wbwatkinson/ddpg-hfo-python/blob/6989b849eb9b90e03fbecaf49463a11505ab92bf/src/ddpg.py#L488). I'm a bit new to Python, so no guarantees that this is pythonic. Also, as mentioned, I have at least one error somewhere in the code, but I don't think it is in the inverting gradients algorithm. I welcome any feedback you or anyone else has.

I had an error in the code... corrected now (https://github.com/wbwatkinson/ddpg-hfo-python). Unless there are other questions about the inverting gradients algorithm, I recommend closing this.


Hi, can you tell me which error you corrected?

I think it would be best to discuss the specifics of the Python code in the other repository. That said, I made two changes that stabilized learning.

  1. Correction to inverting gradients algorithm (I had a sign error in the calculation):
    wbwatkinson/ddpg-hfo-python@357170d#r33838064
  2. Added gradient clipping
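For illustration, gradient clipping of the kind mentioned in the second point can be sketched as global L2-norm clipping. This is a generic sketch, not the code from ddpg-hfo-python; the function name and threshold are illustrative:

```python
import numpy as np

def clip_gradients(grads, max_norm):
    """Rescale gradients so their global L2 norm does not exceed max_norm.

    Illustrative sketch of norm-based gradient clipping; leaves gradients
    unchanged when they are already within the threshold.
    """
    grads = np.asarray(grads, dtype=float)
    norm = np.linalg.norm(grads)
    if norm > max_norm:
        grads = grads * (max_norm / norm)
    return grads
```

Clipping by norm (rather than clipping each component independently) preserves the direction of the update while bounding its magnitude, which tends to stabilize DDPG-style training when critic gradients occasionally spike.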