RealVNF/distributed-drl-coordination

Question about SPRPolicy

burnCalories opened this issue · 6 comments

This program is great, with excellent success rates and speed. I also tried the acer and ppo2 algorithms: acer gets very close to the success rate of the acktr algorithm, while ppo2 is relatively low. Recently, I have been reading the source code, but I don't understand neural networks very well, so I don't quite understand the rewritten MLP policy:
class SPRPolicy(FeedForwardPolicy):
    """ Custom policy. Exactly the same as MlpPolicy but with different NN configuration """

    def __init__(self, sess, ob_space, ac_space, n_env, n_steps, n_batch, reuse=False, **_kwargs):
        self.params: Params = _kwargs['params']
        pi_layers = self.params.agent_config['pi_nn']
        vf_layers = self.params.agent_config['vf_nn']
        activ_function_name = self.params.agent_config['nn_activ']
        activ_function = eval(activ_function_name)
        net_arch = [dict(vf=vf_layers, pi=pi_layers)]
        super(SPRPolicy, self).__init__(sess, ob_space, ac_space, n_env, n_steps, n_batch, reuse,
                                        net_arch=net_arch, act_fun=activ_function,
                                        feature_extraction="spr", **_kwargs)
I found that deep coord seems to use a similar neural network structure. Can you briefly describe why you rewrote the policy and what the advantages of doing so are? Looking forward to your reply :)
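
For context, my (possibly wrong) understanding is that the subclass mainly injects the pi/vf layer sizes and the activation function from the agent config into net_arch. As far as I can tell, something similar could be done with the standard MlpPolicy and policy_kwargs, roughly like this (my own untested sketch with placeholder values, not the repo's actual config):

# Untested sketch: a custom net_arch passed to the standard MlpPolicy via policy_kwargs.
# Layer sizes, activation, and the environment are placeholders, not the repo's values.
import gym
import tensorflow as tf
from stable_baselines import ACKTR

env = gym.make("CartPole-v1")  # placeholder for the coordination environment

policy_kwargs = dict(
    net_arch=[dict(pi=[64, 64], vf=[64, 64])],  # separate policy and value networks
    act_fun=tf.nn.relu,
)
model = ACKTR("MlpPolicy", env, policy_kwargs=policy_kwargs, verbose=1)
model.learn(total_timesteps=10_000)

So is the subclass mainly there to keep this configuration in one place (plus the "spr" feature extraction), or is there another advantage?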

Hi, @stefanbschneider. I have now understood the question above. Recently, I attempted to upgrade some dependency packages and code of distributed-drl-coordination to a PyTorch + Gym environment and tried the PPO+GAE algorithm of stable_baselines3. I found that this algorithm achieves results similar to the acktr algorithm and even slightly improves the success rate on the real-world trace.
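
Roughly, the setup I tried looks like this (a simplified, untested sketch; the environment, layer sizes, and hyperparameters below are placeholders rather than my exact values):

# Simplified sketch of a stable_baselines3 PPO(+GAE) setup with placeholder values.
# gym.make("CartPole-v1") stands in for the actual coordination environment.
import gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")

model = PPO(
    "MlpPolicy",
    env,
    gae_lambda=0.95,  # GAE; PPO in stable_baselines3 uses GAE by default
    policy_kwargs=dict(net_arch=[dict(pi=[64, 64], vf=[64, 64])]),  # net_arch format differs slightly between SB3 versions
    verbose=1,
)
model.learn(total_timesteps=100_000)
model.save("ppo_coordination")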

Hi~ When I run pip install -r requirements.txt, it always times out.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet git://github.com/RealVNF/common-utils /home/guohao/PycharmProjects/codes/venv/src/common-utils did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet git://github.com/RealVNF/common-utils /home/guohao/PycharmProjects/codes/venv/src/common-utils did not run successfully.
│ exit code: 128
╰─> See above for output.
How can I solve this problem? Thank you!

@burnCalories Great to hear! Does that mean your question is resolved?

If you have a project using our distributed-drl-coordination code, let me know and I am happy to reference it in our readme!

@BMDACMER Hi, thanks for raising the problem. Since this seems unrelated, I moved it into a separate issue and will look into it there: #3

@stefanbschneider Yes, I've figured out the problem. I am very willing to share my project with you. Currently, I have not started writing my paper yet and am still experimenting. Should I open a pull request as the PyTorch version of distributed-drl-coordination?
I am also currently trying to port it to a Ray (RLlib) version, because Ray offers many more algorithms, including hierarchical and multi-agent ones. However, it is hard for me to understand how Ray trains, saves, and tests agents, so I am still learning. The good news is that I have seen your relevant answers in Ray's issue tracker, which is great! But I have not been able to complete the conversion yet.

I agree, Ray RLlib provides a lot more features and I am currently mostly using Ray for RL. But due to its many features it takes a while to get used to.
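
If it helps, the basic train/save/restore cycle in RLlib looks roughly like this (just a sketch with a placeholder environment; the exact API and return types differ between Ray versions):

# Rough sketch of RLlib's train/save/restore loop; details vary across Ray versions
# and "CartPole-v1" is only a placeholder for the coordination environment.
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # placeholder env
    .framework("torch")
)
algo = config.build()

for _ in range(10):
    algo.train()             # one training iteration

checkpoint = algo.save()     # checkpoint path/object (return type depends on the Ray version)
algo.restore(checkpoint)     # load the agent again for evaluation/testing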

Just open an issue with the link to your project/paper once it is ready and I am happy to include it in the Readme :) Good luck!

I'll close this issue in the meantime.