guicho271828/latplan

Hi, could you please explain more for your code?

RichieLee93 opened this issue · 26 comments


I just read your paper "Classical Planning in Deep Latent Space: Bridging the Subsymbolic-Symbolic Boundary" and found it interesting. Now I want to run your model to see the structure and the results, but there seem to be a lot of files... Could you give me more details on running the model? For example, in what sequence should the Python files be run, what is the goal of each, and what results should each one generate?

Thank you very much

I wonder if you have checked the latest commits. The README now describes the workflow in more detail.

Hi, how is it going?


Hi,

Sorry for my late reply. I have tried to run your model, but I'm still a bit confused about the pipeline:

  1. After running setup-dataset.sh, it downloads some datasets, but when I run train_all.sh, it downloads the dataset from AWS again.

  2. Also, since I'm using a GPU-equipped laptop, I run just one game to avoid damaging my laptop. When I run:
    ./strips.py $mode puzzle mnist 3 3 {} "planning" ::: 5000 ::: None ::: None ::: None ::: False ::: ConcreteDetNormalizedLogitAddEffectTransitionAE
    What is the goal of this command? Does it generate the SAE for AMA2? How can I make it smaller so that it runs faster or uses less RAM? When I run it, my laptop gets really hot and noisy.

  3. After that, is the next step building the AMA2 model? Should I run run_ama2_all.sh?

Thank you

By the way, I set a smaller sample size of 500, batch size 100, and 100 epochs, but it still seems to take a long time. If I only want to run the AMA2 model to see its result, do I need to run train_all.sh first? If not, what sequence of scripts should I run, one by one?

Thank you, and sorry for taking your time.

Don't feel sorry; in fact, I have an incentive to increase the number of people interested in my paper. Like every author does 😎

After running setup-dataset.sh, it downloads some datasets, but when I run train_all.sh, it downloads the dataset from AWS again.

Oh, that sounds bad. But I need more detailed information to debug that.

Also, since I'm using a GPU-equipped laptop, I run just one game to avoid damaging my laptop.

No, that does not break your machine. When the fan spins up and makes noise, it is a good sign. I would be more worried if it did not spin up, because that would mean the fan is malfunctioning. In the end, there is always thermal throttling. Tell me what your machine is, btw.

./strips.py $mode puzzle mnist 3 3 {} "planning" ::: 5000 ::: None ::: None ::: None ::: False ::: ConcreteDetNormalizedLogitAddEffectTransitionAE
What is the goal of this command? Does it generate the SAE for AMA2?

This replicates my new IJCAI paper. You need to read it first.
It is faster, simpler, and stronger. Trash AMA2.

How can I make it smaller so that it runs faster or uses less RAM?

You can make the hyperparameters smaller, but the accuracy will drop. There is no easy way to do this; model compression etc. is a research topic in itself.

When I run it, my laptop gets really hot and noisy.

When you run a compute-intensive task on a computer, that is expected behavior. That is how a fully working computer should behave. The fact that you are surprised by it means that you have never pushed your computer to its limit before.

After that, is the next step building the AMA2 model? Should I run run_ama2_all.sh?

If you want to train AMA2 (despite there being a much better model in the latest paper), then uncomment this line: https://github.com/guicho271828/latplan/blob/master/train_all.sh#L156
This one trains the SAE for AMA2. It reuses the best hyperparameters I found (this is called the reproduce mode), so it does not have to redo all the tuning I did. The network is "VanillaTransitionAE", which processes two states at once, but it is just two SAEs (sharing weights) running in parallel, so its network is identical to an SAE. Just look at the source code and it should be easy to understand, if you are already familiar with neural networks, Python, and code reading in general.
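
To illustrate the weight sharing: the following is a minimal sketch in the Keras functional API, not the actual latplan code (the real SAE uses a Gumbel-Softmax discrete latent layer, not a plain dense bottleneck). Applying the same layer objects to both inputs is what makes the twin network share weights:

# Minimal sketch of a weight-sharing "twin" autoencoder, NOT the actual latplan code.
from tensorflow.keras import Input, Model, layers

def build_twin_ae(input_dim=48*48, latent_dim=36):
    encode = layers.Dense(latent_dim, activation="sigmoid")  # one set of weights
    decode = layers.Dense(input_dim, activation="sigmoid")   # one set of weights
    s0 = Input((input_dim,))  # pre-transition state
    s1 = Input((input_dim,))  # post-transition state
    # the same encode/decode objects process both states in parallel,
    # so this network has exactly the parameters of a single autoencoder
    return Model([s0, s1], [decode(encode(s0)), decode(encode(s1))])

model = build_twin_ae()
model.compile(optimizer="adam", loss="binary_crossentropy")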

After training the SAE, AMA2 needs two or three additional networks (AAE, AD, and optionally SD). The script for it is in train_aae.sh, as documented in the readme.
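
If it helps, here is how the pieces fit together at planning time, as a conceptual sketch only; the method names below are made up for illustration and do not exist in latplan:

# Conceptual sketch of how AMA2's networks cooperate during search.
# NOT the actual latplan code; all method names here are hypothetical.
def successors(state, aae, ad, sd):
    # the AAE proposes candidate successor states, one per learned action label
    candidates = aae.apply_all_actions(state)
    # the AD (action discriminator) rejects invalid transitions
    candidates = [s for s in candidates if ad.is_valid_transition(state, s)]
    # the SD (state discriminator, optional) rejects invalid states
    return [s for s in candidates if sd.is_valid_state(s)]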

By the way, I set a smaller sample size of 500, batch size 100, and 100 epochs, but it still seems to take a long time.

Sample size 500 would be way too small to learn anything.

With a decent GPU, one training run with 5000 num_examples should finish in about two hours. However, on a laptop, I highly doubt it finishes that fast. I have a GeForce 1070, not the latest model.

Finally, the reproduce mode trains the same network 3 times and takes the best result in order to get a consistent result. If the training time matters, you can reduce it to just 1 iteration by adding limit = 1 to https://github.com/guicho271828/latplan/blob/master/strips-vanilla.py#L160 .
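
Schematically, the restart logic is just best-of-N training, something like this (a sketch, not the actual latplan code):

# Schematic best-of-N restart training, NOT the actual latplan tuning code.
def train_best_of_n(build_model, x_train, x_val, n=3):
    best_model, best_loss = None, float("inf")
    for seed in range(n):                    # the reproduce mode uses n = 3
        model = build_model(seed)            # fresh random initialization
        model.fit(x_train, x_train, epochs=100, verbose=0)  # autoencoder: target = input
        loss = model.evaluate(x_val, x_val, verbose=0)
        if loss < best_loss:                 # keep only the best restart
            best_model, best_loss = model, loss
    return best_model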

If I only want to run the AMA2 model to see its result, do I need to run train_all.sh first? If not, what sequence of scripts should I run, one by one?

Hmm, but this is a good question -- would you be happy if I provided the trained weights? I should consider this.

Hi, thank you for your detailed reply.

Tell me what your machine is, btw.
Currently I'm using an Ubuntu 18.04 system with a 12-core i7 CPU and an RTX 2080 Max-Q GPU.

It is faster, simpler, and stronger. Trash AMA2.
Congrats on your new paper! I will read it, since it could be better than AMA2.

You can make the hyperparameters smaller, but the accuracy will drop. There is no easy way to do this; model compression etc. is a research topic in itself.
I have changed the batch size to 64, which means it will take longer... But since I just want to run the model to see how the whole pipeline works and then apply it to my own data, accuracy is not so urgent for me.

Now, since my computing resources are limited, I run just one game (the MNIST 8-puzzle) instead of running all the games in parallel, which means I will comment out most of the lines in train_all.sh and keep only:
task-planning(){
./strips.py $mode puzzle mnist 3 3 {} "planning" ::: 5000 ::: None ::: None ::: None ::: False ::: ConcreteDetNormalizedLogitAddEffectTransitionAE
}
task-planning reproduce_plot_dump_summary

Also, what is the difference between task-planning and task-vanilla? Which one should I use for training the SAE?
I changed num_examples back to 5000 as you suggested, with batch size 64 and 100 epochs, to see how long it will take.

Hmm, but this is a good question -- would you be happy if I provided the trained weights? I should consider this.

Maybe I should try to train it myself first; if my laptop really cannot handle it (e.g., if I run into an OOM issue), I will turn to you for help.

Thank you so much!

If I may drop into the discussion with a semi-random comment: while your laptop GPU looks great, my experience is that if you are using it on a computer running a windowing system, connected to an external monitor, you will use up almost 2 GB of that memory just to run the UI (more if you are using more than one 4K monitor). You will also constantly use a lot of the bandwidth just to run the UI. If you have access to a desktop that you can run headless, this should make a lot of difference for the same configuration.
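
If you want to see how much of the GPU memory is already taken before you start training, something like this shows it (assuming the pynvml package is installed; nvidia-smi on the command line gives the same information):

# Print current GPU memory usage. Requires: pip install pynvml
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU
info = pynvml.nvmlDeviceGetMemoryInfo(handle)  # .used/.free/.total in bytes
print(f"used {info.used / 2**20:.0f} MiB of {info.total / 2**20:.0f} MiB")
pynvml.nvmlShutdown()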


Thank you for your advice. Currently I'm not using any external monitor, only the small screen of my laptop. Also, I can access a desktop remotely, but that desktop does not have a GPU, so I'm afraid it won't be better than the laptop with a GPU. I'll try to use a smaller batch size and sample size, even if it will be much slower :)

Re: the GPU, even my 4-year-old 1070 is still twice as fast as yours. They are selling used for less than $200 on eBay. I believe upgrading your desktop GPU is a much better option.
https://gpu.userbenchmark.com/Compare/Nvidia-RTX-2080-Mobile-Max-Q-vs-Nvidia-GTX-1070/m704710vs3609

Oops, the numbers were the upvotes, not the scores. Hmm, your 2080 actually seems to be faster. What is the difference then?

Hi,

Hmm, your 2080 actually seems to be faster. What is the difference then?

Yeah, it runs faster, even though the computer gets really hot :)

Again, may I ask what the difference is between task-planning and task-vanilla? Which one should I use for training the SAE?
Also, since your new paper introduces a new model to replace AMA2, may I ask how to run it?

Thank you

task-planning is for Cube-Space AE, task-vanilla is for training an SAE.

task-planning is for Cube-Space AE, task-vanilla is for training an SAE.

Then I still need to run task-vanilla first and then task-planning, since only after we have the symbolic representations can we generate actions via the AE?

But the thing is, I can still run task-planning without running task-vanilla first. Is it not a sequence? Can we run the AE without training the SAE?

If you have read the paper, it should be pretty obvious that the Cube-Space AE and the SAE are separate and neither depends on the other.

Thank you for the reply. Now I am training the AAE for AMA2 using train_aae.sh.
But I find that in the bash file you set the number of actions to None. When this parameter is passed to action_autoencoder.py, the variable "aae" ends up empty, which causes the "no attribute named 'parameter'" error at line 95 of action_autoencoder.py. Could you check lines 73-95 of action_autoencoder.py? Should I set num_actions to a number instead of None?

num_actions being None is not a problem, since it tells the script to tune the hyperparameter "M" over a list of values. https://github.com/guicho271828/latplan/blob/master/action_autoencoder.py#L42

In the train mode, when a particular value is specified, it fixes the parameter. https://github.com/guicho271828/latplan/blob/master/action_autoencoder.py#L75

In the reproduce mode, the value of "M" is read from the hyperparameter JSON.
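
Schematically, the three modes resolve "M" like this (a rough sketch, not the actual action_autoencoder.py code; the candidate list below is made up):

# Rough sketch of how M (the number of actions) is resolved per mode.
# NOT the actual action_autoencoder.py code; the candidate list is hypothetical.
import json

def resolve_M(mode, num_actions, log_path="aux.json"):
    if mode == "reproduce":
        with open(log_path) as f:          # reproduce: read the best value
            return json.load(f)["M"]       # from the committed hyperparameter log
    if num_actions is None:                # None: tune M over a list of candidates
        return [100, 200, 400, 800]
    return int(num_actions)                # a concrete value fixes M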

Did you actually run the script? Which line did you uncomment in train_aae.sh?

Did you actually run the script? Which line did you uncomment in train_aae.sh?

I uncommented lines 20-25 in train_aae.sh, but changed line 22 from $base/*/ to $base/puzzle_mnist_3_3_5000_None_None_None_False_VanillaTransitionAE_vanilla/ because I only trained the SAE for puzzle mnist in train_all.sh rather than for all games.
Then I checked that best["artifact"] in tuning.py is None, which means it does not return the aae parameters as expected.

Hi, I found that the length of the open_list in tuning.py is around 109, which is much larger than initial_population or limit (they are 20 and 100). Since I am only running the mnist puzzle, is this normal? Also, I observed that best['artifact'] in tuning.py, which is the same as "aae" in action_autoencoder.py, is never updated. Is this because the open_list is too large, or is it some other issue?

If you use the reproduce mode, it loads up the history of hyperparameters and picks the best one. The best parameter should not be updated.
It seems that I accidentally committed 109 logs instead of 100.
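
Conceptually, the reproduce mode does nothing more than this (a hedged sketch, not the actual tuning.py):

# Conceptual sketch of the reproduce mode, NOT the actual tuning.py code.
# It loads the committed hyperparameter history and picks the best entry,
# so "best" is fixed by the log and is never updated during the run.
import json

def load_best(history_path):
    with open(history_path) as f:
        history = json.load(f)  # e.g. a list of {"parameters": ..., "loss": ...}
    return min(history, key=lambda entry: entry["loss"])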

Then how can I fix it?

Thank you, and sorry for taking your time.

Actually, I tried changing the limit to 110, but I still got an empty aae. Are there any other methods?

Bit busy with NeurIPS, sorry.

Today I am running a couple of major updates.

  • Updated the installation procedure. Now I recommend a conda-based installation, which specifies the correct Keras + TF versions.
  • The repository is now reorganized so that the library code goes into the latplan directory and all the launcher code remains in the root directory.
  • I reran my scripts and they run without error; the training loss also gets quite close to zero with the reproduce mode.

These will be uploaded soon.

Closing due to the latest release v5.0.0.