shidi1985/L2RPN

Reproduce results reported in your paper 'AI-Based Autonomous Line Flow Control via Topology Adjustment for Maximizing Time-Series ATCs'

tanzeyy opened this issue · 4 comments

Hello,
We have recently been trying to reproduce the results reported in your paper 'AI-Based Autonomous Line Flow Control via Topology Adjustment for Maximizing Time-Series ATCs'. However, we have run into several issues that may stem from our misunderstanding of the paper. We hope you can provide more details to help us close the gap between our results and yours. Thanks!

Here are our concerns:

1. Details about how to divide the 28-day scenarios into single days

Our first concern is how to divide the 28-day scenarios into single days, as mentioned in Section IV.C. As we understand it, since each day already consists of 288 timesteps, we simply split each scenario by date timestamp, where each timestamp corresponds to one row in each of the other .csv files. A single divided day therefore contains only its own timesteps, from 00:00 to 23:55. In _N_datetimes.csv, the rows look like:

date time
2018-Jan-01 00:00
... ...
2018-Jan-01 23:55

The rows in the other .csv files are then split into separate .csv files in the same way. Note that there is one special date, Jan-29, which has only one timestep; we simply remove it from all .csv files to keep the logic simple.

We divided all the scenarios in the public dataset provided by the competition host using the above method, then shuffled all the single days into one large dataset. We use this large dataset for the training and testing described next. Is there anything wrong with our way of dividing the dataset? Feel free to correct us!
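For readers following along, the splitting rule described above can be sketched in a few lines. This is a minimal illustration, not the code used in the paper or in this repository: the function name `split_rows_by_day` is hypothetical, and the input is assumed to be the `date` column of the datetimes file already loaded as a list of strings, one per timestep.

```python
from collections import OrderedDict

STEPS_PER_DAY = 288  # 24 h at 5-minute resolution, as stated in the issue


def split_rows_by_day(dates, steps_per_day=STEPS_PER_DAY):
    """Group row indices by calendar date and keep only complete days.

    `dates` holds one date string per timestep, in file order.
    Returns an ordered mapping {date: [row indices]}.
    """
    groups = OrderedDict()
    for row, date in enumerate(dates):
        groups.setdefault(date, []).append(row)
    # Drop incomplete days, such as the single-timestep Jan-29 entry.
    return OrderedDict(
        (date, rows) for date, rows in groups.items() if len(rows) == steps_per_day
    )


# Toy input: two full days followed by a one-row trailing day.
dates = ["2018-Jan-01"] * 288 + ["2018-Jan-02"] * 288 + ["2018-Jan-03"]
days = split_rows_by_day(dates)
print(list(days))
```

The returned row indices can then be used to slice every other .csv file of the same scenario, so that all files stay aligned row by row.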

2. Evaluation metrics used in the figures

Next, we want to reproduce Fig. 5(b) in your paper using the dataset we obtained and the pre-trained weights you provided. Because the reward label on the y-axis is not described, we looked through your code and found that the closest indicator is reward_tot (https://github.com/shidi1985/L2RPN/blob/master/pypownet/runner.py#L657). Using this as the reward label, we found it hard to get a training curve like the one in Fig. 5(b). We may have misunderstood this indicator and hope you can correct us!

3. Reproduce Table I

We also tried to reproduce Table I on our divided dataset, using the code and weights you provided. We randomly picked 200 single-day scenarios from the large dataset as our test set, then ran your agent with the provided code and the weights in the example_submission folder. However, there are always several Game Overs in our tests, at a rate of around 7% (e.g., 14 out of 200). We also tried our own trained agent and other hyperparameters, but still could not get a good result.

In summary, we think these issues may be caused by our misunderstanding of how to divide the dataset. We hope you can give us some hints on this. Thanks, and we look forward to your reply!

Hi,
I will try to answer your questions.
Questions 1 and 3 are actually very similar. Your method is one valid approach. The problem is that when you slice the monthly data into days this way, you need to guarantee that the agent does not hit a game over at the very beginning of a day. For example, if the right action is not taken at the end of the first day, the game may be over immediately at the beginning of the second day, so such cases should be avoided. Another method is to shrink the monthly data to daily data, that is, to have the agent run every 10-20 time steps.
For the second question, the data should be generated in the generate_rewards function of runner.py. There are three different y-axis quantities: reward, score, and tot. I think we used the score in the end. You can try it.
One more important thing: I think Pypower has been upgraded since L2RPN 2019, which may lead to different results and may require more time for tuning and testing. However, the overall structure of the agent we used in the competition is exactly what we have presented here on GitHub.
Best,
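As a rough illustration of the "shrink monthly data to daily data" suggestion above, one possible reading is to keep only every k-th timestep of a 28-day scenario. The thread does not say how the shrinking was actually done, so the sketch below is an assumption, and the helper name `strided_rows` is hypothetical.

```python
def strided_rows(n_rows, stride):
    """Row indices kept when taking one timestep out of every `stride`."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return list(range(0, n_rows, stride))


n_rows = 28 * 288  # 28 days at 288 timesteps per day = 8064 rows
for stride in (10, 20, 28):
    # Print the stride and the resulting episode length in timesteps.
    print(stride, len(strided_rows(n_rows, stride)))
```

Under this reading, a stride in the 10-20 range gives episodes of several hundred steps, while a stride of 28 gives exactly 288 steps, the length of one day.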

Hi,
Thanks for your reply @djmax008 !
We ran some further experiments and still have some doubts. For the first problem, we selected the first day of each scenario, since it is not affected by actions taken the day before, and used these first days as our evaluation set. However, we still got 3% Game Overs (6 out of 200), which still doesn't match your results.
Also, the version of PyPower is the same as in your requirements.txt, as are the other packages.

Hi,
We also tried your default settings in Run_and_Submit_agent.py. We changed NUMBER_ITERATIONS in line 72 to 288, and n_test to 50. There was still 1 Game Over.
We grepped for Game Over in the output, as follows:

Chronic: 31, Game Over: 0, Score: 4077.872, Time: 18.729, Mean Score: 4457.312, Mean Time: 10.995
!!!!!!!!!!!!!!!!!!!!!!!!!!!!Game Over!!!!!!!!!!!!!!!!!!!!!!!!!!
Total Game Over: 1, time used: 12.3212
Chronic: 32, Game Over: 1, Score: 4237.528, Time: 12.321, Mean Score: 4322.242, Mean Time: 11.035
Total Game Over: 0, time used: 16.6654
Chronic: 33, Game Over: 0, Score: 4197.998, Time: 16.665, Mean Score: 4318.588, Mean Time: 11.201
Total Game Over: 0, time used: 9.4285
Chronic: 34, Game Over: 0, Score: 4603.778, Time: 9.428, Mean Score: 4326.736, Mean Time: 11.150
Total Game Over: 0, time used: 10.9075
Chronic: 35, Game Over: 0, Score: 4333.379, Time: 10.908, Mean Score: 4326.921, Mean Time: 11.144
Total Game Over: 0, time used: 10.6602
Chronic: 36, Game Over: 0, Score: 4433.970, Time: 10.660, Mean Score: 4329.814, Mean Time: 11.130
Total Game Over: 0, time used: 15.3601
Chronic: 37, Game Over: 0, Score: 4173.780, Time: 15.360, Mean Score: 4325.708, Mean Time: 11.242
Total Game Over: 0, time used: 10.6805
Chronic: 38, Game Over: 0, Score: 4334.799, Time: 10.680, Mean Score: 4325.941, Mean Time: 11.227
Total Game Over: 0, time used: 11.2002
Chronic: 39, Game Over: 0, Score: 4284.059, Time: 11.200, Mean Score: 4324.894, Mean Time: 11.227
Total Game Over: 0, time used: 9.3372
Chronic: 40, Game Over: 0, Score: 4626.674, Time: 9.337, Mean Score: 4332.254, Mean Time: 11.181
Total Game Over: 0, time used: 8.8373
Chronic: 41, Game Over: 0, Score: 4795.752, Time: 8.837, Mean Score: 4343.290, Mean Time: 11.125
Total Game Over: 0, time used: 8.8619
Chronic: 42, Game Over: 0, Score: 4740.026, Time: 8.862, Mean Score: 4352.517, Mean Time: 11.072
Total Game Over: 0, time used: 9.8913
Chronic: 43, Game Over: 0, Score: 4494.033, Time: 9.891, Mean Score: 4355.733, Mean Time: 11.045
Total Game Over: 0, time used: 14.0898
Chronic: 44, Game Over: 0, Score: 4357.381, Time: 14.090, Mean Score: 4355.769, Mean Time: 11.113
Total Game Over: 0, time used: 17.8510
Chronic: 45, Game Over: 0, Score: 4192.200, Time: 17.851, Mean Score: 4352.214, Mean Time: 11.260
Total Game Over: 0, time used: 11.6437
Chronic: 46, Game Over: 0, Score: 4276.758, Time: 11.644, Mean Score: 4350.608, Mean Time: 11.268
Total Game Over: 0, time used: 16.0456
Chronic: 47, Game Over: 0, Score: 4236.782, Time: 16.046, Mean Score: 4348.237, Mean Time: 11.367
Total Game Over: 0, time used: 8.7390
Chronic: 48, Game Over: 0, Score: 4672.577, Time: 8.739, Mean Score: 4354.856, Mean Time: 11.314
Total Game Over: 0, time used: 10.6405
Chronic: 49, Game Over: 0, Score: 4451.560, Time: 10.641, Mean Score: 4356.790, Mean Time: 11.300
Finished 50 Chronics 288 Steps, Mean Score: 4356.790, Mean Time: 11.300, Game Over: [32]

We hope you can help us with this. Thanks!
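For anyone comparing runs, the Game Over rate and mean score can be tallied from a log like the one above with a short script. This is a sketch that assumes the per-chronic line format shown in the output; the function name `summarize` is hypothetical and not part of the repository.

```python
import re

# Matches the per-chronic summary lines, e.g.
# "Chronic: 32, Game Over: 1, Score: 4237.528, ..."
LINE = re.compile(r"Chronic: (\d+), Game Over: (\d+), Score: ([\d.]+)")


def summarize(log_text):
    """Return (failed chronic ids, number of chronics, mean score)."""
    failed, scores = [], []
    for chronic, game_over, score in LINE.findall(log_text):
        scores.append(float(score))
        if int(game_over) > 0:
            failed.append(int(chronic))
    mean = sum(scores) / len(scores) if scores else float("nan")
    return failed, len(scores), mean


sample = """\
Chronic: 31, Game Over: 0, Score: 4077.872, Time: 18.729, Mean Score: 4457.312, Mean Time: 10.995
Chronic: 32, Game Over: 1, Score: 4237.528, Time: 12.321, Mean Score: 4322.242, Mean Time: 11.035
Chronic: 33, Game Over: 0, Score: 4197.998, Time: 16.665, Mean Score: 4318.588, Mean Time: 11.201
"""
failed, n, mean_score = summarize(sample)
print(failed, n, f"{100 * len(failed) / n:.1f}% game over")
```

Running the same function over the full output of two setups makes it easier to see whether the failures fall on the same chronics.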

Hi, I will soon write my BSc thesis, in which I plan to reproduce and discuss the results of this paper. @tanzeyy, were you ever able to reproduce the reported results? If so, what should I be aware of, beyond what @djmax008 mentioned in the reply, when trying to reproduce them?