What's the difference between sum_rate and reward?
I see that the reward is defined as the sum-rate capacity, but I am confused about why the sum rate doesn't equal the reward in the figures.
can you be more specific on that?
yes, there's an inconsistency between the two figures. however, note that the hyperparameters used for these figures are different; otherwise, they would've produced the same results. the authors didn't provide the hyperparameter settings for these particular learning curves, and unfortunately I don't remember which values I used to produce Fig. 6. I've taken a look at the paper, but still couldn't find any information.
please let me know if there's anything else, and if you find the hyperparameter values used for Fig. 6.
I believe this is expected since you increased the number of users as well. increasing the number of users would degrade the performance.
Thx for your explanation. I have changed the configuration so that the only difference is the number of RIS elements (N), like the following. By the way, I think the sum rate in Figure 4 may be opt_reward rather than current_reward. opt_reward is based on the SNR rather than the SINR, and in that case we would get a larger sum rate.
yes, this is what's expected. when you increase the number of RIS elements, you'd obtain more transmission power as well as better effective performance.
regarding the SNR/SINR, thank you for pointing this out, but I'm not the author of the paper, so I only tried to reproduce the figures as closely as I could. the authors didn't provide much detail, such as which objective they used as the reward, the hyperparameter settings, etc.
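For anyone reading later, here is a minimal sketch of why an SNR-based sum rate comes out larger than an SINR-based one, which is the gap discussed above. It is not the repo's actual `opt_reward` / `current_reward` code: the `sum_rate` helper, the `power` matrix layout, and all numbers are hypothetical and chosen only for illustration.

```python
import math

def sum_rate(power, noise, include_interference=True):
    """Sum over users of log2(1 + ratio).

    power[k][j] is the (assumed) received power at user k from the
    stream intended for user j, so power[k][k] is the useful signal.
    With include_interference=False the ratio is the SNR; otherwise
    it is the SINR.
    """
    total = 0.0
    for k, row in enumerate(power):
        signal = row[k]
        # Power from every other user's stream counts as interference.
        interference = 0.0
        if include_interference:
            interference = sum(p for j, p in enumerate(row) if j != k)
        total += math.log2(1.0 + signal / (interference + noise))
    return total

# Made-up two-user example, not taken from the paper or the repo.
power = [[4.0, 0.5],
         [0.3, 3.0]]
noise = 1.0

sinr_rate = sum_rate(power, noise, include_interference=True)
snr_rate = sum_rate(power, noise, include_interference=False)
print(f"SINR-based sum rate: {sinr_rate:.3f} bit/s/Hz")
print(f"SNR-based sum rate:  {snr_rate:.3f} bit/s/Hz")
```

Since the interference term only adds to the denominator, the SNR-based value is never smaller than the SINR-based one, so a curve plotted from an SNR-style quantity would sit above one plotted from the SINR-style reward even with identical settings.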