
`optimizer.step()` before `lr_scheduler.step()` Warning Occurred

Thank you for your book. It has been a great help for me in getting started with RL. ^^

Describe the bug
When running example code 4.7 (vanilla_dqn, without any changes), the warning message shown under "Error logs" below appears.

To Reproduce

  1. OS and environment: Ubuntu 20.04
  2. SLM Lab git SHA (run git rev-parse HEAD to get it): 5fa5ee3 (from the file "SLM-lab/data/vanilla_dqn_boltzmann_cartpole_2022_07_15_092012/vanilla_dqn_boltzmann_cartpole_t0_spec.json")
  3. spec file used: SLM-lab/slm_lab/spec/benchmark/dqn/dqn_cartpole.json

Additional context
After the warning appeared, training ran much more slowly than with other methods (over an hour, compared with 15 minutes for SARSA). The result also looks strange: mean_returns_ma gradually decreases to about 50 after 30k frames.
I wonder whether the result of this trial is related to the warning.

Error logs

[2022-07-15 09:20:14,002 PID:245693 INFO info] Running RL loop for trial 0 session 3
[2022-07-15 09:20:14,006 PID:245693 INFO log_summary] Trial 0 session 3 vanilla_dqn_boltzmann_cartpole_t0_s3 [train_df] epi: 0  t: 0  wall_t: 0  opt_step: 0  frame: 0  fps: 0  total_reward: nan  total_reward_ma: nan  loss: nan  lr: 0.01  explore_var: 5  entropy_coef: nan  entropy: nan  grad_norm: nan
/home/eric/miniconda3/envs/lab/lib/python3.7/site-packages/torch/optim/ UserWarning:

Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at
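
For reference, the call order the warning asks for can be sketched as below. This is only a minimal sketch in plain PyTorch, not SLM Lab's actual training code: the toy `nn.Linear` model, the `StepLR` scheduler, and the `train_steps` helper are all assumptions made for illustration.

```python
import torch
from torch import nn


def train_steps(num_steps=3):
    """Run a few toy updates, stepping the optimizer before the scheduler.

    Hypothetical helper for illustration; returns the learning rate
    that was in effect for each update.
    """
    model = nn.Linear(4, 2)  # toy stand-in for a Q-network (assumption)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    # StepLR is chosen only for illustration: it halves the lr on every scheduler step
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

    lrs = []
    for _ in range(num_steps):
        x = torch.randn(8, 4)        # random inputs, not real environment data
        target = torch.randn(8, 2)   # random targets, not real Q-targets
        loss = nn.functional.mse_loss(model(x), target)

        optimizer.zero_grad()
        loss.backward()
        lrs.append(optimizer.param_groups[0]["lr"])  # lr used for this update
        optimizer.step()   # 1) update the weights first
        scheduler.step()   # 2) then advance the learning rate schedule
    return lrs


if __name__ == "__main__":
    print(train_steps())
```

With this order the first update uses the initial learning rate (0.01 here, matching the `lr` in the log above), and the schedule only advances afterwards. If the two calls are swapped, PyTorch emits the warning quoted above.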

