kengz/SLM-Lab

Crashing with seccomp-bpf failure in syscall 0230

norbertkeresztes opened this issue · 2 comments

Describe the bug
I'm going through the guides on the website. Running dqn_cartpole in both dev and train modes results in slow runs and high resource usage, and each run ends with several copies of this error message:

../../sandbox/linux/seccomp-bpf-helpers/sigsys_handlers.cc:**CRASHING**:seccomp-bpf failure in syscall 0230

To Reproduce

  1. OS and environment: Ubuntu 20.04.1
  2. SLM Lab git SHA (run git rev-parse HEAD to get it): faca82c
  3. spec file used: slm_lab/spec/demo.json

Additional context
Running on an AMD TR 3990X; all 128 CPU cores sit above 90% during the run (only checked for train, not dev). These are the training metrics logged during one of the runs:

[2021-07-24 10:34:11,503 PID:27435 INFO logger.py info] Running RL loop for trial 0 session 1
[2021-07-24 10:34:11,506 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 0  t: 0  wall_t: 0  opt_step: 0  frame: 0  fps: 0  total_reward: nan  total_reward_ma: nan  loss: nan  lr: 0.02  explore_var: 1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:36:42,739 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 28  t: 4  wall_t: 151  opt_step: 18720  frame: 500  fps: 3.31126  total_reward: 9  total_reward_ma: 9  loss: 0.0168248  lr: 0.02  explore_var: 0.55  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:39:24,086 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 54  t: 1  wall_t: 312  opt_step: 38720  frame: 1000  fps: 3.20513  total_reward: 10  total_reward_ma: 9.5  loss: 0.0740117  lr: 0.02  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:39:24,095 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 9.5  strength: -12.36  max_strength: -11.86  final_strength: -11.86  sample_efficiency: 0.00152023  training_efficiency: 4.01807e-05  stability: 1
[2021-07-24 10:42:06,057 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 74  t: 72  wall_t: 474  opt_step: 58720  frame: 1500  fps: 3.16456  total_reward: 21  total_reward_ma: 13.3333  loss: 0.299671  lr: 0.018  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:42:06,069 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 13.3333  strength: -8.52667  max_strength: -0.860001  final_strength: -0.860001  sample_efficiency: 0.00149153  training_efficiency: 3.94024e-05  stability: 1
[2021-07-24 10:44:47,405 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 84  t: 68  wall_t: 635  opt_step: 78720  frame: 2000  fps: 3.14961  total_reward: 69  total_reward_ma: 27.25  loss: 0.231153  lr: 0.018  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:44:47,413 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 27.25  strength: 5.39  max_strength: 47.14  final_strength: 47.14  sample_efficiency: -0.000676407  training_efficiency: -1.89741e-05  stability: 1
[2021-07-24 10:47:30,196 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 90  t: 35  wall_t: 798  opt_step: 98720  frame: 2500  fps: 3.13283  total_reward: 138  total_reward_ma: 49.4  loss: 0.102941  lr: 0.0162  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:47:30,216 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 49.4  strength: 27.54  max_strength: 116.14  final_strength: 116.14  sample_efficiency: 0.000231464  training_efficiency: 5.57282e-06  stability: 1
[2021-07-24 10:50:12,254 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 95  t: 69  wall_t: 960  opt_step: 118720  frame: 3000  fps: 3.125  total_reward: 175  total_reward_ma: 70.3333  loss: 0.563105  lr: 0.0162  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:50:12,266 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 70.3333  strength: 48.4733  max_strength: 153.14  final_strength: 153.14  sample_efficiency: 0.000285103  training_efficiency: 7.07366e-06  stability: 1
[2021-07-24 10:52:52,672 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 98  t: 21  wall_t: 1121  opt_step: 138720  frame: 3500  fps: 3.12221  total_reward: 178  total_reward_ma: 85.7143  loss: 0.468518  lr: 0.01458  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:52:52,680 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 85.7143  strength: 63.8543  max_strength: 156.14  final_strength: 156.14  sample_efficiency: 0.000285316  training_efficiency: 7.12085e-06  stability: 1
[2021-07-24 10:55:34,404 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 100  t: 121  wall_t: 1282  opt_step: 158720  frame: 4000  fps: 3.12012  total_reward: 200  total_reward_ma: 100  loss: 0.259016  lr: 0.01458  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:55:34,417 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 100  strength: 78.14  max_strength: 178.14  final_strength: 178.14  sample_efficiency: 0.000275252  training_efficiency: 6.88705e-06  stability: 1
[2021-07-24 10:58:17,041 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 103  t: 188  wall_t: 1445  opt_step: 178720  frame: 4500  fps: 3.11419  total_reward: 154  total_reward_ma: 106  loss: 0.235752  lr: 0.013122  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 10:58:17,050 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 106  strength: 84.14  max_strength: 178.14  final_strength: 132.14  sample_efficiency: 0.000265999  training_efficiency: 6.66165e-06  stability: 0.926414
[2021-07-24 11:00:59,460 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 106  t: 141  wall_t: 1607  opt_step: 198720  frame: 5000  fps: 3.11139  total_reward: 178  total_reward_ma: 113.2  loss: 0.162558  lr: 0.013122  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:00:59,467 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 113.2  strength: 91.34  max_strength: 178.14  final_strength: 156.14  sample_efficiency: 0.000254717  training_efficiency: 6.38311e-06  stability: 0.939255
[2021-07-24 11:03:42,026 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 109  t: 117  wall_t: 1770  opt_step: 218720  frame: 5500  fps: 3.10734  total_reward: 179  total_reward_ma: 119.182  loss: 0.0619462  lr: 0.0118098  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:03:42,037 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 119.182  strength: 97.3218  max_strength: 178.14  final_strength: 157.14  sample_efficiency: 0.000244016  training_efficiency: 6.11727e-06  stability: 0.949639
[2021-07-24 11:06:24,799 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 112  t: 103  wall_t: 1933  opt_step: 238720  frame: 6000  fps: 3.10398  total_reward: 155  total_reward_ma: 122.167  loss: 2.09935  lr: 0.0118098  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:06:24,808 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 122.167  strength: 100.307  max_strength: 178.14  final_strength: 133.14  sample_efficiency: 0.000235461  training_efficiency: 5.90398e-06  stability: 0.934612
[2021-07-24 11:09:07,338 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 115  t: 133  wall_t: 2095  opt_step: 258720  frame: 6500  fps: 3.10263  total_reward: 169  total_reward_ma: 125.769  loss: 1.67609  lr: 0.0106288  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:09:07,346 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 125.769  strength: 103.909  max_strength: 178.14  final_strength: 147.14  sample_efficiency: 0.000226571  training_efficiency: 5.6819e-06  stability: 0.941845
[2021-07-24 11:11:50,149 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 118  t: 146  wall_t: 2258  opt_step: 278720  frame: 7000  fps: 3.10009  total_reward: 153  total_reward_ma: 127.714  loss: 2.25914  lr: 0.0106288  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:11:50,158 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 127.714  strength: 105.854  max_strength: 178.14  final_strength: 131.14  sample_efficiency: 0.000219163  training_efficiency: 5.4966e-06  stability: 0.936335
[2021-07-24 11:14:32,865 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 121  t: 91  wall_t: 2421  opt_step: 298720  frame: 7500  fps: 3.09789  total_reward: 177  total_reward_ma: 131  loss: 1.16432  lr: 0.00956594  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:14:32,873 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 131  strength: 109.14  max_strength: 178.14  final_strength: 155.14  sample_efficiency: 0.000211029  training_efficiency: 5.29295e-06  stability: 0.941969
[2021-07-24 11:17:15,534 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 124  t: 69  wall_t: 2584  opt_step: 318720  frame: 8000  fps: 3.09598  total_reward: 200  total_reward_ma: 135.312  loss: 3.20369  lr: 0.00956594  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:17:15,546 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 135.312  strength: 113.452  max_strength: 178.14  final_strength: 178.14  sample_efficiency: 0.000202587  training_efficiency: 5.08143e-06  stability: 0.947468
[2021-07-24 11:19:58,220 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 127  t: 106  wall_t: 2746  opt_step: 338720  frame: 8500  fps: 3.09541  total_reward: 152  total_reward_ma: 136.294  loss: 1.04083  lr: 0.00860934  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:19:58,234 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 136.294  strength: 114.434  max_strength: 178.14  final_strength: 130.14  sample_efficiency: 0.000196904  training_efficiency: 4.939e-06  stability: 0.926181
[2021-07-24 11:22:41,174 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 130  t: 35  wall_t: 2909  opt_step: 358720  frame: 9000  fps: 3.09385  total_reward: 183  total_reward_ma: 138.889  loss: 2.89101  lr: 0.00860934  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:22:41,183 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 138.889  strength: 117.029  max_strength: 178.14  final_strength: 161.14  sample_efficiency: 0.000190341  training_efficiency: 4.77443e-06  stability: 0.931119
[2021-07-24 11:25:24,265 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 133  t: 1  wall_t: 3072  opt_step: 378720  frame: 9500  fps: 3.09245  total_reward: 180  total_reward_ma: 141.053  loss: 5.7599  lr: 0.00774841  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:25:24,274 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 141.053  strength: 119.193  max_strength: 178.14  final_strength: 158.14  sample_efficiency: 0.000184401  training_efficiency: 4.62542e-06  stability: 0.934964
[2021-07-24 11:27:56,455 PID:27435 INFO __init__.py log_summary] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df] epi: 135  t: 156  wall_t: 3224  opt_step: 398720  frame: 10000  fps: 3.10174  total_reward: 174  total_reward_ma: 142.7  loss: 0.364511  lr: 0.00774841  explore_var: 0.1  entropy_coef: nan  entropy: nan  grad_norm: nan
[2021-07-24 11:27:56,466 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [train_df metrics] final_return_ma: 142.7  strength: 120.84  max_strength: 178.14  final_strength: 152.14  sample_efficiency: 0.000179087  training_efficiency: 4.49212e-06  stability: 0.936856
[2021-07-24 11:27:59,648 PID:27435 INFO __init__.py log_metrics] Trial 0 session 1 dqn_cartpole_t0_s1 [eval_df metrics] final_return_ma: 142.7  strength: 120.84  max_strength: 178.14  final_strength: 152.14  sample_efficiency: 0.000179087  training_efficiency: 4.49212e-06  stability: 0.936856
[2021-07-24 11:27:59,649 PID:27435 INFO logger.py info] Session 1 done
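For anyone reading the log above, here is a minimal sketch that parses the final log_summary line and recomputes the frame rate. It assumes the logged fps is simply frame divided by wall_t, which matches the numbers in this log but is not confirmed against the SLM Lab source. The `parse_summary` helper is hypothetical and not part of SLM Lab.

```python
import re

# Last log_summary line from the run above (timestamp/PID prefix trimmed).
LINE = ("epi: 135  t: 156  wall_t: 3224  opt_step: 398720  frame: 10000  "
        "fps: 3.10174  total_reward: 174  total_reward_ma: 142.7")


def parse_summary(line):
    """Hypothetical helper: turn 'key: value' pairs into a dict of floats."""
    pairs = re.findall(r"(\w+): (\S+)", line)
    return {key: float(value) for key, value in pairs}


metrics = parse_summary(LINE)

# Assumption: fps = frame / wall_t (consistent with this log, not verified in code).
recomputed_fps = metrics["frame"] / metrics["wall_t"]
print(f"logged fps={metrics['fps']}, recomputed={recomputed_fps:.5f}")
```

At roughly 3 frames per second, the 10000 frames of this session take about 54 minutes of wall time, which is where the "nearly one hour" below comes from.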

That is nearly one hour of running on 128 cores (apparently all in use), and the session still fails to reach the pass score of 195. Could the slowness be explained by using all the CPUs and spending a lot of time on syncing?
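One way to test the oversubscription hypothesis is to cap the numeric libraries' thread pools before they are imported. The sketch below only sets the standard OpenMP and MKL environment variables. Whether this is the actual cause of the slowness here, and whether SLM Lab overrides these limits internally, are both unverified assumptions. The `cap_threads` helper is hypothetical.

```python
import os

# Standard environment variables read by OpenMP and Intel MKL at load time.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS")


def cap_threads(n=1):
    """Hypothetical helper: limit per-process math threads to n."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n)
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


applied = cap_threads(1)
print(applied)

# Import torch / numpy only after this point so the limits take effect.
# Untested on this setup: treat it as an experiment, not a fix.
```

If the fps rises noticeably with a small thread count, thread contention is a plausible culprit. PyTorch also offers `torch.set_num_threads` for the same purpose at runtime.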

Error logs

../../sandbox/linux/seccomp-bpf-helpers/sigsys_handlers.cc:**CRASHING**:seccomp-bpf failure in syscall 0230

kengz commented

Seems like a glibc problem. Could you try upgrading or downgrading to a different version? A similar issue is discussed in rstudio/rstudio#6379 (comment).
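To report which C library version is in play before changing anything, a small sketch using only the Python standard library is below. It inspects the running Python executable, so it assumes that interpreter is linked against the same glibc as the crashing process, which may not hold if the crash comes from a bundled binary.

```python
import platform


def glibc_version():
    """Return (library name, version) for the C library Python is linked against."""
    lib, version = platform.libc_ver()
    return lib, version


lib, version = glibc_version()
# Either field may be an empty string if detection fails on this platform.
print(f"C library: {lib or 'unknown'} {version}".strip())
```

Including this output in the issue makes it easier to compare against the versions mentioned in the linked discussion.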

kengz commented

Closing issue as stale.