Issues with running the testbed transfer plan and RFC experiments
Opened this issue · 0 comments
Hey there.
I'm having issues with running the testground test plans. I followed the instructions in the testbed README to install everything but have issues with running the transfer
testplan and the RFC experiment test plans.
When running the single_run.sh
script, the nodes are hanging at "waiting for network to initialize" and the test is timing out with "context deadline exceeded". (I modified the script to use RUNNER="local:docker" and BUILDER="docker:go" but exec also does not work for me.)
Step 28/28 : ENTRYPOINT [ "/testplan"]
---> Running in a644b3cfb5ba
---> 900b91fa9f0c
---> 900b91fa9f0c
Successfully built 900b91fa9f0c
Successfully tagged 26327f500441:latest
Jul 9 17:59:44.075299 INFO build completed {"default_tag": "26327f500441:latest", "took": "38s"}
Jul 9 17:59:44.128650 INFO got docker image id {"image_id": "900b91fa9f0c"}
Jul 9 17:59:44.181896 INFO tagging image {"image_id": "900b91fa9f0c", "tag": "tg-plan-testbed:900b91fa9f0c"}
Jul 9 17:59:44.187222 INFO build succeeded {"plan": "testbed", "groups": ["single"], "builder": "docker:go", "artifact": "900b91fa9f0c"}
Jul 9 17:59:44.187312 INFO performing healthcheck on runner
Jul 9 17:59:44.194357 INFO container isn't running; starting {"container_name": "testground-grafana"}
Jul 9 17:59:44.582836 INFO container isn't running; starting {"container_name": "testground-influxdb"}
Jul 9 17:59:44.933611 WARN some healthchecks failed, but continuing
Jul 9 17:59:44.933668 INFO starting run {"run_id": "cb4s3q3ft8ahdoftmlb0", "plan": "testbed", "case": "transfer", "runner": "local:docker", "instances": 5}
Jul 9 17:59:44.968129 INFO creating container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "name": "tg-testbed-transfer-cb4s3q3ft8ahdoftmlb0-single-0"}
Jul 9 17:59:45.085128 INFO creating container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "name": "tg-testbed-transfer-cb4s3q3ft8ahdoftmlb0-single-1"}
Jul 9 17:59:45.129046 INFO creating container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "name": "tg-testbed-transfer-cb4s3q3ft8ahdoftmlb0-single-2"}
Jul 9 17:59:45.169380 INFO creating container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "name": "tg-testbed-transfer-cb4s3q3ft8ahdoftmlb0-single-3"}
Jul 9 17:59:45.198084 INFO creating container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "name": "tg-testbed-transfer-cb4s3q3ft8ahdoftmlb0-single-4"}
Jul 9 17:59:45.229807 INFO starting containers {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "count": 5}
Jul 9 17:59:45.229876 INFO starting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "8fb6df27ae5e1ecae505ed6634837ed3ad1be118c3a83d7bbe87ca5e3c0e253e", "group": "single", "group_index": 0}
Jul 9 17:59:45.229873 INFO starting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "965d0784d7ae662642d72f95fc560746531ebde38f74821fa65ad547b239887f", "group": "single", "group_index": 4}
Jul 9 17:59:45.229873 INFO starting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "4f0e47e5a6af633e5c265866dde91b3828e0922509b8a5d105f85bdc1f58bd02", "group": "single", "group_index": 3}
Jul 9 17:59:45.229878 INFO starting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "b86bc557c07dd42f91e21c45a179d894db0a2b57f20bfaa3881012337328c22c", "group": "single", "group_index": 1}
Jul 9 17:59:45.229923 INFO starting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "93a8aeb43e8ef9293dac1465e8b261bfbfe591263d4c91cd3516c845957cfa89", "group": "single", "group_index": 2}
Jul 9 17:59:45.954264 INFO 0.7244s OTHER << sidecar >> Jul 9 17:59:45.953718 INFO resolved route to host {"host": "testground-redis", "ip": "192.18.0.3"}
Jul 9 17:59:45.954327 INFO 0.7245s OTHER << sidecar >> Jul 9 17:59:45.953920 INFO resolved route to host {"host": "testground-sync-service", "ip": "192.18.0.4"}
Jul 9 17:59:45.954344 INFO 0.7245s OTHER << sidecar >> Jul 9 17:59:45.954105 INFO resolved route to host {"host": "testground-influxdb", "ip": "192.18.0.6"}
Jul 9 17:59:45.954711 INFO 0.7245s MESSAGE << sidecar >> InfluxDB unavailable; no metrics will be dispatched: no InfluxDB URL in $INFLUXDB_URL env var
Jul 9 17:59:45.958716 INFO 0.7288s OTHER << sidecar >> Jul 9 17:59:45.958406 INFO successfully resolved route to host {"container_id": "8fb6df27ae5e1ecae505ed6634837ed3ad1be118c3a83d7bbe87ca5e3c0e253e"}
Jul 9 17:59:45.968329 INFO 0.7371s MESSAGE << single[000] (8fb6df) >> registering default http handler at: http://[::]:6060/ (pprof: http://[::]:6060/debug/pprof/)
Jul 9 17:59:45.968576 INFO 0.7371s START << single[000] (8fb6df) >> {"plan":"testbed","case":"transfer","run":"cb4s3q3ft8ahdoftmlb0","outputs_path":"/outputs","instances":5,"params":{"bandwidth_mb":"150","bstore_delay_ms":"5","data_dir":"false","dialer":"default","disk_store":"false","enable_dht":"false","enable_providing":"false","enable_tcp":"100","exchange_interface":"bitswap","file_size":"15728640","input_data":"files","jitter_pct":"10","latency_ms":"10","leech_count":"3","long_lasting":"false","max_connection_rate":"0","node_type":"ipfs","number_waves":"1","parallel_gen_mb":"100","passive_count":"","request_stagger":"0","run_count":"2","run_timeout_secs":"90000","seeder_rate":"100","timeout_secs":"400000"},"group":"single","group_instances":5,"network":"16.12.0.0/16","start_time":"0001-01-01T00:00:00Z"}
Jul 9 17:59:45.973722 INFO 0.7438s OTHER << sidecar >> Jul 9 17:59:45.973444 INFO external routing disabled
Jul 9 17:59:45.973806 INFO 0.7439s OTHER << sidecar >> Jul 9 17:59:45.973498 INFO waiting for all networks to be ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:45.973959 INFO 0.7438s MESSAGE << single[000] (8fb6df) >> waiting for network to initialize
Jul 9 17:59:46.296012 INFO 1.0659s MESSAGE << sidecar >> InfluxDB unavailable; no metrics will be dispatched: no InfluxDB URL in $INFLUXDB_URL env var
Jul 9 17:59:46.299944 INFO 1.0698s MESSAGE << single[001] (b86bc5) >> registering default http handler at: http://[::]:6060/ (pprof: http://[::]:6060/debug/pprof/)
Jul 9 17:59:46.300083 INFO 1.0698s START << single[001] (b86bc5) >> {"plan":"testbed","case":"transfer","run":"cb4s3q3ft8ahdoftmlb0","outputs_path":"/outputs","instances":5,"params":{"bandwidth_mb":"150","bstore_delay_ms":"5","data_dir":"false","dialer":"default","disk_store":"false","enable_dht":"false","enable_providing":"false","enable_tcp":"100","exchange_interface":"bitswap","file_size":"15728640","input_data":"files","jitter_pct":"10","latency_ms":"10","leech_count":"3","long_lasting":"false","max_connection_rate":"0","node_type":"ipfs","number_waves":"1","parallel_gen_mb":"100","passive_count":"","request_stagger":"0","run_count":"2","run_timeout_secs":"90000","seeder_rate":"100","timeout_secs":"400000"},"group":"single","group_instances":5,"network":"16.12.0.0/16","start_time":"0001-01-01T00:00:00Z"}
Jul 9 17:59:46.301369 INFO 1.0710s MESSAGE << single[001] (b86bc5) >> waiting for network to initialize
Jul 9 17:59:46.446754 INFO 1.2168s OTHER << sidecar >> Jul 9 17:59:46.445748 INFO successfully resolved route to host {"container_id": "b86bc557c07dd42f91e21c45a179d894db0a2b57f20bfaa3881012337328c22c"}
Jul 9 17:59:46.491011 INFO 1.2610s MESSAGE << sidecar >> InfluxDB unavailable; no metrics will be dispatched: no InfluxDB URL in $INFLUXDB_URL env var
Jul 9 17:59:46.496605 INFO 1.2664s MESSAGE << single[002] (93a8ae) >> registering default http handler at: http://[::]:6060/ (pprof: http://[::]:6060/debug/pprof/)
Jul 9 17:59:46.496757 INFO 1.2664s START << single[002] (93a8ae) >> {"plan":"testbed","case":"transfer","run":"cb4s3q3ft8ahdoftmlb0","outputs_path":"/outputs","instances":5,"params":{"bandwidth_mb":"150","bstore_delay_ms":"5","data_dir":"false","dialer":"default","disk_store":"false","enable_dht":"false","enable_providing":"false","enable_tcp":"100","exchange_interface":"bitswap","file_size":"15728640","input_data":"files","jitter_pct":"10","latency_ms":"10","leech_count":"3","long_lasting":"false","max_connection_rate":"0","node_type":"ipfs","number_waves":"1","parallel_gen_mb":"100","passive_count":"","request_stagger":"0","run_count":"2","run_timeout_secs":"90000","seeder_rate":"100","timeout_secs":"400000"},"group":"single","group_instances":5,"network":"16.12.0.0/16","start_time":"0001-01-01T00:00:00Z"}
Jul 9 17:59:46.497784 INFO 1.2677s MESSAGE << single[002] (93a8ae) >> waiting for network to initialize
Jul 9 17:59:46.508462 INFO 1.2784s MESSAGE << sidecar >> InfluxDB unavailable; no metrics will be dispatched: no InfluxDB URL in $INFLUXDB_URL env var
Jul 9 17:59:46.512022 INFO 1.2819s MESSAGE << single[003] (4f0e47) >> registering default http handler at: http://[::]:6060/ (pprof: http://[::]:6060/debug/pprof/)
Jul 9 17:59:46.512133 INFO 1.2819s START << single[003] (4f0e47) >> {"plan":"testbed","case":"transfer","run":"cb4s3q3ft8ahdoftmlb0","outputs_path":"/outputs","instances":5,"params":{"bandwidth_mb":"150","bstore_delay_ms":"5","data_dir":"false","dialer":"default","disk_store":"false","enable_dht":"false","enable_providing":"false","enable_tcp":"100","exchange_interface":"bitswap","file_size":"15728640","input_data":"files","jitter_pct":"10","latency_ms":"10","leech_count":"3","long_lasting":"false","max_connection_rate":"0","node_type":"ipfs","number_waves":"1","parallel_gen_mb":"100","passive_count":"","request_stagger":"0","run_count":"2","run_timeout_secs":"90000","seeder_rate":"100","timeout_secs":"400000"},"group":"single","group_instances":5,"network":"16.12.0.0/16","start_time":"0001-01-01T00:00:00Z"}
Jul 9 17:59:46.512874 INFO 1.2829s MESSAGE << single[003] (4f0e47) >> waiting for network to initialize
Jul 9 17:59:46.515609 INFO 1.2857s OTHER << sidecar >> Jul 9 17:59:46.515389 INFO successfully resolved route to host {"container_id": "93a8aeb43e8ef9293dac1465e8b261bfbfe591263d4c91cd3516c845957cfa89"}
Jul 9 17:59:46.515665 INFO 1.2858s OTHER << sidecar >> Jul 9 17:59:46.515429 INFO external routing disabled
Jul 9 17:59:46.521781 INFO 1.2919s OTHER << sidecar >> Jul 9 17:59:46.515445 INFO successfully resolved route to host {"container_id": "4f0e47e5a6af633e5c265866dde91b3828e0922509b8a5d105f85bdc1f58bd02"}
Jul 9 17:59:46.521840 INFO 1.2920s OTHER << sidecar >> Jul 9 17:59:46.515653 INFO waiting for all networks to be ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.521861 INFO 1.2920s OTHER << sidecar >> Jul 9 17:59:46.515979 INFO external routing disabled
Jul 9 17:59:46.521872 INFO 1.2920s OTHER << sidecar >> Jul 9 17:59:46.516001 INFO external routing disabled
Jul 9 17:59:46.521881 INFO 1.2920s OTHER << sidecar >> Jul 9 17:59:46.516056 INFO waiting for all networks to be ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.521890 INFO 1.2920s OTHER << sidecar >> Jul 9 17:59:46.516057 INFO waiting for all networks to be ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.542666 INFO started containers {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "count": 5}
Jul 9 17:59:46.544171 INFO 1.3140s MESSAGE << sidecar >> InfluxDB unavailable; no metrics will be dispatched: no InfluxDB URL in $INFLUXDB_URL env var
Jul 9 17:59:46.544949 INFO 1.3151s OTHER << sidecar >> Jul 9 17:59:46.544804 INFO successfully resolved route to host {"container_id": "965d0784d7ae662642d72f95fc560746531ebde38f74821fa65ad547b239887f"}
Jul 9 17:59:46.545183 INFO 1.3153s OTHER << sidecar >> Jul 9 17:59:46.544991 INFO external routing disabled
Jul 9 17:59:46.545233 INFO 1.3154s OTHER << sidecar >> Jul 9 17:59:46.545031 INFO waiting for all networks to be ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.545571 INFO 1.3157s OTHER << sidecar >> Jul 9 17:59:46.545424 INFO all networks ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.545762 INFO 1.3159s OTHER << sidecar >> Jul 9 17:59:46.545439 INFO all networks ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.545791 INFO 1.3159s OTHER << sidecar >> Jul 9 17:59:46.545455 INFO all networks ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.545801 INFO 1.3159s OTHER << sidecar >> Jul 9 17:59:46.545455 INFO all networks ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.545809 INFO 1.3160s OTHER << sidecar >> Jul 9 17:59:46.545639 INFO all networks ready {"sidecar": true, "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 17:59:46.549816 INFO 1.3196s MESSAGE << single[004] (965d07) >> registering default http handler at: http://[::]:6060/ (pprof: http://[::]:6060/debug/pprof/)
Jul 9 17:59:46.549959 INFO 1.3196s START << single[004] (965d07) >> {"plan":"testbed","case":"transfer","run":"cb4s3q3ft8ahdoftmlb0","outputs_path":"/outputs","instances":5,"params":{"bandwidth_mb":"150","bstore_delay_ms":"5","data_dir":"false","dialer":"default","disk_store":"false","enable_dht":"false","enable_providing":"false","enable_tcp":"100","exchange_interface":"bitswap","file_size":"15728640","input_data":"files","jitter_pct":"10","latency_ms":"10","leech_count":"3","long_lasting":"false","max_connection_rate":"0","node_type":"ipfs","number_waves":"1","parallel_gen_mb":"100","passive_count":"","request_stagger":"0","run_count":"2","run_timeout_secs":"90000","seeder_rate":"100","timeout_secs":"400000"},"group":"single","group_instances":5,"network":"16.12.0.0/16","start_time":"0001-01-01T00:00:00Z"}
Jul 9 17:59:46.551114 INFO 1.3210s MESSAGE << single[004] (965d07) >> waiting for network to initialize
Jul 9 18:09:05.237973 ERROR context.Background.WithDeadline(2022-07-09 20:09:05.236917778 +0200 CEST m=+615.193319498 [-1.052737ms]) {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0"}
Jul 9 18:09:05.237992 INFO 560.0081s INCOMPLETE << single[001] (b86bc5) >>
Jul 9 18:09:05.238023 INFO 560.0082s ERROR << single[002] (93a8ae) >> context deadline exceeded
Jul 9 18:09:05.238005 INFO 560.0081s ERROR << single[001] (b86bc5) >> context deadline exceeded
Jul 9 18:09:05.238031 INFO 560.0082s INCOMPLETE << single[002] (93a8ae) >>
Jul 9 18:09:05.237974 INFO deleting containers {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "ids": ["8fb6df27ae5e1ecae505ed6634837ed3ad1be118c3a83d7bbe87ca5e3c0e253e", "b86bc557c07dd42f91e21c45a179d894db0a2b57f20bfaa3881012337328c22c", "93a8aeb43e8ef9293dac1465e8b261bfbfe591263d4c91cd3516c845957cfa89", "4f0e47e5a6af633e5c265866dde91b3828e0922509b8a5d105f85bdc1f58bd02", "965d0784d7ae662642d72f95fc560746531ebde38f74821fa65ad547b239887f"]}
Jul 9 18:09:05.238038 INFO 560.0082s ERROR << single[000] (8fb6df) >> context deadline exceeded
Jul 9 18:09:05.238048 INFO 560.0082s INCOMPLETE << sidecar >>
Jul 9 18:09:05.238055 INFO 560.0082s ERROR << single[004] (965d07) >> context deadline exceeded
Jul 9 18:09:05.238054 INFO 560.0082s INCOMPLETE << single[000] (8fb6df) >>
Jul 9 18:09:05.238062 INFO 560.0082s INCOMPLETE << single[004] (965d07) >>
Jul 9 18:09:05.238067 INFO 560.0082s INCOMPLETE << single[003] (4f0e47) >>
Jul 9 18:09:05.238082 INFO deleting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "965d0784d7ae662642d72f95fc560746531ebde38f74821fa65ad547b239887f"}
Jul 9 18:09:05.238082 INFO deleting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "8fb6df27ae5e1ecae505ed6634837ed3ad1be118c3a83d7bbe87ca5e3c0e253e"}
Jul 9 18:09:05.238082 INFO deleting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "b86bc557c07dd42f91e21c45a179d894db0a2b57f20bfaa3881012337328c22c"}
Jul 9 18:09:05.238089 INFO deleting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "4f0e47e5a6af633e5c265866dde91b3828e0922509b8a5d105f85bdc1f58bd02"}
Jul 9 18:09:05.238090 INFO deleting container {"runner": "local:docker", "run_id": "cb4s3q3ft8ahdoftmlb0", "id": "93a8aeb43e8ef9293dac1465e8b261bfbfe591263d4c91cd3516c845957cfa89"}
Jul 9 18:09:05.238055 INFO 560.0082s ERROR << single[003] (4f0e47) >> context deadline exceeded
Jul 9 18:09:06.951796 WARN run finished in error {"run_id": "cb4s3q3ft8ahdoftmlb0", "plan": "testbed", "case": "transfer", "runner": "local:docker", "instances": 5, "error": "context deadline exceeded"}
Jul 9 18:09:06.951987 ERROR doRun returned err {"err": "context deadline exceeded"}
Jul 9 18:09:06.959021 INFO worker completed task {"worker_id": 0, "task_id": "cb4s3q3ft8ahdoftmlb0"}
When running an RFC experiment with ./run_experiment.sh rfcBBL102
I get the following error:
Step 19/26 : RUN cp /tmp/go.mod /tmp/go.sum ${PLAN_DIR}/
---> Running in 101d095bf46b
---> 57f8c2ec9daf
Step 20/26 : RUN cd ${PLAN_DIR} && go env -w GOPROXY="${GO_PROXY}" && CGO_ENABLED=${CgoEnabled} GOOS=linux GOARCH=amd64 go build -o ${PLAN_DIR}/testplan.bin ${BUILD_TAGS} ${TESTPLAN_EXEC_PKG}
---> Running in 4a7bffce0cbd
utils/bitswap.go:9:2: missing go.sum entry for module providing package github.com/ipfs/go-bitswap (imported by github.com/protocol/beyond-bitswap/testbed/testbed/utils); to add:
go get github.com/protocol/beyond-bitswap/testbed/testbed/utils
utils/bitswap.go:10:2: missing go.sum entry for module providing package github.com/ipfs/go-bitswap/network (imported by github.com/protocol/beyond-bitswap/testbed/testbed/utils); to add:
go get github.com/protocol/beyond-bitswap/testbed/testbed/utils
Jul 6 17:20:04.942631 INFO build failed {"plan": "testbed", "groups": ["nodes"], "builder": "docker:go", "error": "docker build failed: The command '/bin/sh -c cd ${PLAN_DIR} && go env -w GOPROXY=\"${GO_PROXY}\" && CGO_ENABLED=${CgoEnabled} GOOS=linux GOARCH=amd64 go build -o ${PLAN_DIR}/testplan.bin ${BUILD_TAGS} ${TESTPLAN_EXEC_PKG}' returned a non-zero code: 1"}
Jul 6 17:20:04.942677 ERROR doRun returned err {"err": "docker build failed: The command '/bin/sh -c cd ${PLAN_DIR} && go env -w GOPROXY=\"${GO_PROXY}\" && CGO_ENABLED=${CgoEnabled} GOOS=linux GOARCH=amd64 go build -o ${PLAN_DIR}/testplan.bin ${BUILD_TAGS} ${TESTPLAN_EXEC_PKG}' returned a non-zero code: 1"}
I tried to fix the issues with go mod tidy
but it didn't help. I'm locally running go version 1.18.3 but since I'm using the dockers runners this should not matter I guess (?)
I would highly appreciate help and tips pointing me in the right direction.