Intermittent E2E test failure: Elasticsearch pilot did not update the document count
wallrj opened this issue · 1 comment
wallrj commented
In #277 I keep getting Elasticsearch E2E test failures:
- https://jetstack-build-infra.appspot.com/build/jetstack-logs/pr-logs/pull/jetstack_navigator/277/navigator-e2e-v1-7/1103/
- https://jetstack-build-infra.appspot.com/build/jetstack-logs/pr-logs/pull/jetstack_navigator/277/navigator-e2e-v1-8/1117/
- https://jetstack-build-infra.appspot.com/build/jetstack-logs/pr-logs/pull/jetstack_navigator/277/navigator-e2e-v1-9/658/
ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2018-04-05T09:53:11,796][INFO ][o.e.n.Node ] [es-test-mixed-0] stopping ...
Maybe the daemonset introduced in #287 hasn't yet run?
/kind bug
wallrj commented
Ah, looks like things are getting stuck during prepare-e2e.sh
I0405 09:35:52.306] Waiting for tiller to be ready...
W0405 09:35:52.407] + echo 'Waiting for tiller to be ready...'
W0405 09:35:52.407] + retry TIMEOUT=60 helm version
W0405 09:35:52.407] + local TIMEOUT=60
W0405 09:35:52.407] + local SLEEP=10
W0405 09:35:52.407] + :
W0405 09:35:52.407] + case "${1}" in
W0405 09:35:52.408] + local TIMEOUT=60
W0405 09:35:52.408] + shift
W0405 09:35:52.408] + :
W0405 09:35:52.408] + case "${1}" in
W0405 09:35:52.408] + break
W0405 09:35:52.408] + local start_time
W0405 09:35:52.408] ++ date +%s
W0405 09:35:52.408] + start_time=1522920952
W0405 09:35:52.408] + local end_time
W0405 09:35:52.408] + end_time=1522921012
W0405 09:35:52.408] + helm version
I0405 09:35:52.509] Client: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
W0405 09:40:52.483] Error: cannot connect to Tiller
W0405 09:40:52.486] + local exit_code=1
W0405 09:40:52.486] ++ date +%s
W0405 09:40:52.487] + local current_time=1522921252
W0405 09:40:52.487] + local remaining_time=-240
W0405 09:40:52.487] + [[ -240 -le 0 ]]
W0405 09:40:52.487] + return 1
W0405 09:40:52.490] + exec
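helm version takes more than 5 minutes to return (09:35:52 to 09:40:52 in the log above), but the timeout for this step is only 60s.
The retry helper only checks its deadline after the command returns, so a single hung call overruns it (remaining_time=-240).
We can increase that timeout and/or add a time limit to the helm commands.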
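As a rough sketch of the "time limit on the helm commands" idea: the function below is hypothetical (retry_with_limit and RETRY_SLEEP are made-up names, not the retry helper in prepare-e2e.sh). It assumes coreutils timeout is available on the CI image and uses it to cap each attempt, so one hung call cannot use up more than its share of the overall budget.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT the project's actual retry function.
# Retries a command until it succeeds or the overall budget runs out,
# and caps every single attempt with coreutils `timeout`.
retry_with_limit() {
    local total="${1}"; shift      # overall budget in seconds
    local per_try="${1}"; shift    # cap for one attempt in seconds
    local sleep_for="${RETRY_SLEEP:-10}"
    local deadline=$(( $(date +%s) + total ))
    while true; do
        # `timeout` kills the command if it runs longer than per_try seconds.
        if timeout "${per_try}" "$@"; then
            return 0
        fi
        # Give up if sleeping again would take us past the deadline.
        if (( $(date +%s) + sleep_for >= deadline )); then
            return 1
        fi
        sleep "${sleep_for}"
    done
}
```

With this sketch, the tiller wait could be written as: retry_with_limit 60 10 helm version. The values 60 and 10 are only illustrative.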
We also need to fix make e2e-test so that it exits early if the prepare-e2e.sh script fails.
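One possible shape for the early exit, as a sketch only: run_e2e is a hypothetical wrapper, and how make e2e-test actually invokes the prepare and test steps is not shown in this issue. The idea is that the test step runs only when the prepare step returned zero, and the prepare step's exit code is passed on.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not taken from the repository.
# $1: command (script path or function name) that prepares the cluster
# $2: command (script path or function name) that runs the tests
run_e2e() {
    local prepare_cmd="${1}"
    local test_cmd="${2}"
    local status=0
    "${prepare_cmd}" || status=$?
    if (( status != 0 )); then
        echo "prepare step failed with exit code ${status}; skipping tests" >&2
        return "${status}"
    fi
    "${test_cmd}"
}
```

The same effect could probably be had inside the Makefile recipe by chaining the two steps with &&, but that depends on how the target is currently written.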