scalar-labs/scalar-jepsen

Cassandra node start failed

Jiao-05 opened this issue · 5 comments

I want to test LWT in Cassandra with Jepsen.

I am running this Jepsen test on a virtual machine. My configuration is Ubuntu 20.04, Docker version 20.10.12, and docker-compose version 1.25.0.

At first, I ran the test and found that installing the OpenJDK JRE on n1-n5 kept retrying until it failed.

So I modified the code a little, like this:

[image: 05]

I replaced the content in the red box with the code in the figure below:

[image: 04]

Could my changes cause errors?

However, an error is reported later:

[image: 03]

It looks like the DB nodes fail to start.

How can I handle this error so that I can run tests?

@Jiao-05 Sorry for the late reply.
You may find the error in the Cassandra logs. The Cassandra logs are stored on a DB node such as n1, not on the control node.

Have you solved this problem? I replaced it with open-11-jre and the same error was reported.

@Tsunaou Thank you for your report. Could you share the error logs of the Cassandra node? system.log and debug.log should be in /root/cassandra/logs, and the error from the bootstrap failure should appear there.
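
For anyone collecting those logs, here is a minimal Python sketch that pulls the ERROR and WARN entries out of system.log together with the first few lines that follow each one (usually the start of the stack trace). It assumes Cassandra's usual log layout, where each entry starts with the level followed by the thread name in brackets. The function names `find_problem_lines` and `scan_file` are made up for this example and are not part of scalar-jepsen.

```python
import re

# Assumption: each log entry starts with "LEVEL [thread] ...", as in Cassandra's usual layout.
LEVEL_RE = re.compile(r"^(ERROR|WARN)\s+\[")


def find_problem_lines(log_text, context=2):
    """Return each ERROR/WARN line plus up to `context` following lines."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if LEVEL_RE.match(line):
            hits.append("\n".join(lines[i:i + 1 + context]))
    return hits


def scan_file(path="/root/cassandra/logs/system.log"):
    """Read a log file (default path from the comment above) and return the problem entries."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return find_problem_lines(f.read())
```

Run `scan_file()` on a DB node such as n1 (not on the control node) and paste the returned entries into the issue.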

@yito88 Oh, I have partially solved this problem; see this commit log.
However, the wait-ready method sometimes times out, and I don't know why: node ni waits for node nj for a long time. Even when the wait finishes successfully, it still takes about 3 minutes to start a 5-node cluster, which seems strange.
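
To make the timeout behaviour concrete, here is a generic Python sketch of a poll-until-ready loop with a deadline. This is not the actual wait-ready code in scalar-jepsen; `wait_until`, `is_ready`, and the default timeout and interval values are all assumptions chosen for illustration.

```python
import time


def wait_until(is_ready, timeout_s=300.0, interval_s=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True; raise TimeoutError after timeout_s.

    Returns the number of seconds spent waiting.
    `is_ready` is a hypothetical callable, e.g. a check that a node reports itself as up.
    """
    start = clock()
    while True:
        if is_ready():
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError(f"not ready after {timeout_s} s")
        sleep(interval_s)
```

With a loop like this, a timeout only means the readiness check never passed within the deadline, so the reason a node stays not ready has to come from that node's own logs.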

Thanks!
Taking 3 minutes sounds normal. We have to start the Cassandra nodes one by one, and each node takes 30 seconds or longer to start.
When a Cassandra node has lots of commitlogs that still have to be persisted, the bootstrap time will be longer.
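
As a rough check of those numbers, the following Python sketch computes the total start time when nodes are started strictly one after another. The helper name `sequential_startup_seconds` and the 36-second figure are illustrative assumptions; only the 5-node cluster and the 30-second lower bound come from the thread.

```python
def sequential_startup_seconds(nodes, per_node_s):
    """Total start time when nodes are started one after another."""
    return nodes * per_node_s


# 5 nodes at the 30-second lower bound: 150 s, i.e. 2.5 minutes.
print(sequential_startup_seconds(5, 30))
# 5 nodes at an assumed 36 s each: 180 s, i.e. the 3 minutes reported above.
print(sequential_startup_seconds(5, 36))
```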