All I see is `status code:500 error message:can't get the public key from storage`
craigpastro opened this issue · 7 comments
I tried running a DL test in Azure with `lein run test --test cas --ssh-private-key ~/.ssh/id_rsa`
and basically all that I see is:
INFO [2019-10-10 05:40:12,665] jepsen worker 2 - jepsen.util 2 :invoke :cas [1 0]
INFO [2019-10-10 05:40:12,666] jepsen worker 3 - jepsen.util 3 :fail :cas [4 0] status code:500 error message:can't get the public key from storage
INFO [2019-10-10 05:40:12,666] jepsen worker 3 - jepsen.util 3 :invoke :write 2
INFO [2019-10-10 05:40:12,669] jepsen worker 1 - jepsen.util 1 :fail :write 1 status code:500 error message:can't get the public key from storage
INFO [2019-10-10 05:40:12,669] jepsen worker 1 - jepsen.util 1 :invoke :cas [1 3]
INFO [2019-10-10 05:40:12,672] jepsen worker 4 - jepsen.util 4 :fail :write 4 status code:500 error message:can't get the public key from storage
INFO [2019-10-10 05:40:12,672] jepsen worker 4 - jepsen.util 4 :invoke :cas [3 0]
INFO [2019-10-10 05:40:12,673] jepsen worker 2 - jepsen.util 2 :fail :cas [1 0] status code:500 error message:can't get the public key from storage
INFO [2019-10-10 05:40:12,673] jepsen worker 2 - jepsen.util 2 :invoke :cas [3 2]
INFO [2019-10-10 05:40:12,674] jepsen worker 0 - jepsen.util 0 :fail :write 2 status code:500 error message:can't get the public key from storage
INFO [2019-10-10 05:40:12,674] jepsen worker 0 - jepsen.util 0 :invoke :cas [4 2]
INFO [2019-10-10 05:40:12,677] jepsen worker 3 - jepsen.util 3 :fail :write 2 status code:500 error message:can't get the public key from storage
INFO [2019-10-10 05:40:12,677] jepsen worker 3 - jepsen.util 3 :invoke :write 0
INFO [2019-10-10 05:40:12,680] jepsen worker 4 - jepsen.util 4 :fail :cas [3 0] status code:500 error message:can't get the public key from storage
INFO [2019-10-10 05:40:12,681] jepsen worker 4 - jepsen.util 4 :invoke :write 1
...
Then the test ends with a `java.util.concurrent.ExecutionException: java.io.IOException: No such file or directory`.
@siyopao That log message appears when a certificate is not registered. Do you see a registerCertificate failure?
We changed the port for registerCertificate, so that might be related.
I can't tell. The logs go:
2019-10-10 05:40:02,932{GMT} INFO [jepsen worker 3] scalardl.cas: register a certificate and contracts
2019-10-10 05:40:03,593{GMT} INFO [jepsen worker 0] jepsen.core: Running worker 0
2019-10-10 05:40:03,594{GMT} INFO [jepsen nemesis] jepsen.core: Running nemesis
2019-10-10 05:40:03,595{GMT} INFO [jepsen worker 1] jepsen.core: Running worker 1
...
and the tests continue.
OK, hmm, maybe it's better to remove the DL test for now. Can you do that?
It is not running yet. I was just testing for https://github.com/scalar-labs/scalar/pull/277.
Oh, OK...
I guess that registerCertificate fails and the caller doesn't check its returned status, so the test keeps running without a registered certificate.
@feeblefakie Sorry for the late response.
You are right. The caller doesn't check the status of the registration.
scalar-jepsen/scalardl/src/scalardl/cas.clj
Lines 62 to 64 in b11246d
I will check the port used for registerCertificate
and add status checks for both the certificate registration and the contract registrations.
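
For illustration, the kind of status check described above could look roughly like the sketch below. It is written in plain Java rather than the Clojure of `cas.clj`, and everything in it is hypothetical: the `Response` class, the `requireOk` helper, and the assumption that a successful registration reports status 200 are stand-ins, not the actual Scalar DL client API. The idea is only that a failed registration should abort setup right away instead of letting every later operation fail with `can't get the public key from storage`.

```java
import java.util.function.Supplier;

final class RegistrationCheck {
    /** Hypothetical stand-in for whatever a registration call returns. */
    static final class Response {
        final int status;
        final String message;

        Response(int status, String message) {
            this.status = status;
            this.message = message;
        }
    }

    /** Assumed success code; the real client may use a different value. */
    static final int OK = 200;

    /**
     * Runs a registration step and throws if the returned status is not OK,
     * so that setup fails fast instead of silently continuing.
     */
    static Response requireOk(String step, Supplier<Response> call) {
        Response response = call.get();
        if (response.status != OK) {
            throw new IllegalStateException(
                step + " failed: status code:" + response.status
                    + " error message:" + response.message);
        }
        return response;
    }

    public static void main(String[] args) {
        // A successful step passes through unchanged.
        requireOk("register certificate", () -> new Response(OK, "OK"));

        // A failing step is reported immediately with its status and message.
        try {
            requireOk("register contracts",
                () -> new Response(500, "can't get the public key from storage"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the test itself, the equivalent check would wrap the certificate registration and each contract registration during setup, so a wrong port or an unreachable server shows up as a single setup error.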