basho/riak_dt

Add tests for riak_dt to riak_test

russelldb opened this issue · 15 comments

riak_dt will need testing. Add tests to riak_test.

Please see basho/riak_test#56 for a new commit.

And you'll need #40 for the tests to pass.

Mine keeps failing waiting for the cluster to heal.

15:17:56.569 [warning] verify_counter_converge failed: {{assertEqual_failed,[{module,rt},{line,340},{expression,"wait_until ( Node , F )"},{expected,ok},{value,fail}]},[{rt,'-wait_until_connected/1-fun-1-',3,[{file,"src/rt.erl"},{line,340}]},{rt,'-wait_until_connected/1-lc$^0/1-0-',2,[{file,"src/rt.erl"},{line,340}]},{rt,wait_until_connected,1,[{file,"src/rt.erl"},{line,340}]},{rt,heal,1,[{file,"src/rt.erl"},{line,189}]},{verify_counter_converge,confirm,0,[{file,"tests/verify_counter_converge.erl"},{line,48}]}]}
15:17:56.569 [error] Error in process <0.82.0> on node 'riak_test@127.0.0.1' with exit value: {{assertEqual_failed,[{module,rt},{line,340},{expression,"wait_until ( Node , F )"},{expected,ok},{value,fail}]},[{rt,'-wait_until_connected/1-fun-1-',3,[{file,"src/rt.erl"},{line,340}]},{rt,'-wait_until_connected/1-lc$^0/1-0-'... 

Do you have PR #40?

Yes, merged that into riak_dt.

Then I'm seriously confused, 'cos I can't get it to fail. Will look at it tomorrow; I've already shut my big box down.

I'll give it another go, checking out my build.

Worry not, mate, I will figure it out.

@rzezeski and I have a suspicion: setting the cookie on the nodes being partitioned makes them inaccessible to the riak_test node, meaning that the rpc:call to reconnect won't work, or will break the connection to the nodes that didn't have their cookies changed.

Maybe we could send the node watcher a "node down" for the nodes to be partitioned off?

Maybe, I don't know. It works for me. There is no disconnect call between riak_test and the first partition so they still communicate (in my experience.)

Will try what you suggest, but it is not a real partition then.

If your and Ryan's suspicion is correct, why doesn't the test fail here https://github.com/basho/riak_test/pull/56/files#L0R188 rather than in wait_until_connected?

Weird, I finally get this failing now, after updating to riak master. OK, now I know it is my problem, re blocking card.

OK, the reason the cluster never healed is that the 'riak_test' node was not a hidden node, so the list of connected nodes checked during the partition heal always had one element too many.
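To illustrate the failure mode, here is a minimal sketch, not the actual rt.erl code. It assumes the heal check compares each node's visible peers (what `erlang:nodes()` returns, which excludes hidden nodes) against the expected cluster members. The module name `heal_check` and both function names are hypothetical.

```erlang
-module(heal_check).
-export([connected_naive/2, connected_ignoring/3]).

%% Peers:   the cluster nodes a given node is expected to see (excluding itself).
%% Visible: what erlang:nodes() would return on that node.

%% Naive check: the visible nodes must match the expected peers exactly.
%% A non-hidden test node adds one extra element, so this never returns true.
connected_naive(Peers, Visible) ->
    lists:sort(Peers) =:= lists:sort(Visible).

%% Tolerant check: drop known non-cluster nodes (hypothetical Ignore list)
%% before comparing.
connected_ignoring(Peers, Visible, Ignore) ->
    lists:sort(Peers) =:= lists:sort(Visible -- Ignore).
```

With peers `['dev2@127.0.0.1', 'dev3@127.0.0.1']` and a visible list that also contains `'riak_test@127.0.0.1'`, the naive check fails while the tolerant one passes. Starting the test node with the `-hidden` flag would keep it out of `erlang:nodes()` in the first place.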

Ideally riak_test would be hidden; instead, I made a commit to basho/riak_test#56 that resolves the issue.

Please pull that and retest this.

👍 wfm now