xline-kv/Xline

[Bug]: Failed to add a new member (node4) to a three-node Xline cluster


Description about the bug

Adding a member to a running Xline cluster fails: following the instructions in quick_start/README.md, `member add` on a three-node cluster reports success, but the new node panics when it starts.

Reproduction steps are as follows:

  1. Start a three-node cluster and an etcd client using quick_start.sh.
$ ./scripts/quick_start.sh 
 [INFO] stopping 
Error response from daemon: No such container: prometheus
 [INFO] stopped 
 [WARN] A Docker network named 'xline_net' is created for communication among various xline nodes. You can use the command 'docker network rm xline_net' to remove it after use. 
 [INFO] container starting 
ecdf4cce22a5ee00054802eb6478e07549f3dfb29497022930af01aa14ce2ce0
d378be09cdca037a6a5984bc411848e602c374e679bb9de1fbca0e8899ddda60
adceb1e6b9849e5a6f28e776abdf64d1dc38e4d3032b6f34d944c160a91a06ef
cf49683ddc0372fd8e9df331a479cce369d852aa161bd3e437fc7d980ef1ab9c
 [INFO] container started 
 [INFO] cluster starting 
 [INFO] command is: docker exec -e RUST_LOG=debug -d node3 /usr/local/bin/xline     --name node3     --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381     --storage-engine rocksdb     --data-dir /usr/local/xline/data-dir     --auth-public-key /mnt/public.pem     --auth-private-key /mnt/private.pem     --client-listen-urls=http://172.20.0.5:2379     --peer-listen-urls=http://172.20.0.5:2380,http://172.20.0.5:2381     --client-advertise-urls=http://172.20.0.5:2379     --peer-advertise-urls=http://172.20.0.5:2380,http://172.20.0.5:2381 
 [INFO] command is: docker exec -e RUST_LOG=debug -d node1 /usr/local/bin/xline     --name node1     --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381     --storage-engine rocksdb     --data-dir /usr/local/xline/data-dir     --auth-public-key /mnt/public.pem     --auth-private-key /mnt/private.pem     --client-listen-urls=http://172.20.0.3:2379     --peer-listen-urls=http://172.20.0.3:2380,http://172.20.0.3:2381     --client-advertise-urls=http://172.20.0.3:2379     --peer-advertise-urls=http://172.20.0.3:2380,http://172.20.0.3:2381 --is-leader 
 [INFO] command is: docker exec -e RUST_LOG=debug -d node2 /usr/local/bin/xline     --name node2     --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381     --storage-engine rocksdb     --data-dir /usr/local/xline/data-dir     --auth-public-key /mnt/public.pem     --auth-private-key /mnt/private.pem     --client-listen-urls=http://172.20.0.4:2379     --peer-listen-urls=http://172.20.0.4:2380,http://172.20.0.4:2381     --client-advertise-urls=http://172.20.0.4:2379     --peer-advertise-urls=http://172.20.0.4:2380,http://172.20.0.4:2381 
 [INFO] cluster started 
ebdba0fd66f42ac9910276ac45cfdc187f90fb6e32a51f6d2799186a4b9e4119
Prometheus starts on http://172.20.0.6:9090/graph and http://127.0.0.1:9090/graph
  2. Use etcdctl to execute the member add operation.
$ docker exec client /bin/sh -c "/usr/local/bin/etcdctl --endpoints=\"http://172.20.0.3:2379\" member add node4 --peer-urls=http://172.20.0.17:2380,http://172.20.0.17:2381"
Member 80fff8e371b58d12 added to cluster 425b5b944b259215

ETCD_NAME="node4"
ETCD_INITIAL_CLUSTER="node4=http://172.20.0.17:2380,node4=http://172.20.0.17:2381,node2=172.20.0.4:2380,node2=172.20.0.4:2381,node1=172.20.0.3:2380,node1=172.20.0.3:2381,node3=172.20.0.5:2380,node3=172.20.0.5:2381"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.20.0.17:2380,http://172.20.0.17:2381"
ETCD_INITIAL_CLUSTER_STATE="existing"
  3. Start a container for node4.
$ docker run -d -it --rm --name=node4 --net=xline_net --ip=172.20.0.17 --cap-add=NET_ADMIN --cpu-shares=1024 -m=512M -v ./scripts:/mnt ghcr.io/xline-kv/xline:latest bash
f4818b022e21351ddea240d1c974db056363f4bc9247c163c70531ecf284147e
  4. Start a new xline node inside the node4 container.
$ docker exec -it node4 /bin/bash
root@f4818b022e21:/# /usr/local/bin/xline --name node4 --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381,node4=172.20.0.17:2380,172.20.0.17:2381 --storage-engine rocksdb --data-dir /usr/local/xline/data-dir --auth-public-key /mnt/public.pem --auth-private-key /mnt/private.pem --client-listen-urls=http://172.20.0.17:2379 --peer-listen-urls=http://172.20.0.17:2381,http://172.20.0.17:2380 --client-advertise-urls=http://172.20.0.17:2379 --peer-advertise-urls=http://172.20.0.17:2381,http://172.20.0.17:2380 --initial-cluster-state=existing
thread 'main' panicked at 'self_id should not be 0', /home/jiawei/Xline/crates/curp/src/members.rs:155:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Step 4 is where it fails: xline panics on startup with 'self_id should not be 0', as shown in the output above.
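For anyone reproducing this: step 2 prints ETCD_INITIAL_CLUSTER as one name=url pair per peer URL, while the --members flag in steps 1 and 4 groups all addresses under a single name and omits the scheme. The sketch below converts one form into the other so the step 4 command can be built from the etcdctl output instead of by hand. `to_xline_members` is a hypothetical helper, not part of Xline or etcdctl, and the target format is only inferred from the commands in this report. It is not a fix for the panic, and I have not confirmed that the --members value is related to it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xline): regroup etcdctl's
# ETCD_INITIAL_CLUSTER value into the --members form used in this report,
# i.e. name=addr1,addr2,... with any http:// or https:// prefix dropped.
to_xline_members() {
  printf '%s\n' "$1" | tr ',' '\n' | awk -F= '
    {
      name = $1
      addr = $2
      sub(/^https?:\/\//, "", addr)   # strip the URL scheme if present
      if (!(name in seen)) {
        seen[name] = 1
        order[++n] = name             # remember first-seen order of names
        urls[name] = addr
      } else {
        urls[name] = urls[name] "," addr
      }
    }
    END {
      for (i = 1; i <= n; i++) {
        printf "%s%s=%s", (i > 1 ? "," : ""), order[i], urls[order[i]]
      }
      printf "\n"
    }'
}

# Value copied from the etcdctl output in step 2.
initial_cluster="node4=http://172.20.0.17:2380,node4=http://172.20.0.17:2381,node2=172.20.0.4:2380,node2=172.20.0.4:2381,node1=172.20.0.3:2380,node1=172.20.0.3:2381,node3=172.20.0.5:2380,node3=172.20.0.5:2381"

to_xline_members "$initial_cluster"
```

The result can be passed as `--members "$(to_xline_members "$initial_cluster")"`. Names are emitted in the order etcdctl lists them (node4 first here), which differs from the order I typed in step 4; whether ordering matters to xline is not something I have checked.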

Version

0.6.1 (Default)

Relevant log output

No response
