nats-io/nats-streaming-server

Can a bad state NATS cluster be bootstrapped with new config?

mnjdhl opened this issue · 5 comments

My 3-node NATS cluster is in a bad state. It was previously a 6-node cluster and was recently trimmed down to 3 nodes, but meta.json still lists all 6 members. Leader election fails because it expects 4 votes but gets only 3; the other 3 stale nodes do not have NATS running. All 3 active members are in either the Follower or the Candidate state, so with no leader, attempts to remove the 3 stale nodes do not work.

Is there any way to bootstrap this new cluster with the newer configuration?

In order to help you better, would you please clarify if you are talking about NATS Streaming or JetStream? There is no meta.json in NATS Streaming...

But I believe that the solution would be quite similar: you need to start a 4th node to allow election (and then deletion of the old nodes).
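
The arithmetic behind that suggestion can be sketched as follows. This is only an illustration of the usual strict-majority rule, not code from NATS Streaming; the function names `quorum` and `extra_nodes_needed` are made up for this example.

```python
def quorum(members):
    """Votes needed for a strict majority of `members` configured voters."""
    return members // 2 + 1


def extra_nodes_needed(configured, alive):
    """Additional reachable voters required before an election can succeed."""
    return max(0, quorum(configured) - alive)


# Numbers from this issue: 6 configured members, 3 of them still running.
configured, alive = 6, 3
print("votes required:", quorum(configured))
print("extra nodes needed:", extra_nodes_needed(configured, alive))
```

With 6 configured members the majority is 4, so 3 running nodes are one vote short.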

I am talking about "NATS Streaming" and I have found the existence of meta.json in it.

Again, not sure what that file is. Could you display the content of it? I do not believe that NATS Streaming is creating this file.

The following is the content of meta.json (located at log/442e19d3-faca-41e5-a32c-803a72d0e4ab/snapshots/12926-115914-1666846843907/ ):

{
  "Version": 1,
  "ID": "12926-115914-1666846843907",
  "Index": 115914,
  "Term": 12926,
  "Peers": "ltoAbjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYi40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWIuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFi2gBuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFiLmRlOWJjZTk3LWE4MGItNDBhNy04NWRmLTk1ZmI4YWYxNDAzYy40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWLaAG40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWIuZWE2NzY2ZmYtMTY3NC00MTI3LTkxZGQtOTk0YmFlNjgwNWFiLjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYtoAbjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYi5jZjEwNjgzZi1mOGY0LTRiODctOTZiYi00NjM2ZDk5MTRjNWMuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFi2gBuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFiLjRkN2FjMjdhLTRhZmQtNDM2My1iM2VhLWJlM2I2NThlZjRkZS40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWLaAG40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWIuNDhiODdjNGEtYTA4Yy00YTkyLTk2YjQtMWU3MzZmNzIzNjk3LjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYg==",
  "Configuration": {
    "Servers": [
      {
        "Suffrage": 0,
        "ID": "442e19d3-faca-41e5-a32c-803a72d0e4ab",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.442e19d3-faca-41e5-a32c-803a72d0e4ab.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "de9bce97-a80b-40a7-85df-95fb8af1403c",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.de9bce97-a80b-40a7-85df-95fb8af1403c.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "ea6766ff-1674-4127-91dd-994bae6805ab",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.ea6766ff-1674-4127-91dd-994bae6805ab.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "cf10683f-f8f4-4b87-96bb-4636d9914c5c",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.cf10683f-f8f4-4b87-96bb-4636d9914c5c.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "4d7ac27a-4afd-4363-b3ea-be3b658ef4de",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.4d7ac27a-4afd-4363-b3ea-be3b658ef4de.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "48b87c4a-a08c-4a92-96b4-1e736f723697",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.48b87c4a-a08c-4a92-96b4-1e736f723697.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      }
    ]
  },
  "ConfigurationIndex": 3191,
  "Size": 66540,
  "CRC": "DwoakjxGDMM="
}
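
A quick way to confirm what the snapshot metadata says about membership is to count the entries under Configuration.Servers. The sketch below embeds a trimmed copy of the JSON above (only the fields it reads) so that it runs on its own. It assumes that "Suffrage": 0 marks a voting member, which matches the hashicorp/raft convention but is not stated anywhere in this thread; the helper name `voter_ids` is hypothetical.

```python
import json

# Trimmed copy of the meta.json shown above: only the fields this sketch reads.
META_JSON = """
{
  "Index": 115914,
  "Term": 12926,
  "Configuration": {
    "Servers": [
      {"Suffrage": 0, "ID": "442e19d3-faca-41e5-a32c-803a72d0e4ab"},
      {"Suffrage": 0, "ID": "de9bce97-a80b-40a7-85df-95fb8af1403c"},
      {"Suffrage": 0, "ID": "ea6766ff-1674-4127-91dd-994bae6805ab"},
      {"Suffrage": 0, "ID": "cf10683f-f8f4-4b87-96bb-4636d9914c5c"},
      {"Suffrage": 0, "ID": "4d7ac27a-4afd-4363-b3ea-be3b658ef4de"},
      {"Suffrage": 0, "ID": "48b87c4a-a08c-4a92-96b4-1e736f723697"}
    ]
  }
}
"""


def voter_ids(meta):
    """Return the IDs of servers recorded as voters in the snapshot metadata."""
    # Assumption: Suffrage 0 means "voter"; other values are skipped.
    servers = meta["Configuration"]["Servers"]
    return [s["ID"] for s in servers if s["Suffrage"] == 0]


meta = json.loads(META_JSON)
voters = voter_ids(meta)
print("configured voters:", len(voters))
print("votes needed for a majority:", len(voters) // 2 + 1)
for server_id in voters:
    print(" ", server_id)
```

To run it against a real file, replace `json.loads(META_JSON)` with `json.load(open(path))`, where `path` points at the snapshot's meta.json.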

I see. Well, my previous answer stands. Try to start an extra node so that quorum can be reached and a leader elected.