Orange-OpenSource/casskop

[RFE] Provide better support for adding data centers


It is already possible to add a data center to an existing CassandraCluster object, and CassKop will deploy the new Cassandra nodes. While this is obviously a necessary step for adding a data center, it is not sufficient.

Some steps need to be done by client applications, which CassKop will not manage. See #85 for discussion on notifying client applications of topology changes.

For now, let's assume client applications already use local consistency levels (e.g., LOCAL_ONE) for queries and that NetworkTopologyStrategy is used for replication.

Here is a very brief summary of what should happen (a rough sketch of the corresponding commands follows the list):

  • Disable any scheduled repairs
  • Deploy C* nodes in new DC
  • Update cassandra-rackdc.properties
  • Modify replication settings to include new DC
    • If cassandra auth is used, then replication for the system_auth keyspace will have to be updated as well
  • Wait for all nodes in the new DC to reach the UN (Up/Normal) state as reported by nodetool status
  • Run rebuild on each node in the new DC.
    • Do not run rebuild on all nodes concurrently. Run rebuild on at most two nodes concurrently to be safe.
  • Once all rebuilds finish, make sure that all nodes in the new DC are up
  • Re-enable repairs
  • Notify clients of the new DC
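
For concreteness, here is a rough sketch of the manual commands behind the replication, status, and rebuild steps above. The keyspace name, DC names, and replication factors are illustrative assumptions, not anything CassKop provides today:

    # Include the new DC in replication for each application keyspace, and for
    # system_auth if Cassandra auth is enabled (names and RFs are illustrative).
    cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};"
    cqlsh -e "ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};"

    # Wait until every node in the new DC reports UN (Up/Normal).
    nodetool status

    # Then, on each node in the new DC (at most two at a time), stream data
    # from the existing DC.
    nodetool rebuild -- dc1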

For the sake of discussion, let's say we have Reaper for repairs and it is managed by an operator. Using the notification mechanism discussed in #85, the Reaper operator can receive notifications for disabling and enabling repairs.
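
As a purely hypothetical illustration, the action behind those notifications, pausing and resuming a schedule through Reaper's REST API, could look like the following; the host, port, and schedule id are assumptions, so check Reaper's API documentation for the exact endpoints:

    # Pause the schedule before the new DC is added...
    curl -X PUT "http://reaper:8080/repair_schedule/<schedule-id>?state=PAUSED"
    # ...and resume it once the rebuilds are done.
    curl -X PUT "http://reaper:8080/repair_schedule/<schedule-id>?state=ACTIVE"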

@jsanda we don't have any cassandra-rackdc.properties as we use GPFS snitch. Applications that use the native protocol can also subscribe to notifications like schema changes, cluster changes, etc.

I should also add that we have tried to keep any Cassandra library out of the operator for now, so as not to depend on any particular version. We would have to check whether those changes can be done via Jolokia calls.
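
For what it's worth, the JMX operations involved here are reachable over plain HTTP when a Jolokia agent runs next to each node, so no Cassandra library is needed. A sketch of an exec request that triggers a rebuild, assuming the default Jolokia agent port and an illustrative source DC name:

    # POST a Jolokia "exec" request to invoke StorageService.rebuild(String)
    # on the local node, streaming from the existing DC "dc1".
    curl -X POST http://<node>:8778/jolokia/ -d '{
      "type": "exec",
      "mbean": "org.apache.cassandra.db:type=StorageService",
      "operation": "rebuild(java.lang.String)",
      "arguments": ["dc1"]
    }'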

In another ticket we talked about having events pushed onto a bus, but that would mean not having to wait for a reply. It seems that here we would have to talk to the repair tool (cassandra-reaper in our case) and use its protocol.

we don't have any cassandra-rackdc.properties as we use GPFS snitch.

GossipingPropertyFileSnitch uses the cassandra-rackdc.properties file. If the file is not found, startup will fail with an error like this:

ERROR [main] 2019-08-04 13:24:00,499 CassandraDaemon.java:749 - Exception encountered during startup
org.apache.cassandra.exceptions.ConfigurationException: Error instantiating snitch class 'org.apache.cassandra.locator.GossipingPropertyFileSnitch'.
        at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:596) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:574) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:1045) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.config.DatabaseDescriptor.applySnitch(DatabaseDescriptor.java:969) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.config.DatabaseDescriptor.applyAll(DatabaseDescriptor.java:324) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:148) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:132) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.service.CassandraDaemon.applyConfig(CassandraDaemon.java:665) [apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:609) [apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:732) [apache-cassandra-3.11.4.jar:3.11.4]
Caused by: org.apache.cassandra.exceptions.ConfigurationException: DC or rack not found in snitch properties, check your configuration in: cassandra-rackdc.properties

Applications that use the native protocol can also subscribe to notifications like schema changes, cluster changes, etc.

Yes, that is true; however, I am not referring to client applications here, and by client I mean a client of the Cassandra cluster. Reaper is just one example and maybe not the best one. Reaper requires JMX access, but it does not require access over the native protocol. A better example would be some other service, maybe even part of a CI pipeline, that will deploy a Cassandra client application when the new DC is ready.

I should also add that we have tried to keep any Cassandra library out of the operator for now, so as not to depend on any particular version. We would have to check whether those changes can be done via Jolokia calls.

I don't want to sidetrack the discussion in this ticket, but I will offer this: I think it is reasonable to require a minimum version of 3.11.4 for the operator unless and until there are concrete upgrade use cases that need to be supported; otherwise, supporting arbitrary versions can make both the design and implementation of the operator unnecessarily complex.

In another ticket we talked about having events pushed onto a bus, but that would mean not having to wait for a reply. It seems that here we would have to talk to the repair tool (cassandra-reaper in our case) and use its protocol.

Can you please provide a link to the ticket? I am certainly not advocating that an event bus be a hard dependency of the operator. I think it would be good to consider using open standards, like CloudEvents. I think a great starting point would be the ability to specify an HTTP endpoint as part of a CassandraCluster. The operator would then send notifications to that endpoint for interesting events. This would minimize external dependencies for the operator while still providing plenty of flexibility.
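
To make that concrete, here is a minimal sketch of what such a notification could look like, assuming the CloudEvents HTTP binding; the endpoint, event type, and payload fields are all illustrative:

    # The operator POSTs an event to the endpoint configured on the
    # CassandraCluster when the new DC becomes ready.
    curl -X POST http://listener.example.com/events \
      -H "ce-specversion: 1.0" \
      -H "ce-type: com.example.cassandracluster.datacenter.added" \
      -H "ce-source: /casskop/cassandra-demo" \
      -H "ce-id: 42" \
      -H "content-type: application/json" \
      -d '{"cluster": "cassandra-demo", "datacenter": "dc2", "status": "ready"}'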

I don't want to get too far ahead of myself, but I would be interested in prototyping a Knative event source that makes use of the HTTP endpoint functionality I described.

GossipingPropertyFileSnitch uses the cassandra-rackdc.properties file. If the file is not found, startup will fail with an error like this:

nvm, I don't know why but I thought I had read cassandra-topology.properties, the file used by PropertyFileSnitch. Yes, this file is needed, but it will never be updated, as it contains information for the local node only, and a node will never be moved to another rack.
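
For reference, the file only names the local node's own DC and rack, along these lines (the values are illustrative):

    # cassandra-rackdc.properties, written once at node startup
    dc=dc2
    rack=rack1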

Can you please provide a link to the ticket?

That's the one you linked the current issue to.

I went back and read my initial comment. I'm not sure why I included updating cassandra-rackdc.properties as a step. The rest of the steps are still applicable, and some of them, like disabling repairs and changing replication, will need to be performed by other actors.

Implementing some form of notifications and/or workflow and performing the rebuilds would make the use case of adding a DC a lot more robust.

Note that updating the replication and performing rebuilds are necessary steps for adding a new DC. We do not want to start doing rebuilds until the replication has been updated.
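
A hypothetical guard for that ordering, checking that a keyspace already replicates to the new DC before starting a rebuild on one of its nodes (the keyspace and DC names are illustrative):

    # Only rebuild if the replication map for the keyspace already names 'dc2'.
    cqlsh -e "SELECT replication FROM system_schema.keyspaces WHERE keyspace_name = 'my_keyspace';" | grep -q "'dc2'" && nodetool rebuild -- dc1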