thelastpickle/cassandra-reaper

setting `blacklistTwcsTables: false` does not include twcs tables in subsequent runs

For various reasons (even though it is not recommended in Reaper's documentation) I want to repair some TWCS tables on my cluster. I read the documentation and changed the following setting in /etc/cassandra-reaper/cassandra-reaper.yaml:

blacklistTwcsTables: false

After that, I restarted the cassandra-reaper service on every node in the cluster. I don't think it matters here (see below), but Reaper is installed in sidecar mode. I restarted several times, checked the config files, and verified the path used on the Java command line in my process list.
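For reference, the relevant fragment of my cassandra-reaper.yaml now reads (all other settings omitted; only this key was changed from the packaged default):

```yaml
# Whether Reaper should exclude TWCS/DTCS tables from repairs.
# The example config files ship with this set to true; I changed it to false.
blacklistTwcsTables: false
```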

But I still can't get Reaper to include the TWCS tables for my application keyspace. I have tried forcing a run of an existing schedule, creating a new schedule, and running a repair manually from the repair section of the GUI; the TWCS tables are never included.

I'm far from a Java expert and may be unaware of some functionality in the libraries/frameworks the project uses to load configuration, but my impression is that this setting is never used anywhere in the code:

  • I found the setBlacklistTwcsTables setter declared in the ReaperApplicationConfiguration class.
  • This setter has usages in the project, but only in test classes, which always set the value to true (the default in all example config files).
  • In any case, it isn't called anywhere in the ReaperApplicationConfigurationBuilder class, where I can see other config setters being invoked.

Unless I missed something, it looks like this needs to be fixed, but I'm not fluent enough in Java to propose a clean PR.

In case I'm wrong, how can I make reaper obey my desired configuration?

Thanks.

Hi @zeitounator,
the setter isn't meant to be called anywhere in the application code; it's a utility method for Dropwizard's configuration system, which binds the YAML key to it when the config file is deserialized.
What you'll be interested in is where this value is actually read: https://github.com/thelastpickle/cassandra-reaper/blob/eclipse-store/src/server/src/main/java/io/cassandrareaper/service/RepairUnitService.java#L152-L163

This is where Reaper checks the setting to decide whether or not to filter out the TWCS tables when creating a repair run.
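In rough terms, the decision at that spot boils down to something like the following sketch (class, method, and parameter names are illustrative, not Reaper's actual API):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch of the TWCS/DTCS blacklisting decision; this is not
// Reaper's actual code, just the shape of the check in RepairUnitService.
class TwcsFilterSketch {

  private static final String TWCS = "TimeWindowCompactionStrategy";
  private static final String DTCS = "DateTieredCompactionStrategy";

  // tableCompaction maps table name -> compaction strategy class name.
  // When blacklistTwcs is true, TWCS/DTCS tables are dropped from the run.
  static List<String> tablesForRepair(Map<String, String> tableCompaction, boolean blacklistTwcs) {
    return tableCompaction.entrySet().stream()
        .filter(e -> !blacklistTwcs
            || (!e.getValue().endsWith(TWCS) && !e.getValue().endsWith(DTCS)))
        .map(Map.Entry::getKey)
        .sorted()
        .collect(Collectors.toList());
  }
}
```

With blacklistTwcsTables set to false the predicate always passes, so TWCS tables should land in the repair run along with everything else.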

Sadly, I cannot reproduce the case you describe. I created a cluster locally with a keyspace containing two tables, one using STCS and the other TWCS.
When I start Reaper with blacklistTwcsTables set to false, repair runs created both manually and through schedules select both tables:
(screenshot: repair run including both the STCS and TWCS tables)
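For reference, my test keyspace looked roughly like this (keyspace and table names are mine, not taken from your cluster):

```sql
CREATE KEYSPACE repro WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};

-- One STCS table and one TWCS table, to exercise the filter.
CREATE TABLE repro.t_stcs (key text PRIMARY KEY, val text)
    WITH compaction = {'class': 'SizeTieredCompactionStrategy'};

CREATE TABLE repro.t_twcs (key text PRIMARY KEY, val text)
    WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                       'compaction_window_size': '1',
                       'compaction_window_unit': 'DAYS'};
```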

Then setting it to true and restarting Reaper will correctly apply the filter:
(screenshot: repair run with the TWCS table filtered out)

I'd need to know which versions of Cassandra and Reaper you're using, along with some screenshots and table schemas, to assess the situation and see if I can reproduce the issue.

Hi @adejanovski. Thanks for getting back so quickly. I'm currently off work with very limited connectivity, but I'll be back next week, look at all the references, and provide the required information. See you.

Hi @adejanovski, here is the information requested.

Cassandra version: 3.11.14

Reaper version: 3.3.4

The schema is as follows. Column/table names have been redacted where necessary and default table options removed for legibility, but this accurately reproduces my environment.

DESC enveloop

CREATE KEYSPACE enveloop WITH replication = {'class': 'NetworkTopologyStrategy', 'my_local_dc': '3'}  AND durable_writes = true;

CREATE TABLE enveloop.t_enveloppe (
    key text PRIMARY KEY,
    "field1" text,
    "field2" text,
    "field3" text,
    "field4" text,
    "field5" text,
    "field6" text,
    "field7" text,
    field8 text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4'};
CREATE INDEX t_enveloppe_field4_idx ON enveloop.t_enveloppe ("field4");
CREATE INDEX t_enveloppe_field5_idx ON enveloop.t_enveloppe ("field5");
CREATE INDEX t_enveloppe_field1_idx ON enveloop.t_enveloppe ("field1");
CREATE INDEX t_enveloppe_field6_idx ON enveloop.t_enveloppe ("field6");
CREATE INDEX t_enveloppe_field7_idx ON enveloop.t_enveloppe ("field7");

CREATE TABLE enveloop.t_migrations_plugin (
    key text PRIMARY KEY,
    "migrationId" text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'};

CREATE TABLE enveloop.t_bordereau_special (
    key text PRIMARY KEY,
    "filed1" text,
    "filed2" text,
    filed3 text,
    "filed4" text,
    "filed5" text,
    filed6 map<text, text>,
    filed7 text,
    filed8 text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4'};

CREATE TABLE enveloop.t_cle_controle (
    key text PRIMARY KEY,
    "field1" text,
    "field2" text,
    "field3" text,
    "field4" text,
    field5 text,
    "field6" text
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'};

CREATE TABLE enveloop.t_bordereau (
    key text PRIMARY KEY,
    "filed1" text,
    "filed2" text,
    "filed3" text,
    "filed4" text,
    filed5 map<text, text>
) WITH compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4'};
CREATE INDEX t_bordereau_filed1_idx ON enveloop.t_bordereau ("filed1");
CREATE INDEX t_bordereau_filed4_idx ON enveloop.t_bordereau ("filed4");
CREATE INDEX t_bordereau_filed3_idx ON enveloop.t_bordereau ("filed3");
CREATE INDEX t_bordereau_filed2_idx ON enveloop.t_bordereau ("filed2");
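To double-check which tables actually use TWCS, the compaction class per table can be read straight from system_schema on Cassandra 3.x:

```sql
-- Lists the compaction options for every table in the keyspace.
SELECT table_name, compaction
FROM system_schema.tables
WHERE keyspace_name = 'enveloop';
```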

I originally started Reaper on each of the 12 nodes with the setting at true, and made the modification after finding out that only 2 tables were being repaired in the above keyspace. Since then I have restarted Reaper everywhere, as reported above, but I still see only two tables repaired on each run, as you can see on the following history screenshot with the table names in the hover bubble:
(screenshot: repair run history showing only two tables repaired)

This is the schedule in place for that keyspace:
(screenshots: the repair schedule defined for the enveloop keyspace)

Thanks for your support.