ing-bank/cassandra-jdbc-wrapper

Handle multiple host:port in the JDBC connection string

maximevw opened this issue · 8 comments

Discussed in #40

Originally posted by stefanofornari November 13, 2023
Hello, it is mentioned several time the connection string is something like:

jdbc:cassandra://host1--host2--host3:9042/keyspace

What if nodes have different ports? can the below be used?

jdbc:cassandra://host1:port1--host2:port2--host3:port3/keyspace

Do you think it is a candidate for next release? If you give me some directions I can maybe help (integration test aside :) )

@stefanofornari I think it's possible to integrate this for the next release (I hope it'll be ready for end of November/beginning of December).

I think there is an important (and maybe a bit touchy) refactoring to do in JdbcUrlUtil.parseUrl(). Inserting ports for each host certainly make the URI invalid. So, it requires to be very careful to validate the JDBC URL properly.
Edit: I think we can do something smart enough by using URI.getAuthority() method, splitting the result with -- then build a new property TAG_CONTACT_POINTS which is a list of strings host:port re-usable in SessionHolder... just an idea to explore.

Then, this part of SessionHolder has to be re-written.

I didn't go further in the analysis for now.

sualeh commented

@maximevw Suggestion:

Instead of:

jdbc:cassandra://host1:port1--host2:port2--host3:port3/keyspace

how about keeping the host:port convention standard, and allowing the other hosts to be specified in connection properties - for example:

jdbc:cassandra://host1:port1/keyspace?hosts=host2:port2--host3:port3

Alternatively, on the line of @sualeh's comment:

jdbc:cassandra://host1:port1;host2:port2;host3:port3.../keyspace

Thank you both @stefanofornari and @sualeh for your sugggestions.

I prefer the syntax originally suggested: jdbc:cassandra://host1:port1--host2:port2.../keyspace.

Why?

For now, keeping the separator "--" originally used will avoid to introduce a breaking change for users already using the syntax "host1--host2:port". Using "," or ";" as separator could be a good idea too, but I suggest to keep the current syntax, at least for short term.

Separating the additional contact points in a dedicated parameter could be a solution, but it splits the information in two places and IMHO, it makes it less readable. Moreover, it doesn't make sense to separate a particular contact point from the others (given how Cassandra works). I prefer to keep them together. Also, I remembered, for example, the syntax used by PostgreSQL JDBC driver for fail-over is: "host:port, host:port, ..." (see connection fail-over documentation). Note, other drivers do the same, like multiple hosts with MySQL. Consequently, I think it's better to keep all the hosts at the same place in the URL.

sualeh commented

@maximevw Thanks for the links. There seems to be some precedent for doing it the way you suggest that I was not aware of. I guess we should think of this as a connection string, rather than as a "connection URL" as JDBC tries to think of it.

Implemented as discussed, by commit 553010a.
Documentation updated here with examples of valid JDBC URLs: JDBC wrapper usage
This feature will be available in the next release (4.11.0).

Thanks again @stefanofornari for pointing out this particular case.

Great job! Well, thanks to you for such good piece of software and to promptly address issues!