Misleading Date Format Pattern for CassandraSourceConnector
jp-9 opened this issue · 0 comments
Issue Guidelines
Please review these questions before submitting any issue?
What version of the Stream Reactor are you reporting this issue for?
Are you running the correct version of Kafka/Confluent for the Stream reactor release?
Yes
Do you have a supported version of the data source/sink, i.e. Cassandra 3.0.9?
Yes
Have you read the docs?
Yes
What is the expected behaviour?
The CassandraDateFormatter uses the date format pattern: "yyyy-MM-dd HH:mm:ss.SSS'Z'"
import java.text.SimpleDateFormat
import java.util.Date

class CassandraDateFormatter {
  // The quoted 'Z' is matched as a literal character, not as a time zone designator.
  private val dateFormatPattern = "yyyy-MM-dd HH:mm:ss.SSS'Z'" // <----- Hardcoded pattern

  def parse(date: String): Date = {
    val dateFormatter = new SimpleDateFormat(dateFormatPattern)
    dateFormatter.parse(date)
  }

  def format(date: Date): String = {
    val dateFormatter = new SimpleDateFormat(dateFormatPattern)
    dateFormatter.format(date)
  }

  // toIntOption requires Scala 2.13 or later.
  def getYear(date: Date): Option[Int] = {
    val dateFormatter = new SimpleDateFormat("yyyy")
    dateFormatter.format(date).toIntOption
  }
}
When setting my initial offset, I want to specify it in UTC, so intuitively I would write something like this:
connect.cassandra.initial.offset=2022-12-22 18:00:00.000Z
<--- the Z at the end usually indicates that this is a UTC+00 date.
However, the format pattern that is actually implemented is a bit misleading. The date string must end in a Z in order to be parsed at all, but because the Z in the format pattern is in quotes, it is treated as a literal character and plays no part in determining the time zone. The rest of the string is interpreted in the JVM's default time zone. If we want the Z at the end to indicate UTC time, the format has to be "yyyy-MM-dd HH:mm:ss.SSSX" (https://docs.oracle.com/en/java/javase/12/docs/api/java.base/java/text/SimpleDateFormat.html)
An example to illustrate my point, assuming I am in UTC-05:00 (Eastern Standard Time).
According to ISO 8601, "2022-12-22 12:00:00.000Z" should be Thu Dec 22 07:00:00 EST 2022.
>> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS'Z'").parse("2022-12-22 12:00:00.000Z")
Output:
❌ Thu Dec 22 12:00:00 EST 2022
>> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSX").parse("2022-12-22 12:00:00.000Z")
Output:
✔ Thu Dec 22 07:00:00 EST 2022
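The comparison above depends on the time zone of the machine it runs on. A minimal sketch that should reproduce it anywhere is to set the formatter's time zone explicitly and compare epoch milliseconds. The object name `QuotedZDemo` and the helper `parseMillis` are made up for this illustration and are not part of Stream Reactor.

```scala
import java.text.SimpleDateFormat
import java.util.TimeZone

// Hypothetical demo object, not part of Stream Reactor.
object QuotedZDemo {
  // Parses `input` with `pattern`, using `zoneId` as the formatter's zone
  // (standing in for the JVM default), and returns epoch milliseconds.
  def parseMillis(pattern: String, input: String, zoneId: String): Long = {
    val formatter = new SimpleDateFormat(pattern)
    formatter.setTimeZone(TimeZone.getTimeZone(zoneId))
    formatter.parse(input).getTime
  }

  def main(args: Array[String]): Unit = {
    val input = "2022-12-22 12:00:00.000Z"
    val zone  = "America/New_York" // UTC-05:00 on this date

    // Quoted 'Z': the letter is only matched literally, so the zone above applies.
    val literalZ = parseMillis("yyyy-MM-dd HH:mm:ss.SSS'Z'", input, zone)
    // X: the trailing Z is parsed as the UTC designator and overrides the zone above.
    val isoZone = parseMillis("yyyy-MM-dd HH:mm:ss.SSSX", input, zone)

    val diffHours = (literalZ - isoZone) / (60L * 60L * 1000L)
    println(s"difference in hours: $diffHours")
  }
}
```

With the quoted pattern, the parsed instant moves whenever the zone changes. With the X pattern it stays fixed at 12:00 UTC.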
Was this design intentional? Is there a way to set the initial offset in Zulu Time?
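For discussion, here is a rough sketch of what a UTC-aware variant could look like. It assumes the only change needed is the pattern plus an explicit UTC zone on the formatter. The class name `UtcDateFormatter` is hypothetical and this is not the existing Stream Reactor code. I have not checked how it would interact with offsets the connector has already stored.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

// Hypothetical variant for illustration only, not the Stream Reactor implementation.
class UtcDateFormatter {
  // X parses a trailing Z as the UTC designator instead of a literal character.
  private val dateFormatPattern = "yyyy-MM-dd HH:mm:ss.SSSX"

  // SimpleDateFormat is not thread-safe, so build a fresh instance per call,
  // as the original class does.
  private def newFormatter(): SimpleDateFormat = {
    val formatter = new SimpleDateFormat(dateFormatPattern)
    // Format in UTC so that a zero offset is written back out as "Z".
    formatter.setTimeZone(TimeZone.getTimeZone("UTC"))
    formatter
  }

  def parse(date: String): Date = newFormatter().parse(date)

  def format(date: Date): String = newFormatter().format(date)
}
```

With this sketch, parsing and then formatting "2022-12-22 12:00:00.000Z" should give back the same string regardless of the JVM's default time zone.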
** Edit: Accidentally included some of my test code in the CassandraDateFormatter