ncats/CTSA-Metrics

InformaticsMetricACTSQLServer.sql - Datetime compared to Date strings - decreases results erroneously

Closed this issue · 0 comments

START_DATE in OBSERVATION_FACT is datetime (timestamp) and is it compared using the between operator in several places throughout the script to DATE literals. This will not implicitly cast START_DATE to date prior to the comparison. The code as written is missing all encounters beyond midnight on Dec 31 (12/31/2020 00:00:00). You can run the following two statements and see the counts increase by a small amount. It is minimal in our warehouse, but inaccurate nonetheless.

Demo:
SELECT COUNT(DISTINCT OBS.ENCOUNTER_NUM) CNT
FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '01-01-2016' AND '12-31-2020'

Counts increase when explicitly converting datetime to date prior to between:
SELECT COUNT(DISTINCT OBS.ENCOUNTER_NUM) CNT
FROM OBSERVATION_FACT OBS WHERE CONVERT(DATE,OBS.START_DATE) BETWEEN '01-01-2016' AND '12-31-2020'

Below are a few references:

FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '2020/01/01' AND '2020/12/31'

FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '01-01-2016' AND '12-31-2020'

FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '01-01-2020' AND '12-31-2020'

Simple fix, find-replace OBS.START_DATE BETWEEN WITH CONVERT(DATE,OBS.START_DATE) BETWEEN