mithrandie/csvq

CASE WHEN issue in 1.17.3

ondohotola opened this issue · 8 comments

Given the file p.csv :

A,B
1,2
2,3
4,5
0,0

the command in CSVQ

SELECT CASE WHEN A=0 THEN A=9 ELSE A END FROM p

results in

%!s(<nil>): cannot evaluate as a value

Can you reproduce and if so advise, please?

greetings, el

What is the result when you run query

SELECT CASE WHEN A=0 THEN 9 ELSE A END AS A FROM p

?

I haven't yet tested this query.

In SQLite, the query

DROP TABLE IF EXISTS test;
CREATE TABLE test(A INT, B INT);
INSERT INTO test VALUES (1,2),(2,3),(4,5),(0,0);
SELECT CASE WHEN A=0 THEN A=9 ELSE A END AS A FROM test;

results in

A
1
2
4
0

However, the query

DROP TABLE IF EXISTS test;
CREATE TABLE test(A INT, B INT);
INSERT INTO test VALUES (1,2),(2,3),(4,5),(0,0);
SELECT CASE WHEN A=0 THEN 9 ELSE A END AS A FROM test;

results in

A
1
2
4
9

I'm surprised that SQLite doesn't report an for the first query.

This bug occurs when using CASE WHEN expression and has been present since version 1.17.1.

CASE value WHEN… works without errors.

Thank you for reporting.

This bug has been fixed in version 1.17.5.

$ csvq -v
csvq version 1.17.5
$ csvq 'SELECT CASE WHEN A=0 THEN A=9 ELSE A END FROM p'
+---------------------------------------+
| CASE WHEN A = 0 THEN A = 9 ELSE A END |
+---------------------------------------+
| 1                                     |
| 2                                     |
| 4                                     |
|                 FALSE                 |
+---------------------------------------+
$ csvq -v
csvq version 1.17.5
$ csvq 'SELECT CASE WHEN A=0 THEN A=9 ELSE A END FROM p'
+---------------------------------------+
| CASE WHEN A = 0 THEN A = 9 ELSE A END |
+---------------------------------------+
| 1                                     |
| 2                                     |
| 4                                     |
|                 FALSE                 |
+---------------------------------------+

Are you sure that the last row shouldn't return 0? As it stands, the column is mixing integer and boolean values. For comparison, SQLite coerces the result of A = 9 to an integer (0), while PostgreSQL reports an error because the expression results in a type mismatch in the column. Like SQLite, I think csvq is less strict about data types, but in general I think it makes sense to preserve type consistency in all rows. What does the SQL standard require or recommend?