harelba/q

ifnull doesn't work for me

pwright opened this issue · 4 comments

select ifnull(column,0) from

doesn't work for me, never sees the null,
but I wonder if it's related to format, I'm using -d ,

finp, I may be wrong, but so far I don't think q supports NULL. It looks like it reports it as an empty string. (I'm not sure what happens to fields which look like integers, I didn't think of trying that.)

I see this line suggests integer nulls are supported https://github.com/harelba/q/blob/master/test/test-suite#L1530
But this seems like a restriction (whatever it is) worth documenting?

bitti commented

I guess it's more a limitation of the CSV "standard" than of q? I don't think there is a common way of telling if a string value is supposed to be null or an actual empty string. But you certainly can convert empty strings to null explicitly (i.e. ifnull(nullif(column, ''), 0))

This subject is indirectly related to csv quoting and to q's type inference.

Since csv fields are many times unquoted, the input can be ambiguous. For example, in an unquoted csv, an empty string value can be either null or an empty string. If the type is a number, then an empty value can be analyzed as a null value without any ambiguity.

q performs type inference, and allows by default working with semi or none-quoted csv files. This perhaps makes reasoning about the expected behavior a bit more complicated.

Making the input file fully quoted can obviously make it easier, but many times it's not practical, since the input file is created by some external system. When this is the case, then it's important to understand the types that q inferred from the actual data at hand (using the -A flag). Note that the type inference is based on the actual data in the input file, so there might be variations in behavior for different data.

In addition, making q fully quote the output (-W all) can help with reasoning about the results, and can produce a more consistent result that can be fed into other tools or back into q itself.

I'd be glad to provide examples to make it clearer if needed.

Regardless, please send me concrete examples where this breaks and prevents proper reasoning and analysis, and i'll try to help if possible.