RedisGraph/redisgraph-bulk-loader

Strings forcibly cast to int/float when inserting ARRAY type

benmirtchouk opened this issue · 1 comment

Hello,

I have run into an issue when bulk-inserting edges into our graph: inside an ARRAY property, any strings that "look like" ints or floats are automatically cast to numbers. As far as I can see, there is no way to avoid this conversion, which is incompatible with our desired output.

For example, the following CSV (edges.csv):

":START_ID(Global)"`":END_ID(Global)"`"arr_prop:ARRAY"`"comment:STRING"
"1"`"2"`"['0.6.7.8', '0.1', '0.2', '9']"`"test edge"

Loaded via

redisgraph-bulk-insert \
    --unix-socket-path /test_socket.sock \
    --enforce-schema \
    --separator \` \
    --nodes-with-label LHS lhs_nodes.csv \
    --nodes-with-label RHS rhs_nodes.csv \
    --relations-with-type E edges.csv -- test_graph

This results in e.arr_prop for this edge returning ['0.6.7.8', 0.1, 0.2, 9]. The last three elements are float, float, and int rather than strings. No extra quoting or escaping in the CSV seems to avoid this.

In our case, these strings are period-separated lists of integers, so we need them to remain strings regardless of how many tokens each one contains (i.e. we would like the output to be ['0.6.7.8', '0.1', '0.2', '9'] instead). Perhaps an additional type such as ARRAY_STRING could be added to support this use case?
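
To make the request concrete, here is a minimal Python sketch of the behavior I am seeing and the behavior I would like. It is not the loader's actual code: the function names are hypothetical, and the per-element "try int, then float, else keep the string" inference is only my assumption about what produces the observed output. `parse_array_of_strings` shows what a proposed ARRAY_STRING type (which does not exist today) might do.

```python
import ast


def infer_scalar(token):
    """Guess a type for one token: try int, then float, else keep the string.

    Assumption: this mimics the kind of inference that would explain the
    observed output. It is not taken from the bulk loader's source.
    """
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            continue
    return token


def parse_array_inferred(text):
    """Parse an array literal, then infer a type for every string element."""
    elements = ast.literal_eval(text)
    return [infer_scalar(e) if isinstance(e, str) else e for e in elements]


def parse_array_of_strings(text):
    """Hypothetical ARRAY_STRING handling: every element stays a string."""
    elements = ast.literal_eval(text)
    return [str(e) for e in elements]


raw = "['0.6.7.8', '0.1', '0.2', '9']"
print(parse_array_inferred(raw))    # ['0.6.7.8', 0.1, 0.2, 9]
print(parse_array_of_strings(raw))  # ['0.6.7.8', '0.1', '0.2', '9']
```

Only '0.6.7.8' survives the inference step because it fails both the int and float casts, which is why the result depends on how many tokens a string happens to contain.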

Thanks in advance!

ARRAY_STRING, ARRAY_INT, etc.
Makes sense.