bryanyang0528/ksql-python

SELECT Query splits single result into three

Closed this issue · 4 comments

I'm struggling to get a relatively simple example working. My SELECT query uses LIMIT 1, i.e. it returns just one result (I also send just one single test message to the Kafka topic). But the client splits that result into three separate items in the returned object.

query = client.query('SELECT * FROM CREDITCARDFRAUD_PREPROCESSED_AVRO LIMIT 1')

#print(len(list(query))) #=> shows 3

print(list(query))

The latter print returns:

['\n\n{"row":{"columns":[1547195011919,null,0,-1.3598071336738,-0.0727811733098497,2.53634673796914,1.37815522427443,-0.338320769942518,0.46', '2387777762292,0.239598554061257,0.0986979012610507,0.363786969611213,0.0907941719789316,-0.551599533260813,-0.617800855762348,-0.991389847235408,-0.311169353699879,1.46817697209427,-0.470400525259478,0.207971241929242,0.0257905801985591,0.403992960255733,0.251412098239705', ',-0.018306777944153,0.277837575558899,-0.110473910188767,0.0669280749146731,0.128539358273528,-0.189114843888824,0.133558376740387,-0.0210530534538215,149.62,"0"]},"errorMessage":null,"finalMessage":null}\n{"row":null,"errorMessage":null,"finalMessage":"Limit Reached"}\n']

Note the gaps between ,0.46', '2387777762292, and between 0.251412098239705', ',-0.018306777944153. These are not spaces inside one string; they are the boundaries between the three list items (you can see it in the attached screenshot of my Jupyter notebook).

image

As you can also see, we tried to parse this to get back one result instead of three (as a quick hotfix), but we have not got it working yet.

FYI, in the screenshot you see a workaround to merge the separated information (which actually should be just one record) back into one single message.

image

this is working for me:

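import json

table = client.query('select * from table')

# Join the chunks, then turn the newline-delimited objects into one JSON array
table = ''.join(table)
table = table.replace('\n', '')
table = table.replace('}{', '},{')
table = '[' + table + ']'

# Parse the string; iterating over it directly would yield single characters
for row in json.loads(table):
    ...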

Seeing the same too, e.g.

print 'users' limit 5;

{"ROWTIME":1572428912141,"ROWKEY":"User_5","registertime":1489285984615,"userid":"User_5","regionid":"Region_6","gender":"
MALE"}
{"ROWTIME":1572428912216,"ROWKEY":"User_2","registertime":1512403158571,"userid":"User_2","regionid":"Region_7","gender":"OTHER"}
{"ROWTIME":1572428912306,"ROWKEY":"User_2","registertime":1511619652605,"userid":"User_2","regionid":"Region_9","gender":"OTHER"}

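The main problem with joining the iterator is that ''.join() has to consume the whole iterator before it returns, so you basically stop getting live results: nothing comes back until the query finishes, and a query without a LIMIT never finishes.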
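One way to keep results live is to reassemble records incrementally instead of joining everything first. Below is a minimal sketch, not part of ksql-python: `iter_rows` is a hypothetical helper name, and it assumes the chunks are strings carrying newline-delimited JSON, as in the output shown earlier in this thread. Other server response formats may need different handling.

```python
import json
from typing import Iterable, Iterator


def iter_rows(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per complete line, as soon as it arrives.

    Hypothetical helper: assumes each record is a single JSON object
    terminated by a newline, and that chunk boundaries can fall anywhere.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is a run of complete lines;
        # the remainder is an unfinished record kept for the next chunk.
        *lines, buffer = buffer.split("\n")
        for line in lines:
            if line.strip():  # skip blank lines
                yield json.loads(line)
    if buffer.strip():  # a final record without a trailing newline
        yield json.loads(buffer)
```

Usage would look like `for row in iter_rows(client.query('SELECT ...')): ...`. Because the helper is a generator, each row is handed over as soon as its closing newline has been received, rather than after the whole stream ends.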

With #60, I'm not seeing this issue.