Unsupported array element type: __array__
sbarkar opened this issue · 3 comments
Hello,
First of all, thank you very much for creating this. Looks like it's saved me heaps of time already.
I have, however, received an error which seems to be related to the format of the JSON file, specifically to the file having an array as one of its nested elements.
Example error message:
INFO:root:Problem on line 4: Unsupported array element type: __array__
This repeats for almost all rows of the file.
Row 4 of the file looks like this:
{"op":"mcm","clk":"1304450546","pt":1585613976590,"mc":[{"id":"1.170258437","rc":[{"batl":[[0,2.66,2.53],[1,1000,2.2]],"ltp":0.0,"tv":0.0,"id":110503}]}]}
Questions:
- Is this a limitation of the generator or limitation of BQ in general?
- If it is a limitation of the generator - do you have any ideas on how it can be fixed? I'm willing to contribute with a bit of guidance.
- If it is a limitation of BQ in general - what do you think could be the workaround here? To normalise per array element?
It looks like you have an array of arrays in the batl field:
$ jq < issue_69.data.json
{
  "op": "mcm",
  "clk": "1304450546",
  "pt": 1585613976590,
  "mc": [
    {
      "id": "1.170258437",
      "rc": [
        {
          "batl": [
            [
              0,
              2.66,
              2.53
            ],
            [
              1,
              1000,
              2.2
            ]
          ],
          "ltp": 0,
          "tv": 0,
          "id": 110503
        }
      ]
    }
  ]
}
I guess generate-schema does not support that, which probably makes sense because I've never had to use a schema like that.
The question is, does BigQuery support it? I don't know. I recommend trying to import this using bq load --autodetect. If you find that BQ supports it, and would like to add that feature to bigquery-schema-generator, I'm open to a PR. Just make sure you include the appropriate unit test cases.
Ok. Got an update here. Unfortunately, BigQuery does not support an array nested directly within another array, so consider this issue closed. Thank you for the prompt responses.
Cool, thanks for the follow-up. From my fuzzy recollection, I think "array of array" is not supported because protobufs don't support that, and everything in BQ is stored as protobufs.
If you can transform your batl field into an "array of record containing an array", I think BQ will support that.
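For anyone landing here later, the transformation suggested above could look roughly like the sketch below. It is only an illustration, not something that ships with bigquery-schema-generator: the helper name wrap_nested_arrays and the record key "values" are made up for this example, and I have not verified the result against bq load. It walks one parsed JSON row and wraps every list that sits directly inside another list in a single-key record.

```python
import json


def wrap_nested_arrays(value, key="values"):
    """Recursively turn "array of array" into "array of record containing an array".

    The function name and the default key "values" are hypothetical choices
    made for this sketch; pick whatever field name suits your schema.
    """
    if isinstance(value, dict):
        # Recurse into every field of a record.
        return {k: wrap_nested_arrays(v, key) for k, v in value.items()}
    if isinstance(value, list):
        wrapped = []
        for item in value:
            if isinstance(item, list):
                # A list directly inside a list: wrap it in a record.
                wrapped.append({key: wrap_nested_arrays(item, key)})
            else:
                wrapped.append(wrap_nested_arrays(item, key))
        return wrapped
    # Scalars (numbers, strings, booleans, null) pass through unchanged.
    return value


# Row 4 from the original report.
row = json.loads(
    '{"op":"mcm","clk":"1304450546","pt":1585613976590,'
    '"mc":[{"id":"1.170258437","rc":[{"batl":[[0,2.66,2.53],[1,1000,2.2]],'
    '"ltp":0.0,"tv":0.0,"id":110503}]}]}'
)
converted = wrap_nested_arrays(row)
print(json.dumps(converted))
```

With this, each inner batl list becomes a record such as {"values": [0, 2.66, 2.53]}, while the rest of the row is left as it was. You would run it over every line of the newline-delimited file before generating the schema and loading the data.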