amirziai/flatten

flatten_preserve_lists fails to flatten JSON data that includes an array with more than 10 rows

I have JSON data in which orderItems has more than 10 items. It looks like this:
{
"orderHeader":{
"orderEvent":"STATUS_UPDATE_IN",
"orderNumber":"099432727"
},
"orderItems":[
{
"itemLineNumber":1,
"itemSku":3822,
"itemQuantity":468
},
{
"itemLineNumber":2,
"itemSku":8805,
"itemQuantity":414
},
{
"itemLineNumber":3,
"itemSku":10045,
"itemQuantity":24
},
{
"itemLineNumber":4,
"itemSku":10150,
"itemQuantity":24
},
{
"itemLineNumber":5,
"itemSku":10212,
"itemQuantity":36
},
{
"itemLineNumber":6,
"itemSku":10218,
"itemQuantity":24
},
{
"itemLineNumber":7,
"itemSku":10224,
"itemQuantity":84
},
{
"itemLineNumber":8,
"itemSku":10226,
"itemQuantity":60
},
{
"itemLineNumber":9,
"itemSku":10227,
"itemQuantity":42
},
{
"itemLineNumber":10,
"itemSku":10242,
"itemQuantity":84
},
{
"itemLineNumber":11,
"itemSku":10444,
"itemQuantity":12
},
{
"itemLineNumber":12,
"itemSku":10507,
"itemQuantity":12
},
{
"itemLineNumber":13,
"itemSku":10583,
"itemQuantity":6
},
{
"itemLineNumber":14,
"itemSku":11661,
"itemQuantity":396
},
{
"itemLineNumber":15,
"itemSku":11693,
"itemQuantity":48
},
{
"itemLineNumber":16,
"itemSku":11776,
"itemQuantity":24
},
{
"itemLineNumber":17,
"itemSku":11811,
"itemQuantity":42
},
{
"itemLineNumber":18,
"itemSku":11927,
"itemQuantity":24
},
{
"itemLineNumber":19,
"itemSku":12195,
"itemQuantity":732
},
{
"itemLineNumber":20,
"itemSku":12334,
"itemQuantity":24
}
]
}

rows = flatten_preserve_lists(json_data, separator='.', max_depth=5, max_list_index=100)
len(rows)
Out[13]: 11

After flattening, there should be 20 flattened rows, not 11. I did some troubleshooting and found a bug in this part of your code:

  global_max_record = int(max(list(
      list_prebuilt_flattened_dict.keys())))

global_max_record is always 9 once list_prebuilt_flattened_dict.keys() reaches '10', so you cannot generate more than 11 rows.
The keys are the strings '0','1','2','3','4','5','6','7','8','9','10'. max() always returns '9' for these keys because it compares them as strings and picks the lexicographically largest value, not the numerically largest one.
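
A minimal sketch of the string comparison involved. The `keys` list here is a hypothetical stand-in for list_prebuilt_flattened_dict.keys(), not the library's actual data:

```python
# Hypothetical stand-in for the dict keys: string indices '0'..'10'
keys = [str(i) for i in range(11)]

# Strings are compared character by character, so '9' beats '10'
# because '9' > '1' at the first character.
print('9' > '10')    # True

# Sorting shows where '10' lands in lexicographic order
print(sorted(keys))  # ['0', '1', '10', '2', '3', '4', '5', '6', '7', '8', '9']

# max() therefore returns '9', not '10'
print(max(keys))     # 9
```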

I changed it to:
  global_max_record = int(max(list(
      list_prebuilt_flattened_dict.keys()), key=int))
With that change it works.

rows = flatten_preserve_lists(json_data, separator='.', max_depth=5, max_list_index=100)
len(rows)
Out[16]: 20
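
The before-and-after can also be checked in isolation. This is only a sketch: the dict below is a hypothetical stand-in with 20 string keys, not the structure the library actually builds.

```python
# Hypothetical stand-in for the internal dict: string keys '0'..'19'
list_prebuilt_flattened_dict = {str(i): [] for i in range(20)}

# Original expression: string comparison picks '9'
before = int(max(list(
    list_prebuilt_flattened_dict.keys())))

# Patched expression: key=int compares the keys by their numeric value
after = int(max(list(
    list_prebuilt_flattened_dict.keys()), key=int))

print(before, after)  # 9 19
```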

Thanks for reporting. Feel free to create a PR.