Merged Cell Headers Inconsistency
noorbuchi opened this issue · 0 comments
Describe the bug
When retrieving data from a Google Sheets region whose header row contains merged cells, the header row comes back with fewer entries than the data rows have columns, so building the DataFrame fails.
To Reproduce
Using the following data:
And the following configuration:
source_id: 1jMbGVHjXs-lQbh5pstplrCOo5f76C_Nj2SOyL-bsZsQ
sheets:
  - name: Sheet1
    regions:
      - name: lab1
        start: A1
        end: E12
        contains_headers: true
This error is produced:
Traceback (most recent call last):
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 982, in _finalize_columns_and_data
    columns = _validate_or_indexify_columns(contents, columns)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 1030, in _validate_or_indexify_columns
    raise AssertionError(
AssertionError: 2 columns passed, passed data had 5 columns

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/main.py", line 52, in sheetshuttle
    my_plugin.run(sheets_keys_file, sheets_config_directory)
  File "/home/noboshe/SheetShuttle/SheetShuttle/../sample_plugin.py", line 8, in run
    my_collector.collect_files()
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/sheet_collector.py", line 135, in collect_files
    sheet_obj.collect_regions()
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/sheet_collector.py", line 204, in collect_regions
    data = Sheet.to_dataframe(region_data)
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/sheet_collector.py", line 270, in to_dataframe
    return pd.DataFrame(data[1:], columns=data[0])
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/frame.py", line 721, in __init__
    arrays, columns, index = nested_data_to_arrays(
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 519, in nested_data_to_arrays
    arrays, columns = to_arrays(data, columns, dtype=dtype)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 883, in to_arrays
    content, columns = _finalize_columns_and_data(arr, columns, dtype)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 985, in _finalize_columns_and_data
    raise ValueError(err) from err
ValueError: 2 columns passed, passed data had 5 columns
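The traceback points at `pd.DataFrame(data[1:], columns=data[0])` in `to_dataframe`, so the failure can probably be reproduced without Google Sheets at all. The sketch below assumes (this is not confirmed by the issue) that the fetched header row is shorter than the data rows because the cells covered by a merged header come back empty and trailing empty cells are dropped. The sample `region_data` and the `to_dataframe_padded` helper are hypothetical and only illustrate one possible workaround, padding the header row with placeholder names. They are not SheetShuttle's actual code.

```python
import pandas as pd

# Hypothetical region data: a 2-entry header row above 5-column data rows,
# mimicking what a merged header might look like once fetched (assumption).
region_data = [
    ["Student", "Scores"],
    ["alice", "9", "8", "10", "7"],
    ["bob", "6", "9", "8", "10"],
]


def to_dataframe_padded(data):
    """Build a DataFrame after padding the header row to the widest row.

    Hypothetical helper, not part of SheetShuttle.
    """
    header, rows = list(data[0]), data[1:]
    width = max([len(header)] + [len(row) for row in rows])
    # Give missing header cells placeholder names so the column count matches.
    header += [f"unnamed_{i}" for i in range(len(header), width)]
    # Pad short data rows with empty strings so every row has `width` entries.
    rows = [list(row) + [""] * (width - len(row)) for row in rows]
    return pd.DataFrame(rows, columns=header)


# The same call shape as in the traceback raises a ValueError on this data.
try:
    pd.DataFrame(region_data[1:], columns=region_data[0])
except ValueError as err:
    print(f"unpadded header fails: {err}")

# With the header padded, the frame is built with five columns.
df = to_dataframe_padded(region_data)
print(df.columns.tolist())
```

Padding keeps every data column but leaves placeholder names for the cells covered by the merge. Whether those columns should instead repeat the merged label, or whether merged headers should be rejected with a clearer error, is a design decision for the project.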