GatorEducator/SheetShuttle

Merged Cell Headers Inconsistency

noorbuchi opened this issue · 0 comments

Describe the bug
When retrieving data from a Google Sheets region whose header row contains merged cells, the header row comes back with fewer values than the data rows have columns, so building the DataFrame fails.

To Reproduce
Using the following data:

[Image: Screenshot from 2022-02-21 23-26-18]

And the following configuration:

source_id: 1jMbGVHjXs-lQbh5pstplrCOo5f76C_Nj2SOyL-bsZsQ
sheets:
    - name: Sheet1
      regions:
      - name: lab1
        start: A1
        end: E12
        contains_headers: true

This error is produced:

Traceback (most recent call last):
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 982, in _finalize_columns_and_data
    columns = _validate_or_indexify_columns(contents, columns)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 1030, in _validate_or_indexify_columns
    raise AssertionError(
AssertionError: 2 columns passed, passed data had 5 columns

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/main.py", line 52, in sheetshuttle
    my_plugin.run(sheets_keys_file, sheets_config_directory)
  File "/home/noboshe/SheetShuttle/SheetShuttle/../sample_plugin.py", line 8, in run
    my_collector.collect_files()
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/sheet_collector.py", line 135, in collect_files
    sheet_obj.collect_regions()
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/sheet_collector.py", line 204, in collect_regions
    data = Sheet.to_dataframe(region_data)
  File "/home/noboshe/SheetShuttle/SheetShuttle/sheetshuttle/sheet_collector.py", line 270, in to_dataframe
    return pd.DataFrame(data[1:], columns=data[0])
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/frame.py", line 721, in __init__
    arrays, columns, index = nested_data_to_arrays(
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 519, in nested_data_to_arrays
    arrays, columns = to_arrays(data, columns, dtype=dtype)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 883, in to_arrays
    content, columns = _finalize_columns_and_data(arr, columns, dtype)
  File "/home/noboshe/.cache/pypoetry/virtualenvs/sheetshuttle-PeNuXww9-py3.9/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 985, in _finalize_columns_and_data
    raise ValueError(err) from err
ValueError: 2 columns passed, passed data had 5 columns
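
Possible workaround (sketch)
The traceback shows `to_dataframe` calling `pd.DataFrame(data[1:], columns=data[0])` with a 2-value header row and 5-column data rows. One possible approach, shown below as a minimal sketch and not as the project's actual fix, is to pad the header row to the width of the widest row before building the DataFrame. The helper name `pad_region`, the `unnamed_N` placeholder names, and the sample rows are all hypothetical; the real contents of the sheet in the screenshot are not reproduced here.

```python
from typing import List, Tuple


def pad_region(
    rows: List[List[str]], placeholder: str = "unnamed"
) -> Tuple[List[str], List[List[str]]]:
    """Split rows into (header, body), padding both to the widest row.

    Hypothetical helper: assumes the first row is the header and that
    merged header cells show up as missing trailing values.
    """
    if not rows:
        return [], []
    width = max(len(row) for row in rows)
    header = list(rows[0])
    # Give a placeholder name to every header position the merge left out.
    header += [f"{placeholder}_{i}" for i in range(len(header), width)]
    # Pad short data rows with empty strings so every row has `width` cells.
    body = [list(row) + [""] * (width - len(row)) for row in rows[1:]]
    return header, body


# Hypothetical region data: a 2-value header row over 5-column data.
region_data = [
    ["Student", "Lab 1"],
    ["alice", "1", "1", "0", "1"],
    ["bob", "1", "0"],
]

header, body = pad_region(region_data)
print(header)
print(body)
```

With the header and body at the same width, `pd.DataFrame(body, columns=header)` should no longer hit the column-count mismatch. Whether missing header cells should get placeholder names or repeat the merged cell's label is a design decision this sketch does not settle.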