gianlucadetommaso/volatile

pd.concat problem when adding more tickers

Closed this issue · 3 comments

I get this exception after adding more tickers to symbols_list.txt.
Not sure if it is related to missing data or not.

2021-02-15 12:48:43.428739: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1

Downloading all available closing prices in the last year...
[---------------------100%-----------------------]  7115 of 7115 completed
Traceback (most recent call last):
  File "volatile.py", line 296, in <module>
    data = download(args.symbols)
  File "/content/volatile/download.py", line 115, in download
    data = pd.concat(data.values(), keys=data.keys(), axis=1, sort=True)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/reshape/concat.py", line 287, in concat
    return op.get_result()
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/reshape/concat.py", line 503, in get_result
    mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy,
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/internals/concat.py", line 84, in concatenate_block_managers
    return BlockManager(blocks, axes)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py", line 149, in __init__
    self._verify_integrity()
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py", line 329, in _verify_integrity
    raise construction_error(tot_items, block.shape[1:], self.axes)
ValueError: Shape of passed values is (252, 13288), indices imply (251, 13288)

You can reproduce it with this Colab notebook:
https://colab.research.google.com/drive/1S9B-iWQn59y7FD4V2WBv3cHaDzPssxmX?usp=sharing

Thanks!

Thanks for the issue and for making it reproducible. I'll have a closer look ASAP.

@TonyTang1997 Fixed. The issue was that for at least one ticker (e.g. DBTX) the downloaded data is bugged: it contains duplicate dates with different prices and volumes. That makes the date index non-unique, so when pd.concat aligns all tickers the blocks end up with more rows (252) than the combined index implies (251). I now remove potential duplicate indices before concatenating. It should work now.
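
For anyone hitting this before pulling the fix, here is a minimal sketch of the workaround. The helper name and the choice of keeping the first row per date are assumptions for illustration; the actual commit may resolve duplicates differently.

import pandas as pd

def drop_duplicate_dates(df: pd.DataFrame) -> pd.DataFrame:
    # Some tickers (e.g. DBTX) come back with the same date twice but
    # different prices/volumes; keep only the first occurrence so the
    # index is unique before alignment.
    return df[~df.index.duplicated(keep="first")]

# 'data' is the dict of per-ticker DataFrames built in download.py,
# as in the line from the traceback above.
data = {symbol: drop_duplicate_dates(df) for symbol, df in data.items()}
data = pd.concat(data.values(), keys=data.keys(), axis=1, sort=True)

With a unique index per ticker, the block shapes and the combined index agree and the ValueError goes away.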

Thanks a lot. It is working now.