matplotlib/mplfinance

Data for column "Open" must be ALL float or int.

TG-GOD opened this issue · 5 comments

TG-GOD commented
!pip install mplfinance
import pandas as pd
import yfinance as yf
import mplfinance as mpf

#get data
symbol = "AAPL"
start_date = "2022-01-01"
end_date = "2022-12-31"
stock_data = yf.download(symbol, start=start_date, end=end_date)

# index
stock_data.index = pd.to_datetime(stock_data.index)

# K_line
mpf.plot(stock_data, type='candle', style='yahoo', title=f'{symbol} ')

I get this error:

[*********************100%***********************]  1 of 1 completed
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-7f9b0b934c33> in <cell line: 17>()
     15 
     16 # K_line
---> 17 mpf.plot(stock_data)

1 frames
/usr/local/lib/python3.10/dist-packages/mplfinance/_arg_validators.py in _check_and_prepare_data(data, config)
     72     for col in cols:
     73         if not all( isinstance(v,(float,int)) for v in data[col] ):
---> 74             raise ValueError('Data for column "'+str(col)+'" must be ALL float or int.')
     75 
     76     if config['tz_localize']:

ValueError: Data for column "Open" must be ALL float or int.

@TG-GOD
The problem is with the "Open" column in your dataframe. The code just before the exception checks whether all of your values are floats or ints (if not all( isinstance(v,(float,int)) for v in data[col] ):), and at least one of them is not.

You should take a look at your dataframe stock_data and see what's in there. Something is wrong with it. Perhaps there is some text in places other than in the header of the dataframe?
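One way to do that inspection is sketched below. The helper name find_non_numeric and the sample data are hypothetical and not part of mplfinance; the sketch simply repeats the same isinstance test per column and collects the values that fail it, so you can see what the offending entries are.

```python
import pandas as pd

def find_non_numeric(df, columns=("Open", "High", "Low", "Close", "Volume")):
    """Return {column: [offending values]} for values that are not float or int.

    Hypothetical helper: it mirrors the isinstance check quoted above,
    but reports the failing values instead of raising.
    """
    problems = {}
    for col in columns:
        if col not in df.columns:
            continue  # e.g. no "Volume" column in this dataframe
        bad = [v for v in df[col] if not isinstance(v, (float, int))]
        if bad:
            problems[col] = bad
    return problems

# Made-up sample data with one stray string in "Open"
sample = pd.DataFrame(
    {
        "Open": [1.0, "n/a", 3.0],
        "High": [2.0, 3.0, 4.0],
        "Low": [0.5, 1.5, 2.5],
        "Close": [1.5, 2.5, 3.5],
    },
    index=pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05"]),
)

print(sample.columns)   # check the header layout
print(sample.dtypes)    # an "object" dtype is a hint that a column is not purely numeric
problems = find_non_numeric(sample)
print(problems)
```

Calling find_non_numeric(stock_data) on the real dataframe, together with printing stock_data.columns, should show whether the problem is a stray value or the layout of the columns themselves.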

Hi,
I came across the same error message

  File "/home/tas/Work/drc/finfin/.venv/lib/python3.10/site-packages/mplfinance/plotting.py", line 417, in plot
    dates,opens,highs,lows,closes,volumes = _check_and_prepare_data(data, config)
  File "/home/tas/Work/drc/finfin/.venv/lib/python3.10/site-packages/mplfinance/_arg_validators.py", line 74, in _check_and_prepare_data
    raise ValueError('Data for column "'+str(col)+'" must be ALL float or int.')

when my yfinance data fails on a simple plot command

Reading the code in "_arg_validators.py", I suggest this code needs revision:

    for col in cols:
        if not all( isinstance(v,(float,int)) for v in data[col] ):
            raise ValueError('Data for column "'+str(col)+'" must be ALL float or int.')

Just above this check, the actual arrays of values are already declared:

    opens = data[o].values
    highs = data[h].values
    lows = data[l].values
    closes = data[c].values
    if v in data.columns:
        volumes = data[v].values
        cols.append(v)
    else:
        volumes = None

So why not check the .values for float or integers?

When running in the JetBrains PyCharm debugger, the data[col] contains a column of date string and then another of float values - should this NOT be passed?

Ticker            AAPL
Date
2024-10-01  229.267771
2024-10-02  225.641756
2024-10-03  224.892566
2024-10-04  227.649533
2024-10-07  224.253275
2024-10-08  224.053495
2024-10-09  224.982474
2024-10-10  227.529674
2024-10-11  229.048004
2024-10-14  228.448658
2024-10-15  233.353261
2024-10-16  231.345474
2024-10-17  233.173459
2024-10-18  235.920426
2024-10-21  234.192340
2024-10-22  233.632963
2024-10-23  233.822752
2024-10-24  229.727257
2024-10-25  229.487523
2024-10-28  233.063595
2024-10-29  232.843827
2024-10-30  232.354358
2024-10-31  229.087951

Thanks
Todd

@darthracing

Todd,

There are a couple of things going on here.

First, please understand that the code checks the columns 'Open', 'High', 'Low', and 'Close' (and 'Volume' if 'Volume' exists), one column at a time, to ensure that all of the values in a given column are either integers or floats. No other types are allowed as 'Open', 'High', 'Low', 'Close', or 'Volume'.

Regarding your first question, "why not check the .values for float or integers?" the answer is that it is the same:

The line of code doing the check is:

if not all( isinstance(v,(float,int)) for v in data[col] )

This is exactly the same as

if not all( isinstance(v,(float,int)) for v in data[col].values )

Regarding your question "the data[col] contains a column of date string and then another of float values - should this NOT be passed?" ... you seem to misunderstand the point of the check and the error message.

If you would include the entire output from the exception, you would see that the exception tells you specifically which column is not all integers and floats and thus you could look into your data and fix the offending data point(s).

I hope the above explanation helps you to find the source of the problem (why you are getting an exception). If not, then if you can include the rest of your code, and the complete output from the thrown exception, then I will gladly help you debug where the issue is in your code.

All the best. --Daniel

Thank you for the response.
My specific issue turned out to be input that was multi-dimensional when it should have been flattened. The fact that there were two subcolumns of data was the source of my problem. Printing the offending value showed "AAPL", so yes, the filter did its job.

Using the yfinance module, I had to add "multi_level_index=False" to correct the example I was debugging,
i.e.:

daily_data = yf.download(ticker, start=start_date, end=end_date, multi_level_index=False)
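For readers who want to see the effect without downloading anything, here is a minimal sketch using hand-made data. The two-level column layout and the level names "Price" and "Ticker" are assumptions meant to resemble the downloaded frame described above; the numbers are made up. It shows why the check reported "AAPL", and one possible way to flatten the columns with plain pandas when the download call cannot be changed.

```python
import pandas as pd

# Assumed stand-in for the downloaded data: columns have two levels,
# here named "Price" and "Ticker" (an assumption for illustration).
columns = pd.MultiIndex.from_tuples(
    [("Open", "AAPL"), ("High", "AAPL"), ("Low", "AAPL"), ("Close", "AAPL")],
    names=["Price", "Ticker"],
)
multi = pd.DataFrame(
    [[229.27, 230.10, 228.50, 229.90],
     [225.64, 227.00, 224.80, 226.30]],
    index=pd.to_datetime(["2024-10-01", "2024-10-02"]),
    columns=columns,
)

# With two column levels, multi["Open"] is a DataFrame, not a Series.
# Iterating a DataFrame yields its column labels, so the value that
# fails the float/int test is the ticker string.
offending = [v for v in multi["Open"] if not isinstance(v, (float, int))]
print(offending)

# Keep only the "Price" level so each price column is a plain Series.
flat = multi.copy()
flat.columns = flat.columns.get_level_values("Price")
passes = all(isinstance(v, (float, int)) for v in flat["Open"])
print(passes)
```

With a single ticker, dropping the ticker level this way loses no information; with several tickers the column names would no longer be unique, so selecting one ticker first would be the safer route.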

@darthracing
Thanks for the update; especially as the information may help others. Much appreciated.