backtrader2/backtrader

[tradingcal] [bugfix] Index.searchsorted() method accepts only pandas timestamps in v1.0 of the library.

vladisld opened this issue · 6 comments

Description

The PandasMarketCalendar class caches the trading-date schedule in a pandas DataFrame. Unfortunately, when pandas v1.0 is installed, the schedule method searches this cache with the wrong data type.

def schedule(self, day, tz=None):
        '''
        Returns the opening and closing times for the given ``day``. If the
        method is called, the assumption is that ``day`` is an actual trading
        day

        The return value is a tuple with 2 components: opentime, closetime
        '''
        while True:
            #### The problem is here #####
            i = self.idcache.index.searchsorted(day.date())

            if i == len(self.idcache):
                # keep a cache of 1 year to speed up searching
                self.idcache = self._calendar.schedule(day, day + self.csize)
                continue

            st = (x.tz_localize(None) for x in self.idcache.iloc[i, 0:2])
            opening, closing = st  # Get utc naive times
            if day > closing:  # passed time is over the sessionend
                day += ONEDAY  # wrap over to next day
                continue

            return opening.to_pydatetime(), closing.to_pydatetime()

Unexpected behavior

I don't remember exactly whether an exception is raised or the search just returns an empty result.
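A minimal sketch for checking which of the two outcomes an installed pandas version produces. The index below is a hypothetical stand-in for self.idcache.index, not the real cache, and the result may differ between pandas versions, so no expected output is given.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for self.idcache.index (a DatetimeIndex of sessions).
index = pd.date_range("2020-01-01", periods=5)
day = datetime.datetime(2020, 1, 3, 10, 0)

try:
    # Same call shape as in schedule(): a plain datetime.date is passed in.
    outcome = index.searchsorted(day.date())
except TypeError as exc:
    # Keep the exception so it can be inspected instead of aborting the script.
    outcome = exc

print(type(outcome).__name__, outcome)
```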

Mkima commented

I think you're right. day.date() needs to be converted to a pandas datetime. The pandas library gets imported in the __init__ of PandasMarketCalendar. I set an attribute to the library like this at line 236:

import pandas as pd  # guaranteed because of pandas_market_calendars
self.pandas = pd

And then use this to convert the day.date() value:

i = self.idcache.index.searchsorted(self.pandas.to_datetime(day.date()))

Seems to fix the error but I haven't checked if it's working yet. Does this method make sense or is there a better way to convert the date?

Hi, I've verified that this one works for me as well. Who can help commit this fix?

I'm using the following patch in my fork - seems to be working for my system.

i = self.idcache.index.searchsorted(pd.Timestamp(day.date()))

Anyway, an appropriate test should be provided with any PR.
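As a rough idea of what such a test could look like, here is a sketch that exercises only the lookup, without backtrader or pandas_market_calendars. The find_session function is a hypothetical stand-in for the cache search inside schedule(), and the business-day index is an assumed substitute for the real calendar cache (it ignores holidays).

```python
import datetime

import pandas as pd


def find_session(index, day):
    """Hypothetical stand-in for the cache lookup inside schedule()."""
    return index.searchsorted(pd.Timestamp(day.date()))


def test_lookup_accepts_datetime():
    # Business days only, so the weekend of 4-5 Jan 2020 is absent.
    index = pd.bdate_range("2020-01-01", "2020-01-10")

    # A day present in the index maps to its own position.
    assert find_session(index, datetime.datetime(2020, 1, 3, 15, 0)) == 2
    # A Saturday maps to the position of the next session (Monday 6 Jan).
    assert find_session(index, datetime.datetime(2020, 1, 4, 9, 0)) == 3
    # A day past the cached range returns len(index), the refill signal.
    assert find_session(index, datetime.datetime(2020, 2, 1)) == len(index)

    # Both conversions suggested in this thread give the same Timestamp.
    d = datetime.date(2020, 1, 3)
    assert pd.to_datetime(d) == pd.Timestamp(d)


test_lookup_accepts_datetime()
```

A real regression test would presumably call PandasMarketCalendar.schedule() itself; this sketch only pins down the conversion step.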

It seems that timezones are not handled in the schedule method: it ignores the tz parameter, expects a tz-naive date, and returns times in UTC.

The internal _calendar, though, does not seem to have any issues with dates or timezones. The code below works, but a public API would of course be preferred:

schedule = nyse_calendar._calendar.schedule(market_day, market_day, eastern_tz)
market_open = schedule['market_open'][0].time()
market_close = schedule['market_close'][0].time()
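An alternative that stays on the public side is to convert the naive UTC times that schedule returns into local time afterwards. This is only a sketch: utc_naive_to_local is a hypothetical helper, and the sample timestamp is made up rather than taken from a real calendar.

```python
import pandas as pd


def utc_naive_to_local(ts, tz):
    """Hypothetical helper: treat a naive timestamp as UTC, convert it to tz."""
    return pd.Timestamp(ts).tz_localize("UTC").tz_convert(tz)


# 14:30 UTC on a winter day is 09:30 in New York (UTC-5).
opening_utc = pd.Timestamp("2020-01-02 14:30")
opening_local = utc_naive_to_local(opening_utc, "America/New_York")
print(opening_local.time())  # 09:30:00
```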

I have started to use the trading calendar with pandas_trading_calendar, and the proposed solution fixed the issue for me.
Thanks

Line 251, in the _nextday method in tradingcal.py, throws this error:

TypeError: value should be a 'Timestamp', 'NaT', or array of those. Got 'date' instead.

A quick fix that worked is converting the day to a pandas Timestamp:

# original
i = self.dcache.searchsorted(day)

# fix
i = self.dcache.searchsorted(pd.Timestamp(day))

In case your calendar is running on a different timezone, you're likely to encounter something like this as well:

TypeError: Cannot compare tz-naive and tz-aware datetime-like objects

Here's a workaround for it.

i = self.dcache.searchsorted(pd.Timestamp(day).tz_localize(self.dcache.tz))
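Note that tz_localize raises a TypeError if the value is already tz-aware. A sketch of a conversion that covers naive and aware inputs for both kinds of index is shown below. The helper name to_index_timestamp is hypothetical, and treating aware values as UTC when the index is naive is an assumption based on the UTC-naive times mentioned earlier in this thread.

```python
import datetime

import pandas as pd


def to_index_timestamp(day, index):
    """Hypothetical helper: coerce day to a Timestamp comparable with index."""
    ts = pd.Timestamp(day)
    if index.tz is None:
        # Naive index: keep naive values; express aware ones in UTC, drop tz.
        return ts if ts.tzinfo is None else ts.tz_convert(None)
    # Aware index: attach the index timezone to naive values, convert aware ones.
    if ts.tzinfo is None:
        return ts.tz_localize(index.tz)
    return ts.tz_convert(index.tz)


naive_index = pd.date_range("2020-01-01", periods=5)
aware_index = pd.date_range("2020-01-01", periods=5, tz="UTC")
day = datetime.date(2020, 1, 3)

pos_naive = naive_index.searchsorted(to_index_timestamp(day, naive_index))
pos_aware = aware_index.searchsorted(to_index_timestamp(day, aware_index))
print(pos_naive, pos_aware)  # 2 2
```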

Note: I've imported pandas in the module-level imports.

The above fixes work well for me.
Do give it a check