Adding ConciseDateFormatter displays dates from 1970
BennyThadikaran opened this issue · 10 comments
I am trying to add ConciseDateFormatter to an mplfinance chart. The code below runs, but the dates display incorrectly: the axis shows dates from 1970.
I tried the same thing in plain matplotlib and the dates display correctly (see the commented-out code).
I'm not sure how to get this working.
Python 3.10.12 (Linux Mint 21.2)
matplotlib 3.7.2
mplfinance 0.12.10b0
import matplotlib.pyplot as plt
from mplfinance import plot, show
from matplotlib.dates import AutoDateLocator, ConciseDateFormatter
from pandas import DataFrame, to_datetime
from json import loads
data = loads('{"Open":{"2023-07-24T00:00:00.000":1678.5,"2023-07-25T00:00:00.000":1684.65,"2023-07-26T00:00:00.000":1699.6,"2023-07-27T00:00:00.000":1699.9,"2023-07-28T00:00:00.000":1661.5,"2023-07-31T00:00:00.000":1650.05,"2023-08-01T00:00:00.000":1654.45,"2023-08-02T00:00:00.000":1642.0,"2023-08-03T00:00:00.000":1640.0,"2023-08-04T00:00:00.000":1635.15,"2023-08-07T00:00:00.000":1663.1,"2023-08-08T00:00:00.000":1651.7,"2023-08-09T00:00:00.000":1653.0},"High":{"2023-07-24T00:00:00.000":1684.65,"2023-07-25T00:00:00.000":1699.0,"2023-07-26T00:00:00.000":1699.6,"2023-07-27T00:00:00.000":1703.0,"2023-07-28T00:00:00.000":1668.9,"2023-07-31T00:00:00.000":1656.8,"2023-08-01T00:00:00.000":1667.45,"2023-08-02T00:00:00.000":1651.5,"2023-08-03T00:00:00.000":1651.35,"2023-08-04T00:00:00.000":1656.5,"2023-08-07T00:00:00.000":1663.1,"2023-08-08T00:00:00.000":1655.6,"2023-08-09T00:00:00.000":1654.5},"Low":{"2023-07-24T00:00:00.000":1670.1,"2023-07-25T00:00:00.000":1678.4,"2023-07-26T00:00:00.000":1688.0,"2023-07-27T00:00:00.000":1667.45,"2023-07-28T00:00:00.000":1641.1,"2023-07-31T00:00:00.000":1638.7,"2023-08-01T00:00:00.000":1650.0,"2023-08-02T00:00:00.000":1633.15,"2023-08-03T00:00:00.000":1623.0,"2023-08-04T00:00:00.000":1629.25,"2023-08-07T00:00:00.000":1647.55,"2023-08-08T00:00:00.000":1642.05,"2023-08-09T00:00:00.000":1631.1},"Close":{"2023-07-24T00:00:00.000":1678.4,"2023-07-25T00:00:00.000":1696.6,"2023-07-26T00:00:00.000":1690.7,"2023-07-27T00:00:00.000":1673.15,"2023-07-28T00:00:00.000":1643.5,"2023-07-31T00:00:00.000":1651.2,"2023-08-01T00:00:00.000":1662.25,"2023-08-02T00:00:00.000":1640.5,"2023-08-03T00:00:00.000":1628.65,"2023-08-04T00:00:00.000":1652.2,"2023-08-07T00:00:00.000":1651.25,"2023-08-08T00:00:00.000":1649.9,"2023-08-09T00:00:00.000":1650.5},"Volume":{"2023-07-24T00:00:00.000":16089722.0,"2023-07-25T00:00:00.000":27996298.0,"2023-07-26T00:00:00.000":12397179.0,"2023-07-27T00:00:00.000":29870651.0,"2023-07-28T00:00:00.000":20507842.0,"2023-07-31T00:00:00.000":17282503.0,"2023-08-01T00:00:00.000":17697094.0,"2023-08-02T00:00:00.000":14058161.0,"2023-08-03T00:00:00.000":28836973.0,"2023-08-04T00:00:00.000":18694152.0,"2023-08-07T00:00:00.000":14150459.0,"2023-08-08T00:00:00.000":21886914.0,"2023-08-09T00:00:00.000":16680618.0}}')
df = DataFrame(data)
df.index.name = 'Date'
df.index = to_datetime(df.index)
locator = AutoDateLocator(minticks=12, maxticks=30)
formatter = ConciseDateFormatter(locator)
# Code for Matplotlib
# ax = plt.subplot()
# ax.xaxis.set_major_locator(locator)
# ax.xaxis.set_major_formatter(formatter)
# plt.plot(df.index, df['Close'])
# plt.show()
# Code for Mplfinance
fig, ax = plot(df, type='candle', style='tradingview',
               figscale=2, returnfig=True)
ax[1].xaxis.set_major_locator(locator)
ax[1].xaxis.set_major_formatter(formatter)
show()
Try setting show_nontrading=True when calling mpf.plot().
See Mplfinance Time Axis Concerns for more information.
If you need to leave show_nontrading=False (the default value when unspecified), it is likely that you can accomplish the date format you want even without ConciseDateFormatter. Try using the mpf.plot() kwarg datetime_format=. You can set this kwarg to any valid strftime()-style format string. This may be a simple way to accomplish what you are trying to do, and you can do it without needing returnfig=True or interacting with the Axes object(s).
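Since datetime_format accepts a strftime()-style string, candidate formats can be previewed with the standard library before passing one to mpf.plot(). A small sketch; the sample date and the format strings are arbitrary choices for illustration:

```python
from datetime import datetime

# Arbitrary sample date, taken from the first row of the data in this issue.
sample = datetime(2023, 7, 24)

# Candidate values for mpf.plot(..., datetime_format=<one of these>)
candidates = ["%d %b", "%Y-%m-%d", "%b %d, %Y"]

for fmt in candidates:
    # Show how each format string would render the sample date.
    print(f"{fmt!r:14} -> {sample.strftime(fmt)}")
```

A chosen string would then be passed as, for example, mpf.plot(df, type='candle', datetime_format='%d %b').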
If you need show_nontrading=False and you still insist on having the ConciseDateFormatter, then you may have to write your own date formatter that first translates from row numbers to actual matplotlib dates, and then uses those dates for the ConciseDateFormatter. If you need guidance on this, let me know.
show_nontrading is set to False by default. I tried explicitly setting it but it doesn't make any difference. I am already using datetime_format in my repo code; it's not the same as the ConciseDateFormatter.
But your explanation and the link you provided helped me understand the problem. The AutoDateLocator is taking the row numbers and treating them as Unix timestamps, which explains the dates from 1970.
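To make the 1970 dates concrete: with show_nontrading=False the x-axis values are row numbers (0, 1, 2, ...), and matplotlib's date machinery reads a plain number as days since its epoch, which defaults to 1970-01-01 in matplotlib 3.3 and later. The sketch below imitates that interpretation using only the standard library. The helper name is hypothetical, and this is not code from matplotlib or mplfinance.

```python
from datetime import datetime, timedelta

# Assumption: matplotlib's default epoch (1970-01-01, the default since matplotlib 3.3).
EPOCH = datetime(1970, 1, 1)


def row_number_as_date(row):
    """Interpret a row number the way a date formatter would: as days since the epoch."""
    return EPOCH + timedelta(days=row)


# The 13 candles in the example occupy rows 0..12, so every tick lands in January 1970.
for row in (0, 5, 12):
    print(row, "->", row_number_as_date(row).date())
```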
So do I implement my own AutoDateLocator? Looking at the source code, I would have to inherit from the DateLocator class. Am I headed in the right direction?
Thank you for taking the time to answer my question. 😄
Setting show_nontrading=True should definitely make a difference.
Try also (in addition to show_nontrading=True) using ax[0], or both ax[0] and ax[1], when setting the locator and formatter:
# try this first:
ax[0].xaxis.set_major_locator(locator)
ax[0].xaxis.set_major_formatter(formatter)
or
# or maybe this:
ax[0].xaxis.set_major_locator(locator)
ax[0].xaxis.set_major_formatter(formatter)
ax[1].xaxis.set_major_locator(locator)
ax[1].xaxis.set_major_formatter(formatter)
Regarding the datetime_format= kwarg: if you are calling .set_major_formatter(), then datetime_format may be ignored. I'm not completely sure; I would have to check the code.
When I have a little more time (perhaps later today) I may try playing with the code myself; and can also look into the details of writing your own locator and/or formatter and get back to you on that.
show_nontrading=True works in both of those conditions you mentioned. But it makes the chart rather ugly 😄
You can see the chart image from my project. I have set a datetime_format and rotation on the date. While I'm satisfied with the output, the ConciseDateFormatter would maximize the real estate on the chart.
I've been going through the source code. The crux of AutoDateLocator is the dunder call method (__call__) and the get_locator method. I can just play with the outputs. Once I learn enough, I should be able to implement a class to make this work. This isn't urgent, just an aesthetic change.
If it's OK with you, I can close this issue for the time being and post a solution once I have one.
The following changes to your code should work.
These changes utilize the fact that the x-axis is row numbers under the hood, and we translate those row numbers to datetimes before passing them to the ConciseDateFormatter.
In summary:
- AutoDateLocator is no longer needed. Use MaxNLocator instead.
- import date2num (to convert python datetimes to matplotlib datetimes)
- import MaxNLocator
- use MaxNLocator (instead of AutoDateLocator)
- Define your own formatter class, derived from ConciseDateFormatter. This new formatter takes the list of datetimes upon construction, to be used later to convert row numbers to datetimes.
- Usurp the format_ticks() method of ConciseDateFormatter to first convert the row numbers to matplotlib dates, and then pass them to the parent ConciseDateFormatter.format_ticks() method.
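The code from this comment did not survive in the thread as captured here, so the following is only a minimal sketch of the steps listed above, not the original implementation. The class name RowNumberConciseFormatter, the handling of out-of-range ticks, and the nbins value are assumptions made for illustration.

```python
from datetime import datetime, timedelta

from matplotlib.dates import ConciseDateFormatter, date2num
from matplotlib.ticker import MaxNLocator


class RowNumberConciseFormatter(ConciseDateFormatter):
    """ConciseDateFormatter for an axis whose x values are row numbers.

    Hypothetical class; `dates` is the sequence of datetimes behind the rows.
    """

    def __init__(self, dates, locator, **kwargs):
        super().__init__(locator, **kwargs)
        # Convert once to matplotlib date numbers (days since the epoch).
        self._row_dates = date2num(list(dates))

    def format_ticks(self, values):
        n = len(self._row_dates)
        rows = [int(round(v)) for v in values]
        in_range = [r for r in rows if 0 <= r < n]
        if not in_range:
            return [""] * len(rows)
        # Hand the parent all real dates at once so it can choose concise labels.
        labels = iter(super().format_ticks([self._row_dates[r] for r in in_range]))
        # Ticks outside the data (the locator may place some) get an empty label.
        return [next(labels) if 0 <= r < n else "" for r in rows]


# Example: weekday dates standing in for df.index (same span as the issue's data).
start = datetime(2023, 7, 24)
dates = [start + timedelta(days=i) for i in range(17)
         if (start + timedelta(days=i)).weekday() < 5]

locator = MaxNLocator(nbins=8, integer=True)  # nbins is an arbitrary choice
formatter = RowNumberConciseFormatter(dates, locator)
labels = formatter.format_ticks([0, 5, 10, 99])
print(labels, formatter.get_offset())
```

On an mplfinance chart created with returnfig=True, the same objects would presumably be attached with ax.xaxis.set_major_locator(locator) and ax.xaxis.set_major_formatter(formatter), with df.index passed as dates and show_nontrading left at its default.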
Hope that helps.
Hi Daniel,
I tried your code and it's working perfectly. I was hoping to solve this over the next few weeks, but you spent your precious time to solve this for me.
Thank you so much for your time. I will study the code and implement it.
@BennyThadikaran
Benny,
You're welcome. It was actually a lot of fun to figure out how to do this. Took me about an hour and a half of experimenting with the code; was totally worth it. I learned some neat stuff. For example, most formatters work via the __call__ method; but ConciseDateFormatter works via the format_ticks method (which I only discovered after about 45 minutes of playing with the code). In retrospect it makes sense: most formatters format each tick independently of all other ticks, but ConciseDateFormatter needs to be aware of all the ticks at the same time, because the formatting of some ticks depends in part on the formatting of others.
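The difference between the two entry points can be seen by formatting the same ticks both ways. This is a small sketch with arbitrarily chosen dates that span a month boundary. It calls the formatter directly rather than through a plot, which is assumed to be representative of what happens on an Axes.

```python
from datetime import datetime

from matplotlib.dates import AutoDateLocator, ConciseDateFormatter, date2num

formatter = ConciseDateFormatter(AutoDateLocator())

# Three consecutive days crossing from July into August.
ticks = date2num([datetime(2023, 7, 31), datetime(2023, 8, 1), datetime(2023, 8, 2)])

# One tick at a time: each call sees a single value and none of its neighbours.
per_tick = [formatter(t) for t in ticks]

# All ticks at once: labels are chosen relative to each other.
batch = formatter.format_ticks(ticks)

print("per tick:", per_tick)
print("batch:   ", batch)
```

This is why overriding __call__ alone is not enough for a ConciseDateFormatter subclass: the concise labels are only produced when format_ticks receives the whole set of tick values.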
I'm glad it helped. --Daniel
@DanielGoldfarb
Hi Daniel,
Just wanted to post an update about my final solution. I tried playing with various configurations of the MaxNLocator but wasn't quite satisfied with the end result. I ended up rolling my own custom class, DateTickFormatter, using the FixedLocator and FixedFormatter. It has a single public method, getLabels, which returns a tuple with the initialized locator and formatter. I picked up some inspiration from the AutoDateLocator to work it out.
It is not an efficient solution and currently works with daily and weekly timeframes. I only plot about 140 to 200 candles, so any performance issues are barely noticeable. I might use rrule in the future to avoid looping the entire length of the dates.
To use it:
locator, formatter = DateTickFormatter(df.index, tf='weekly').getLabels()

for ax in axs:
    ax.xaxis.set_major_locator(locator)
    ax.xaxis.set_major_formatter(formatter)

show()
I have added the complete DateTickFormatter code at the bottom, in case it helps others.
That said, I still think mplfinance should work well with the ConciseDateFormatter. I suspect that somewhere in the code, mplfinance is calling date2num on matplotlib date values while passing the dates to the locator classes. I tried backtracking to find the source of the problem, but had to give up after some time. 😸 I'll probably keep trying till I find the issue.
I want to thank you again for the time you spent answering these questions. I learned a lot, and feel more confident delving into matplotlib source code.
DateTickFormatter.py
from matplotlib.ticker import FixedFormatter, FixedLocator


class DateTickFormatter:

    def __init__(self, dates, tf='daily'):
        '''Dates: DatetimeIndex
        tf: daily or weekly'''
        self.dates = dates
        self.len = len(dates)
        self.month = self.year = None
        self.idx = 0
        self.intervals = (2, 4, 7, 14)
        self.tf = tf

    def _formatDate(self, dt):
        '''Returns the formatted date label for the ticker.'''
        if dt.month != self.month:
            self.month = dt.month

            if dt.year != self.year:
                self.year = dt.year
                return f'{dt:%d\n%Y}'

            return f'{dt:%d\n%b}'.upper()

        return dt.day

    def _getInterval(self):
        '''Returns an integer interval at which the ticks will be labelled.'''
        idx = 0

        while True:
            if idx == len(self.intervals) - 1:
                return self.intervals[idx]

            d = self.len / self.intervals[idx]

            if d <= max(self.intervals):
                return self.intervals[idx]
            else:
                idx += 1

    def getLabels(self):
        '''Returns an instance of FixedLocator and FixedFormatter in a tuple.
        Ticker format based on number of candles in Data.
        '''
        if self.year is None:
            self.year = self.dates[0].year
            self.month = self.dates[0].month

        if self.len <= 22:
            return self._daily()

        if self.len < 200:
            return self._atInterval(self._getInterval())

        return self._monthly()

    def _daily(self):
        '''Labels ticks on every candle'''
        labels = []

        for dt in self.dates:
            if self.tf == 'daily' and dt.weekday() > 4:
                continue

            labels.append(self._formatDate(dt))

        return (FixedLocator(tuple(range(self.len))), FixedFormatter(labels))

    def _monthly(self):
        '''Labels ticks on 1st Candle of every month and year'''
        labels = []
        ticks = []

        for i, dt in enumerate(self.dates):
            if dt.month != self.month:
                self.month = dt.month

                if dt.year != self.year:
                    self.year = dt.year
                    labels.append(dt.year)
                else:
                    labels.append(f'{dt:%b}'.upper())

                ticks.append(i)
            elif i == 0:
                labels.append(f'{dt:%b\n%Y}'.upper())
                ticks.append(i)

        return (FixedLocator(ticks), FixedFormatter(labels))

    def _atInterval(self, interval):
        '''Labels ticks at every interval of candle dates'''
        labels = []
        ticks = []
        nextTick = interval

        for i, dt in enumerate(self.dates):
            if i == 1:
                labels.append(self._formatDate(dt))
                ticks.append(i)
            elif i == self.len - 1:
                break
            elif i == nextTick:
                ticks.append(i)
                labels.append(self._formatDate(dt))
                nextTick += interval

            i += 1

        return (FixedLocator(ticks), FixedFormatter(labels))
@BennyThadikaran
Benny,
Thanks for sharing. That looks really good! All the best. --Daniel
I just wanted to provide an update. I tried to figure out a fix for this issue. The core issue is that the num2date function is being called twice on the existing matplotlib dates: once within the mpf.plot function, and again when mpf.show is called after adding the Locator and Formatter. Since num2date is being called on already-converted timestamps, we see the weird dates from the 1970s. I couldn't find the exact source or a solution, but I did manage a workaround for using ConciseDateFormatter.
Posting it here so others may find it helpful.
I did have to create a custom format_coords function; otherwise it works as expected.
import mplfinance as mpf
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
import pandas as pd

# Sample OHLC data
data = {
    "date": pd.date_range(start="2022-01-01", end="2022-01-10", freq="D"),
    "open": [100, 110, 95, 105, 98, 100, 110, 95, 105, 98],
    "high": [120, 115, 100, 110, 105, 120, 115, 100, 110, 105],
    "low": [90, 105, 90, 98, 92, 90, 105, 90, 98, 92],
    "close": [110, 100, 92, 100, 100, 110, 100, 92, 100, 100],
}

df = pd.DataFrame(data)
df.set_index("date", inplace=True)

# Create the mplfinance chart
fig, axs = mpf.plot(
    df,
    type="candle",
    style="tradingview",
    title="OHLC Chart",
    returnfig=True,
    xrotation=0,  # no rotation required
)

# Locator sets the major tick locations on the x-axis
locator = mdates.AutoDateLocator(minticks=3, maxticks=7)

# Formatter sets the tick labels for the x-axis
concise_formatter = mdates.ConciseDateFormatter(locator=locator)

# Extract the tick values from the locator.
# These are matplotlib dates, not Python datetimes
tick_mdates = locator.tick_values(df.index[0], df.index[-1])

# Extract the tick labels from ConciseDateFormatter
labels = concise_formatter.format_ticks(tick_mdates)

ticks = []

# Convert the matplotlib dates to Python datetimes and iterate
for dt in mdates.num2date(tick_mdates):
    # remove the timezone info to match the DataFrame index
    dt = dt.replace(tzinfo=None)

    # Get the index position if available,
    # else get the next available index position
    if dt in df.index:
        idx = df.index.get_loc(dt)
    else:
        idx = df.index.searchsorted(dt, side="right")

    # store the tick positions to be displayed on the chart
    ticks.append(idx)

# Initialise FixedFormatter and FixedLocator,
# passing the tick labels and tick positions
fixed_formatter = ticker.FixedFormatter(labels)
fixed_locator = ticker.FixedLocator(ticks)

fixed_formatter.set_offset_string(concise_formatter.get_offset())

for ax in axs:
    ax.xaxis.set_major_locator(ticker.FixedLocator(ticks))
    ax.xaxis.set_major_formatter(fixed_formatter)
    # format_coords is the custom function mentioned above;
    # its definition is not included in this post
    ax.format_coord = format_coords

mpf.show()
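The workaround above assigns a custom format_coords function whose definition is not included in the post. Below is a minimal sketch of what such a function might look like, assuming the x values passed by matplotlib are row numbers into the DataFrame index. The factory name make_format_coord and the output format are hypothetical, not the author's code.

```python
from datetime import datetime, timedelta


def make_format_coord(dates):
    """Build a format_coord callback that maps row-number x values back to dates.

    `dates` is any indexable sequence of datetime-like objects (for example df.index).
    """
    def format_coord(x, y):
        row = int(round(x))
        if 0 <= row < len(dates):
            return f"{dates[row]:%d %b %Y}  price={y:.2f}"
        # Cursor is left or right of the plotted candles: no date to report.
        return f"price={y:.2f}"

    return format_coord


# Plain datetimes standing in for df.index in the example above.
sample_dates = [datetime(2022, 1, 1) + timedelta(days=i) for i in range(10)]
format_coords = make_format_coord(sample_dates)
print(format_coords(2.3, 95.0))
```

With the DataFrame from the workaround, this could be created as format_coords = make_format_coord(df.index) before the loop that assigns ax.format_coord.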