CamDavidsonPilon/lifelines

Plotting KM Survival Functions not working

jcursons opened this issue · 6 comments

Hi,

I've recently updated lifelines and it has broken all of my previous plotting functions. Unfortunately I can't even seem to get your test code working, as per https://lifelines.readthedocs.io/en/latest/fitters/univariate/KaplanMeierFitter.html

from lifelines import KaplanMeierFitter
from lifelines.datasets import load_waltons

waltons = load_waltons()
kmf = KaplanMeierFitter(label="waltons_data")
kmf.fit(waltons['T'], waltons['E'])
kmf.plot()

This gives the error:

Traceback (most recent call last):
  File "C:\python\venv\Lib\site-packages\IPython\core\interactiveshell.py", line 3526, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-b0199a766265>", line 3, in <module>
    kmf.plot()
  File "C:\python\venv\Lib\site-packages\lifelines\fitters\kaplan_meier_fitter.py", line 448, in plot
    return self.plot_survival_function(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\lifelines\fitters\kaplan_meier_fitter.py", line 453, in plot_survival_function
    return _plot_estimate(self, estimate="survival_function_", **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\lifelines\plotting.py", line 919, in _plot_estimate
    dataframe_slicer(plot_estimate_config.estimate_).rename(columns=lambda _: plot_estimate_config.kwargs.pop("label")).plot(
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_core.py", line 975, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\__init__.py", line 71, in plot
    plot_obj.generate()
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\core.py", line 451, in generate
    self._adorn_subplots()
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\core.py", line 676, in _adorn_subplots
    handle_shared_axes(
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 404, in handle_shared_axes
    layout[row_num(ax), col_num(ax)] = ax.get_visible()
           ^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 393, in <lambda>
    row_num = lambda x: x.get_subplotspec().rowspan.start
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'rowspan'

I get a similar error when I try to run my (previously working) functions on my own data.

The actual point of failure appears to be line 919 in plotting.py:

dataframe_slicer(plot_estimate_config.estimate_).rename(columns=lambda _: plot_estimate_config.kwargs.pop("label")).plot(
        logx=plot_estimate_config.logx, **plot_estimate_config.kwargs)

If I run up to this point and check some of the objects in memory everything appears to be in order:

dataframe_slicer(plot_estimate_config.estimate_)
Out[2]: 
          KM_estimate
timeline             
0.0          1.000000
6.0          1.000000
81.0         1.000000
111.0        1.000000
122.0        0.993103
...               ...
6699.0       0.216096
7514.0       0.162072
7563.0       0.162072
10346.0      0.081036
11252.0      0.081036
[145 rows x 1 columns]


plot_estimate_config.logx
Out[3]: False


plot_estimate_config.kwargs
Out[4]: 
{'ax': <Axes: >,
 'color': '#1f77b4',
 'drawstyle': 'steps-post',
 'label': 'KM_estimate'}

But it still falls over:

Traceback (most recent call last):
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 404, in handle_shared_axes
    layout[row_num(ax), col_num(ax)] = ax.get_visible()
           ^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 393, in <lambda>
    row_num = lambda x: x.get_subplotspec().rowspan.start
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'rowspan'

Despite using Python for a number of years, I struggle to 'read' lambda functions in my head, so unfortunately I don't even know where to start with fixing this unless I start replacing large chunks of code.
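
For anyone else who finds the lambda in the traceback hard to read: `row_num = lambda x: ...` is just a one-line function. Below is a minimal sketch of the same logic written as a plain `def`. The `FakeAxes` class and the two example objects are hypothetical stand-ins (not matplotlib, pandas, or lifelines objects), used only so the snippet runs without any plotting library.

```python
from types import SimpleNamespace

# The lambda from pandas' tools.py, copied from the traceback above.
row_num = lambda x: x.get_subplotspec().rowspan.start


def row_num_as_def(ax):
    """Same logic as the lambda, written out step by step."""
    spec = ax.get_subplotspec()  # may be None
    return spec.rowspan.start    # raises AttributeError when spec is None


class FakeAxes:
    """Hypothetical stand-in for a matplotlib Axes (illustration only)."""

    def __init__(self, spec):
        self._spec = spec

    def get_subplotspec(self):
        return self._spec


# Stand-in for an axes that occupies row 2 of a grid.
in_grid = FakeAxes(SimpleNamespace(rowspan=range(2, 3)))
# Stand-in for an axes with no grid position, which is what the error implies.
free_floating = FakeAxes(None)

print(row_num(in_grid), row_num_as_def(in_grid))

try:
    row_num(free_floating)
except AttributeError as err:
    message = str(err)
    print(message)
```

Read this way, the error says that `get_subplotspec()` returned `None` for one of the axes in the figure, and the lambda then tried to read `.rowspan` from that `None`.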

I've done a bit more debugging/testing and it seems that the issue arises due to my habit of plotting large multipanel figures.

If I run this, it works:

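        import os
        import matplotlib.pyplot as plt
        from lifelines import KaplanMeierFitter
        from lifelines.datasets import load_waltons

        waltons = load_waltons()

        handFig = plt.figure(figsize=(5, 5))
        handAx = handFig.add_axes([0.1, 0.7, 0.8, 0.2])

        kmf = KaplanMeierFitter(label='waltons_data')
        kmf.fit(waltons['T'], waltons['E'])
        kmf.plot(ax=handAx)

        # PathDir.pathOut is my own output directory
        handFig.savefig(os.path.join(PathDir.pathOut, 'test.png'), dpi=300)
        plt.close(handFig)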

But if I try to add a second axis to the figure, it crashes with the error listed above:


        waltons = load_waltons()

        handFig = plt.figure(figsize=(5, 5))
        handAx = handFig.add_axes([0.1, 0.7, 0.8, 0.2])

        kmf = KaplanMeierFitter(label='waltons_data')
        kmf.fit(waltons['T'], waltons['E'])
        kmf.plot(ax=handAx)

        handAx2 = handFig.add_axes([0.1, 0.2, 0.8, 0.2])

        kmf2 = KaplanMeierFitter(label='waltons_data')
        kmf2.fit(waltons['T'], waltons['E'])
        kmf2.plot(ax=handAx2)

        handFig.savefig(os.path.join(PathDir.pathOut, 'test.png'), dpi=300)
        plt.close(handFig)

I suspect that using the built-in subplot spec is a bit too smart for my approach of just dropping in axes at specified positions using handFig.add_axes() (even if I specify the axis handle through the ax input parameter).
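
That suspicion fits the traceback: the failing line reads `.rowspan` from whatever `get_subplotspec()` returns, and an axes placed by hand has no grid position to report. The sketch below checks this directly with matplotlib alone. The helper `has_subplotspec` is a hypothetical name introduced here, and the `getattr` guard is an assumption to cover matplotlib versions where hand-placed axes lack the method entirely.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def has_subplotspec(ax):
    """Return True if the axes reports a grid position (a SubplotSpec)."""
    # Hypothetical helper: guard against the method being absent on plain Axes.
    getter = getattr(ax, "get_subplotspec", None)
    return getter is not None and getter() is not None


fig = plt.figure(figsize=(5, 5))

# Positioned by hand with figure coordinates, as in the report above.
free_ax = fig.add_axes([0.1, 0.7, 0.8, 0.2])

# Positioned through a GridSpec, so it carries a SubplotSpec.
grid = fig.add_gridspec(nrows=2, ncols=1)
grid_ax = fig.add_subplot(grid[1, 0])

print(has_subplotspec(free_ax))  # False
print(has_subplotspec(grid_ax))  # True

plt.close(fig)
```

Whether building the panels from a GridSpec avoids the crash on a given pandas version is untested here; it is only a possible workaround to try if changing the pandas version is not an option.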

Hm, is it a pandas thing? We bumped min pandas from 1.0 to 1.2. What version of pandas have you been using?
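
A quick way to answer the version question is to print what is installed in the active environment. This is a minimal sketch using only the standard library; the function name `installed_versions` and the default package list are choices made here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("lifelines", "pandas", "matplotlib")):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions()
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

Run it from the same interpreter or virtual environment that produces the error, since a different environment may have different versions installed.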

Yeah, I'm having the same problem.
This example shows no plots:

from lifelines.statistics import survival_difference_at_fixed_point_in_time_test
from lifelines import KaplanMeierFitter
from lifelines.datasets import load_waltons

df = load_waltons()
ix = df['group'] == 'miR-137'
T_exp, E_exp = df.loc[ix, 'T'], df.loc[ix, 'E']
T_con, E_con = df.loc[~ix, 'T'], df.loc[~ix, 'E']

kmf_exp = KaplanMeierFitter(label="exp").fit(T_exp, E_exp)
kmf_con = KaplanMeierFitter(label="con").fit(T_con, E_con)

point_in_time = 10.
results = survival_difference_at_fixed_point_in_time_test(point_in_time, kmf_exp, kmf_con)
results.print_summary()

kmf_exp.plot_survival_function(point_in_time=point_in_time)
kmf_con.plot_survival_function(point_in_time=point_in_time)

I can copy-paste that snippet into IPython, but I also need to add plt.show() for the plot to appear.

Apologies for the delayed response, but great suggestion thanks @CamDavidsonPilon - updating pandas from 2.0.1 to 2.2.0 has fixed the issue that I was having!

yup, all working here too. Thanks for the quick response.