pegasystems/pega-datascientist-tools

plotOverTime errors when too many Configuration facets

operdeck opened this issue · 2 comments

pdstools version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pdstools.

Issue description

In the health check we have a few calls to plotOverTime with a Configuration facet. If there are many configurations, the faceting breaks. For now these calls are wrapped in a try/except, but this should be solved properly.

The first message is a warning, the second a ValueError:

Warning: plotting this much data (969 rows) will probably be slow while not providing many insights. Consider filtering the data by either limiting the number of models, filtering on SnapshotTime or facetting.

ValueError Vertical spacing cannot be greater than (1 / (rows - 1)) = 0.066667.
The resulting plot would have 16 rows (rows=16).
Use the facet_row_spacing argument to adjust this spacing.

Possibly too many facets: 46.
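
The numbers in the error are consistent with 46 facets wrapped into 3 columns: ceil(46 / 3) = 16 rows, and Plotly rejects any vertical spacing above 1 / (16 - 1) = 0.066667. The Plotly Express documentation gives 0.07 as the default facet_row_spacing when facet_col_wrap is used, which is just over that limit. A minimal sketch of computing a spacing that stays under the limit follows. The helper names are hypothetical, the wrap of 3 is inferred from the error rather than stated in the report, and whether plotOverTime forwards facet_row_spacing to Plotly Express is an assumption that has not been checked.

```python
import math


def facet_rows(n_facets: int, facet_col_wrap: int) -> int:
    """Number of subplot rows when n_facets are wrapped into facet_col_wrap columns."""
    return math.ceil(n_facets / facet_col_wrap)


def max_row_spacing(rows: int) -> float:
    """Upper bound Plotly enforces on vertical spacing: 1 / (rows - 1)."""
    if rows <= 1:
        return 1.0  # a single row has no gaps, so the only limit is the 0..1 range
    return 1.0 / (rows - 1)


def safe_facet_row_spacing(
    n_facets: int,
    facet_col_wrap: int,
    preferred: float = 0.07,  # documented Plotly Express default with facet_col_wrap
    fraction: float = 0.5,    # hypothetical safety margin: use at most half the limit
) -> float:
    """Return the preferred spacing, shrunk if it would exceed Plotly's limit."""
    limit = max_row_spacing(facet_rows(n_facets, facet_col_wrap))
    return min(preferred, fraction * limit)


# Values from the report: 46 facets, 16 rows (implies a wrap of 3, an inference).
spacing = safe_facet_row_spacing(n_facets=46, facet_col_wrap=3)
```

If the plotting function accepts it, the resulting value could be passed on as facet_row_spacing. With this many rows the figure height probably needs to scale with the row count as well, otherwise each panel becomes too small to read.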

Reproducible example

try:
    # TODO: the faceting errors out when there are many configurations
    fig = datamart.plotOverTime(
        "weighted_performance",
        by="Channel/Direction",
        facets=facet,
        facet_col_wrap=facet_col_wrap,
    )
    fig = (
        fig.update_layout(autosize=True, height=height, title="Trend of Model Performance")
        .for_each_annotation(lambda a: a.update(text=a.text.replace(f"{facet}=", "")))
        .update_yaxes(showticklabels=True, title="")
        .update_xaxes(title="")
    )

    fig.show()
except ValueError as e:
    print(f"Error {str(e)}\nPossibly too many facets: {unique_count}.")

Expected behavior

No errors, even when there are many Configuration facets.

Installed versions

Detailed version info for pdstools:

---Version info---
pdstools: 3.3.0
Platform: macOS-10.16-x86_64-i386-64bit
Python: 3.11.5 (main, Sep 11 2023, 08:19:27) [Clang 14.0.6 ]

---Dependencies---
plotly: 5.17.0
requests: 2.31.0
pydot: 1.4.2
polars: 0.20.2
pyarrow: 13.0.0
tqdm: 4.66.1
pyyaml:
aioboto3: 11.3.0

---Streamlit app dependencies---
streamlit: 1.31.0
quarto:
papermill: 2.4.0
itables: 1.6.1
pandas: 2.2.1
jinja2: 3.1.3
xlsxwriter: 3.1.9

How do you think we should handle this, @operdeck? Should the core code itself use the same ValueError handling as in your example?

Suggestion: have an on_error argument with options 'skip', 'warn', 'raise'.