analyse_top_x_snapshots: does not handle altair.utils.data.MaxRowsError
jgehrcke opened this issue · 3 comments
jgehrcke commented
Seen in my own testing repo:
220514-23:20:27.784 INFO: df[views_unique_norm] min: 0.07142857142857142, max: 8.5
220514-23:20:27.784 INFO: df[views_unique_norm]: use symlog scale, because range > 8
220514-23:20:27.784 INFO: custom time window for top referrer plot: ('2020-12-01', '2022-05-14')
Traceback (most recent call last):
  File "/analyze.py", line 1596, in <module>
    main()
  File "/analyze.py", line 152, in main
    analyse_top_x_snapshots("referrer", gen_date_axis_lim((df_vc_agg,)))
  File "/analyze.py", line 680, in analyse_top_x_snapshots
    chart_spec = chart.to_json(indent=None)
  File "/usr/local/lib/python3.10/site-packages/altair/utils/schemapi.py", line 373, in to_json
    dct = self.to_dict(validate=validate, ignore=ignore, context=context)
  File "/usr/local/lib/python3.10/site-packages/altair/vegalite/v4/api.py", line 2020, in to_dict
    return super().to_dict(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/altair/vegalite/v4/api.py", line 374, in to_dict
    copy.data = _prepare_data(original_data, context)
  File "/usr/local/lib/python3.10/site-packages/altair/vegalite/v4/api.py", line 89, in _prepare_data
    data = _pipe(data, data_transformers.get())
  File "/usr/local/lib/python3.10/site-packages/toolz/functoolz.py", line 630, in pipe
    data = func(data)
  File "/usr/local/lib/python3.10/site-packages/toolz/functoolz.py", line 306, in __call__
    return self._partial(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/altair/vegalite/data.py", line 19, in default_data_transformer
    return curried.pipe(data, limit_rows(max_rows=max_rows), to_values)
  File "/usr/local/lib/python3.10/site-packages/toolz/functoolz.py", line 630, in pipe
    data = func(data)
  File "/usr/local/lib/python3.10/site-packages/toolz/functoolz.py", line 306, in __call__
    return self._partial(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/altair/utils/data.py", line 80, in limit_rows
    raise MaxRowsError(
altair.utils.data.MaxRowsError: The number of rows in your dataset is greater than the maximum allowed (5000). For information on how to plot larger datasets in Altair, see the documentation
+ ANALYZE_ECODE=1
error: analyze.py returned with code 1 -- exit.
jgehrcke commented
Need to think about this, and possibly disable the row limit as described in https://altair-viz.github.io/user_guide/faq.html#disabling-maxrowserror.
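For reference, a minimal sketch of what the linked FAQ entry describes: Altair's default data transformer refuses datasets with more than 5000 rows, and `alt.data_transformers.disable_max_rows()` lifts that limit. The wrapper function name `disable_altair_row_limit` and the guarded import are assumptions made for illustration; they are not taken from analyze.py.

```python
def disable_altair_row_limit() -> bool:
    """Lift Altair's 5000-row limit, if Altair is installed.

    Returns True when the limit was disabled and False when Altair
    could not be imported (hypothetical helper, for illustration only).
    """
    try:
        import altair as alt
    except ImportError:
        return False

    # After this call, chart.to_json() / chart.to_dict() no longer raise
    # MaxRowsError. The full dataset is still embedded in the chart spec,
    # so large inputs produce large JSON documents.
    alt.data_transformers.disable_max_rows()
    return True


limit_disabled = disable_altair_row_limit()
```

Disabling the check only removes the error. It does not reduce the amount of data written into the chart spec, so reducing the number of data points may still be worthwhile.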
jgehrcke commented
Addressed in #53, via three strategies:
- Disabled the row limit as described in https://altair-viz.github.io/user_guide/faq.html#disabling-maxrowserror, so this specific error does not come up again.
- top-n-plots: show the top 7 instead of the top 10 (reducing the number of data points to 70 %).
- top-n-plots: show only every fifth data point when the number of data points in a plot grows beyond 3000.
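The second and third strategies could look roughly like the sketch below. This is not the code from #53: the function names `top_n_names` and `thin_points`, their parameters, and the use of plain lists and dicts instead of the script's data frames are all assumptions made for illustration.

```python
def top_n_names(totals, n=7):
    """Return the n names with the largest totals, largest first.

    `totals` maps a name (for example a referrer) to an aggregate count.
    Hypothetical helper: keeping 7 of 10 series leaves 70 % of the points.
    """
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[:n]


def thin_points(points, threshold=3000, step=5):
    """Keep every `step`-th point once there are more than `threshold` points.

    Hypothetical helper: at or below the threshold the data is returned
    unchanged; above it, only indices 0, step, 2*step, ... are kept.
    """
    if len(points) <= threshold:
        return list(points)
    return list(points[::step])


# Example with made-up numbers.
totals = {"a": 5, "b": 9, "c": 1}
print(top_n_names(totals, n=2))              # ['b', 'a']
print(len(thin_points(list(range(3001)))))   # 601
```

Plain slicing is the simplest way to thin a series. It can drop short spikes, which may or may not matter for a long-term trend plot.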
jgehrcke commented