Visual Analysis Example throws an error

Question

ghsher opened this issue 2 years ago · 3 comments

https://github.com/quaquel/EMAworkbench/blob/b8748d9db912b80a25cd3f25b0f309187fb13dc5/docs/source/indepth_tutorial/open-exploration.ipynb

Running the example code for "Visual Analysis" provided in the extended tutorial for "Open Exploration" produces the following error:

  File ".../python3.11/site-packages/ema_workbench/analysis/plotting_util.py", line 646, in make_continuous_grouping_specifiers
    step = (maximum - minimum) / nr_of_groups
           ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for /: 'int' and 'NoneType'

Following the call stack, it seems that whenever pairs_plotting.pairs_scatter is called with group_by but without grouping_specifiers, grouping_specifiers defaults to None and the TypeError above is raised. The error does not occur when the policy column (or whichever column is passed) has dtype=category, but that does not appear to be the dtype of policy here. I might be wrong, and there may be cases where it works, but it doesn't look that way to me.

INFO: I am running this code after loading saved results with load_results(). The real issue may be that the dtype is lost in the save/load round trip.

Suggestion: Add grouping_specifiers = range(policies) to the pairs_scatter call in the example for robustness.

EDIT: removed suggestion because I think that would cause errors when the dtype is correct. See discussion below; issue is in the dtype changing when save/load_results are used.
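
Until the dtype is preserved on load, one possible workaround is to cast the column back to a categorical before plotting. The sketch below uses a small stand-in DataFrame in place of the experiments returned by load_results(); the column names and values are illustrative assumptions, and only plain pandas calls are used.

```python
import pandas as pd

# Stand-in for the experiments DataFrame returned by load_results();
# the columns and values are illustrative, not real workbench output.
experiments = pd.DataFrame(
    {
        "policy": [105, 106, 107, 105, 106],
        "b": [0.1, 0.2, 0.3, 0.4, 0.5],
    }
)

# After loading, the policy column is a plain integer column.
print(experiments["policy"].dtype)

# Cast it back to a categorical so grouping by policy treats each
# value as a discrete group rather than a continuous range.
experiments["policy"] = experiments["policy"].astype("category")
print(experiments["policy"].dtype)
```

With the column restored to a categorical dtype, a call that passes group_by="policy" should take the same path as it does with freshly generated results.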

EwoutH commented 2 years ago

Perfect!

Answer 1 · 2023-06-14T08:52:54.000Z

Thanks for filing a detailed report and also including a suggestion!

I can reproduce this error when adding this before the Visual analysis part:

# Save the results, so load_results() can be used later
from ema_workbench import save_results
save_results((experiments, outcomes), '1000 scenarios 5 policies.tar.gz')

# Load the results
from ema_workbench import load_results
experiments, outcomes = load_results('1000 scenarios 5 policies.tar.gz')

Otherwise I can't reproduce it, so it's indeed the save and load results part.

@quaquel It seems that saving and loading the experiments dataframe changes the dtype from CategoricalDtype(categories=[105, 106, 107, 108, 109], ordered=False) to 'int64'.
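
The dtype change can be illustrated with pandas alone, assuming the experiments are written to a text format such as CSV and read back without any dtype hints. This is a minimal sketch of the general effect, not the actual save_results()/load_results() code path.

```python
import io

import pandas as pd

# A categorical policy column like the one in the experiments DataFrame.
original = pd.DataFrame({"policy": pd.Categorical([105, 106, 107, 108, 109])})

# Round-trip through CSV text in memory, with no dtype information attached.
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(original["policy"].dtype)  # categorical before the round trip
print(reloaded["policy"].dtype)  # inferred as an integer dtype afterwards
```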

My solution would be to use pickle files instead of tar.gz files for save_results() and load_results(), since pickles preserve dtypes by default. This might be a breaking change for some users, however.

Answer 2 · 2023-06-14T08:55:59.000Z

Pickle is a big no because it is a non-portable, non-human-readable format.

The problem is somewhere in load_results, which is not parsing the datatype metadata fully correctly. I'll investigate, hopefully later today.
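
For illustration, here is a rough sketch of how dtype metadata stored next to a text file could be used to restore categorical columns on load. The helper names dump_with_dtypes and load_with_dtypes and the JSON layout are hypothetical; this is not how ema_workbench stores its metadata.

```python
import io
import json

import pandas as pd


def dump_with_dtypes(df):
    """Return (csv_text, metadata_json). Hypothetical helper for illustration."""
    metadata = {}
    for name, dtype in df.dtypes.items():
        if isinstance(dtype, pd.CategoricalDtype):
            # Record the categories so they can be rebuilt on load.
            metadata[name] = {
                "dtype": "category",
                "categories": dtype.categories.tolist(),
            }
        else:
            metadata[name] = {"dtype": str(dtype)}
    return df.to_csv(index=False), json.dumps(metadata)


def load_with_dtypes(csv_text, metadata_json):
    """Read the CSV text and re-apply the recorded dtypes. Hypothetical helper."""
    df = pd.read_csv(io.StringIO(csv_text))
    for name, info in json.loads(metadata_json).items():
        if info["dtype"] == "category":
            categorical = pd.CategoricalDtype(categories=info["categories"])
            df[name] = df[name].astype(categorical)
        else:
            df[name] = df[name].astype(info["dtype"])
    return df


# Illustrative data only.
experiments = pd.DataFrame(
    {"policy": pd.Categorical([105, 106, 107]), "b": [0.1, 0.2, 0.3]}
)
restored = load_with_dtypes(*dump_with_dtypes(experiments))
print(restored.dtypes)
```

Both pieces stay human-readable text, which keeps the portability argument against pickle intact while still round-tripping the categorical dtype.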