ProjectDrawdown/solutions

Redo ModelHealth dashboard

Closed this issue · 5 comments

There is a ModelHealth.ipynb Jupyter notebook which displays information about the health of the system such as the number of solutions which exist in Python versus Excel, how many scenarios they have, and so on. This was done in Jupyter solely because the author was comfortable using Jupyter.

It would be better if the Model Health information were available via a static web interface with the various graphs and charts, without having to run Jupyter. For example, there is an "Update survey health files" GitHub Action which runs the scripts in tools/survey each day to generate the information which ModelHealth.ipynb uses. That GitHub Action could be extended to generate a static website hosted in GitHub Pages, and it would be updated automatically whenever the cronjob runs.
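As a rough illustration of the static-site idea, here is a minimal sketch of a script that the GitHub Action could run after the survey step. It is not the implementation that was eventually merged. The function name `render_health_page`, the output directory, the "Solution" column and the toy data are all hypothetical; only the column name RegionalFractionTAM comes from the survey data discussed in this issue.

```python
import pathlib
import tempfile

import pandas as pd


def render_health_page(surveydata, out_dir):
    """Write a minimal static index.html summarising the survey data.

    Hypothetical helper: the real dashboard layout is not specified here.
    """
    out_dir = pathlib.Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # DataFrame.to_html renders the frame as an HTML <table>.
    table_html = surveydata.to_html(index=False, na_rep="")
    page = (
        "<!DOCTYPE html>\n<html><head><meta charset='utf-8'>"
        "<title>Model Health</title></head><body>"
        f"<h1>Model Health</h1><p>{len(surveydata)} rows in survey data.</p>"
        f"{table_html}</body></html>\n"
    )
    out_path = out_dir / "index.html"
    out_path.write_text(page, encoding="utf-8")
    return out_path


# Toy stand-in for data/health/survey.csv; values are made up.
toy = pd.DataFrame({"Solution": ["a", "b"], "RegionalFractionTAM": [0.0, 0.5]})
page_path = render_health_page(toy, tempfile.mkdtemp())
```

The directory containing the generated index.html could then be published to GitHub Pages by the same workflow. Charts would need an extra rendering step that this sketch leaves out.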

@DentonGentry happy to look into that if no one else is already (?)

No one has contacted me about this, and to my knowledge no one is currently working on it. By all means, please feel free.

Hi @DentonGentry, I started looking into this issue and found something I'm unsure about in the notebook.

To compute the zero/non-zero breakdown for regional TAM, the notebook does:

import os
import pandas as pd

# num_scenarios is assumed to be defined earlier in the notebook.
surveydata = pd.read_csv(os.path.join('data', 'health', 'survey.csv'), index_col=False,
                         skipinitialspace=True, header=0, skip_blank_lines=True, comment='#')
# Keep the rows whose regional TAM fraction is not exactly 0.0.
regional_nonzero_tam = surveydata.loc[surveydata['RegionalFractionTAM'] != 0.0]
nonzero_count = regional_nonzero_tam.shape[0]
zero_count = num_scenarios - nonzero_count

The column RegionalFractionTAM contains NaNs, values equal to 0, and values greater than 0, so that:

  • nonzero_count includes all values greater than 0 as well as missing values (NaN)
  • zero_count includes only the values that exist and are equal to 0.

Is that the behaviour we want? Or should NaNs be counted separately? Or merged into the zero count?

The same happens with RegionalFractionAdoption.
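To make the behaviour concrete, here is a small self-contained sketch on made-up data (the toy DataFrame stands in for survey.csv and is an assumption, as is taking num_scenarios to be the number of rows). Because NaN != 0.0 evaluates to True in pandas, the NaN rows pass the filter:

```python
import numpy as np
import pandas as pd

# Toy stand-in for survey.csv; the values are invented for illustration.
toy = pd.DataFrame({"RegionalFractionTAM": [0.0, 0.5, np.nan, 0.0, 1.0, np.nan]})
num_scenarios = len(toy)  # assumption: one row per scenario

# Same filter as in the notebook: NaN != 0.0 is True, so NaN rows are kept.
regional_nonzero_tam = toy.loc[toy["RegionalFractionTAM"] != 0.0]
nonzero_count = regional_nonzero_tam.shape[0]
zero_count = num_scenarios - nonzero_count

# Number of missing values, counted separately for comparison.
nan_count = int(toy["RegionalFractionTAM"].isna().sum())

print(nonzero_count, zero_count, nan_count)  # prints: 4 2 2
```

Only two of the four "non-zero" rows actually hold a positive value; the other two are NaN.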

The intent was to count the number of solutions which have solid regional data versus solutions which only have data for the World as a whole. I'd say that if the regional data is zero or NaN, there is no regional data, so we'd want to include those in zero_count. I don't think there is a reason to separate NaN out into its own count; we're just trying to show how many solutions have regional data versus not.
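One possible way to get that behaviour, sketched on made-up data, is to replace NaN with 0.0 before comparing. The helper name `count_regional` and the toy values are hypothetical, and the sketch assumes one row per scenario so that the row count plays the role of num_scenarios:

```python
import numpy as np
import pandas as pd


def count_regional(df, column):
    """Return (nonzero_count, zero_count), treating NaN as 'no regional data'."""
    values = df[column].fillna(0.0)  # NaN is merged into the zero bucket
    nonzero_count = int((values != 0.0).sum())
    zero_count = len(df) - nonzero_count
    return nonzero_count, zero_count


# Toy stand-in for survey.csv; the values are invented for illustration.
toy = pd.DataFrame({
    "RegionalFractionTAM": [0.0, 0.5, np.nan, 1.0],
    "RegionalFractionAdoption": [np.nan, 0.0, 0.2, np.nan],
})

tam_counts = count_regional(toy, "RegionalFractionTAM")            # (2, 2)
adoption_counts = count_regional(toy, "RegionalFractionAdoption")  # (1, 3)
```

The same helper covers both RegionalFractionTAM and RegionalFractionAdoption, so the two breakdowns stay consistent.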

To be honest though, it will be most valuable just to have something, anything, working in a way that is quick and easy to access. ModelHealth.ipynb is really only being used by me, as some of the people who would benefit from being able to see it are just not comfortable running a Jupyter notebook locally. Having it hosted somewhere will be a big help, and we can adjust the contents of it as we go along.

Implemented by @klemag in #137
Model health dashboard currently published at: https://projectdrawdown.github.io/solutions/