nismod/smif

Find out for which model runs results are available

willu47 opened this issue · 8 comments

  • Add a --results subcommand (or equivalent) to the smif list command which indicates which model runs have results (a rough sketch of such a check follows below) and...
  • whether the results are complete or not (i.e. the run has completed successfully, so that the result files share a similar timestamp and are present for all outputs listed in the config)
  • note that this raises the issue of an incomplete implementation of a smif wrapper which fails to call the set_results method for a particular output, despite that output being listed in the config

Parent issue #350
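
A minimal sketch of how such a check might look, purely for illustration. It assumes results are laid out on disk as results/<model_run>/<output>/, which is a made-up layout rather than smif's actual store format, and that the expected outputs per model run have already been read from the configs:

# Illustrative only: the results/<model_run>/<output>/ layout and the
# expected_outputs mapping are assumptions, not smif's actual store format.
import os


def summarise_results(results_dir, expected_outputs):
    """Return {model_run: {output: has_results}} for the given expectations."""
    summary = {}
    for model_run, outputs in expected_outputs.items():
        run_dir = os.path.join(results_dir, model_run)
        found = set()
        if os.path.isdir(run_dir):
            # any subdirectory named after an output counts as "has results"
            found = {name for name in os.listdir(run_dir)
                     if os.path.isdir(os.path.join(run_dir, name))}
        summary[model_run] = {output: output in found for output in outputs}
    return summary


if __name__ == '__main__':
    expected = {'energy_central': ['cost', 'water_demand']}
    for run, outputs in summarise_results('results', expected).items():
        marker = ' *' if all(outputs.values()) else ''
        print(run + marker)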

c3c0edd begins work on this issue, though somehow my commit message got lost in transit - not sure how that happened!

This currently:

  • is invoked with smif available_results
  • lists all possible outputs, broken down by model run -> sos model -> sector model, and shows whether results exist for each output at each timestep

But does not yet address:

  • the existence of decision iterations
  • any checking of timestamps

An example output looks like:

model run: energy_central
  - sos model: energy
    - sector model: energy_demand
      - output: cost ................ no results
      - output: water_demand ........ no results

model run: energy_water_cp_cr
  - sos model: energy_water
    - sector model: water_supply
      - output: cost ................ results: 2010, 2015, 2020
      - output: energy_demand ....... results: 2010, 2015, 2020
      - output: water ............... results: 2010, 2015, 2020
      - output: reservoir_level ..... results: 2010, 2015, 2020
    - sector model: energy_demand
      - output: cost ................ results: 2010, 2015, 2020
      - output: water_demand ........ results: 2010, 2015, 2020

Is this along the lines of what you are envisaging, @willu47?

This looks great! A few comments/suggestions:

The output is very verbose, so this much information should probably be optional. With 100 model runs in a project, it would be difficult to parse this info. So how about:

  • presenting a much terser version which is integrated under the smif list command - for example, an asterisk * could be appended to those model runs which have been run and which have full results available
  • adding a model run name argument to the available_results command which presents the full readout of results for one model run only? Omitting the model run name would return the readout for all model runs in the project. (A rough sketch of this interface follows below.)
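
A rough argparse sketch of the interface suggested above, just to pin down the shape of the commands. This is not taken from smif's cli/__init__.py, and the subcommand wiring is an assumption; the behaviour behind each command is omitted:

from argparse import ArgumentParser


def build_parser():
    parser = ArgumentParser(prog='smif')
    subparsers = parser.add_subparsers(dest='command')

    # `smif list` stays terse: one line per model run, with a trailing
    # asterisk for runs that have a complete set of results
    subparsers.add_parser('list', help='List model runs')

    # `smif available_results [model_run]` gives the full per-output readout;
    # omitting the model run name reports on every run in the project
    available = subparsers.add_parser(
        'available_results', help='Show which results are available')
    available.add_argument(
        'model_run', nargs='?', default=None,
        help='Restrict the readout to a single model run')
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args(['available_results', 'energy_central'])
    print(args.command, args.model_run)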

A note on iterations: iteration and year pairs should be treated as ids for the results - in that in some cases, one iteration will include multiple years, and in other cases one year may be iterated over multiple times.

There is some functionality for timestamp checking in the warm start feature in smif. Again, check cli/__init__.py to trace how that functionality is reached from the command.

By 'timestamp' here do you mean the time when the results were generated? Or just the timesteps that exist in the results directory?

The latter seems to be what the warm start functionality checks - but just wanted to clarify.

A note on iterations: iteration and year pairs should be treated as ids for the results - in that in some cases, one iteration will include multiple years, and in other cases one year may be iterated over multiple times.

So, given this, both of the following are possible for a model run with timesteps t0 to t6:

d0  d1  d2
----------
t0  t2  t5
t1  t3  t6
    t4

d0  d1  d2
----------
t0  t0  t2
t1  t1  t3
    t2  t4
    t3  t5
        t6

@willu47 if we have this right, can we determine for sure whether a model run has completed successfully? It's not simply that every timestep exists across all decisions, nor that every timestep exists in a single decision? Is it, for instance, sufficient that the last timestep exists in the last decision?

I think it is sufficient that results exist for all the timesteps listed in the model run. I'm not sure about the last point:

Is it, for instance, sufficient that the last timestep exists in the last decision?

It could be that a user-implemented decision algorithm works backwards from last to first timestep while iterations increase...
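
To pin down the rule being converged on here: treating results as keyed by (decision iteration, timestep) pairs, a run counts as complete when every configured timestep appears under at least one iteration, whichever iteration that is. A small illustrative check, with made-up data mapping t0 to t6 onto 2010 to 2040:

# Illustrative completeness check; the data structures here are invented for
# the example and do not reflect smif's results store.


def is_complete(expected_timesteps, results_by_iteration):
    """Results are keyed by (decision iteration, timestep), so flatten over
    iterations and check that every expected timestep appears somewhere."""
    available = {timestep
                 for timesteps in results_by_iteration.values()
                 for timestep in timesteps}
    return set(expected_timesteps) <= available


expected = [2010, 2015, 2020, 2025, 2030, 2035, 2040]

# Mirrors the two tables above, with t0..t6 mapped to 2010..2040
layout_a = {0: [2010, 2015], 1: [2020, 2025, 2030], 2: [2035, 2040]}
layout_b = {0: [2010, 2015], 1: [2010, 2015, 2020, 2025],
            2: [2020, 2025, 2030, 2035, 2040]}

print(is_complete(expected, layout_a))  # True
print(is_complete(expected, layout_b))  # True
# Checking only that the last timestep exists in the last iteration would be
# weaker: a decision algorithm could work backwards from the last timestep,
# so its presence says nothing about the earlier ones.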

By 'timestamp' here do you mean the time when the results were generated? Or just the timesteps that exist in the results directory?

The latter seems to be what the warm start functionality checks - but just wanted to clarify.

Okay, looks like that is a red herring. Sorry!

The timestamp work might be something to kick into the future, as it could get very complicated, particularly if results are produced by a scheduler working in parallel rather than the basic serial scheduler currently implemented in smif.

Closed by #358