gadget-framework/mfdb

Model variant folders


A variation on the gadget directory that forks the mainfile and puts any newly written content in the variant directory.

   GD
   |- mainfile
   |- variant/
       |- mainfile (starts the same, with relative links to data files; modifying a file causes it to appear in the variant folder)

bthe commented

For examples of the type of analysis that requires these model variants, see gadget.ypr and gadget.forward in https://github.com/bthe/rgadget/blob/master/R/gadgetfunctions.R

Where is this being implemented?

mfdb gadget_* functions

Implementing it here would be similar to what we discussed: the gadget_directory already has a notion of which mainfile it's using, it's just not exposed. However, IIRC we're trying to get new functionality (especially non-MFDB-specific functionality) into Rgadget.

Rgadget gadgetfile

As it stands this code is pretty similar to mfdb. There's no gadget_directory object, but the intention was that the various functions could accept a path that has "mainfile" attributes. Of course, to be useful, more of mfdb's gadget functionality will need to be ported, e.g. R/gadget_likelihood_component.R, but that's not a major drama.
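To make the "path with a mainfile attribute" idea concrete, here is a minimal sketch in base R. None of these helpers exist in mfdb or Rgadget: gadget_variant_path, resolve_read and resolve_write are hypothetical names, and the rule "read from the variant if present, otherwise from the base directory" is an assumption about how a variant folder might behave.

```r
# Hypothetical helper: tag a model directory with the mainfile it should use.
gadget_variant_path <- function(dir, mainfile = "mainfile") {
    structure(dir, mainfile = mainfile)
}

# Hypothetical helper: find the file to read.
# Prefer a copy in the variant directory, fall back to the base directory.
resolve_read <- function(gd, file_name) {
    variant_dir <- dirname(attr(gd, "mainfile"))
    candidate <- file.path(gd, variant_dir, file_name)
    if (file.exists(candidate)) candidate else file.path(gd, file_name)
}

# Hypothetical helper: newly written content always lands in the variant directory.
resolve_write <- function(gd, file_name) {
    variant_dir <- dirname(attr(gd, "mainfile"))
    file.path(gd, variant_dir, file_name)
}
```

With gd <- gadget_variant_path("GD", "variant/mainfile"), a file that has never been modified is read from GD, and the first write creates the copy under GD/variant, leaving the base model untouched.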

This is also somewhat incompatible with recursive reading of data files, in particular if you read an entire gadget model file from the mainfile down.

  • The structure of gadget_update is designed for working on sections of a file and writing them back; the interface would have to be changed first before we started to think about model variants.
  • Allowing things like:

    gm <- gadget_model('mainfile')
    gm$areafile[[1]]$size <- 5e5
    write_this_as_a_variant(gm)

    ...to work would be very hard(tm), since you'd have to track which files are dirty (and thus belong in a variant) and which aren't.
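One way to picture the dirty-tracking problem is to diff the in-memory model against what was originally read, rather than recording every assignment. The sketch below assumes a whole model is held as a named list with one character vector per file. That representation and the find_dirty name are hypothetical, not how gadget_update or Rgadget store things.

```r
# Hypothetical: 'original' and 'modified' are named lists, one element per
# model file, each holding that file's lines as a character vector.
find_dirty <- function(original, modified) {
    nms <- names(modified)
    # A file is dirty if its content differs, or if it did not exist before
    unchanged <- vapply(
        nms,
        function(n) identical(original[[n]], modified[[n]]),
        logical(1))
    nms[!unchanged]
}
```

Only the files returned by find_dirty would need to be written into the variant folder. This sidesteps tracking individual edits, at the cost of keeping a second copy of the model in memory.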

So how important is reading in a whole model as one structure: is it a would-be-nice, or how everything in Rgadget should work?

bthe commented

I generally see four different use cases with the gadget_update functionality:

  • Define a model directly using R and write out the associated files.
  • Update an existing model that has not been defined using the R routines.
  • Define model variants, such as those used in gadget.forward, gadget.ypr and probably some ad-hoc runs.
  • Extract information from the model and calculate summary statistics (e.g. gadget.fit and gadget.iterative).

These four cases do not involve a lot of in-place editing of the whole model object, but it is convenient to be able to extract summaries from the files. For the first two points, I guess we are OK with not reading in the whole model. Point 3 would go through model variants, but I would like to be able to use the update mechanism. Point 4 is more of a spaghetti. In gadget.iterative the weights of the likelihood file's components are manipulated, the different weighting alternatives are saved into a WGTS folder (a variant folder), and several instances of gadget are run simultaneously to find the optimal weighting scheme. The file structure is something like this:

WGTS
|- main.comp1
|- likelihood.comp1
|- lik.comp1
|- params.comp1
|- ...
|- likelihood.compn

where main.* and likelihood.* are generated by R and lik.* and params.* are output from gadget. The weights are determined from the output and the likelihood data.
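As an illustration of that layout, the expected file names for each component can be generated like this. The wgts_files function is a hypothetical helper, not part of gadget.iterative; the prefixes and the WGTS folder name are taken from the listing above.

```r
# Hypothetical helper: list the WGTS files associated with each likelihood
# component. main.* and likelihood.* are written by R before the run,
# lik.* and params.* are produced by gadget.
wgts_files <- function(components, dir = "WGTS") {
    prefixes <- c("main", "likelihood", "lik", "params")
    out <- lapply(components, function(comp) {
        file.path(dir, paste(prefixes, comp, sep = "."))
    })
    names(out) <- components
    out
}
```

Calling wgts_files(c("comp1", "comp2")) returns one character vector of four paths per component, which could be used to check that a run produced its lik.* and params.* output.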

gadget.fit has two main functions:

  • Assess the fit of the model. It needs to create a printfile (make.gadget.printfile does that currently) telling gadget what output is needed, read in the likelihood data, and join that with the output.
  • Extrapolate management summaries, i.e. recruitment (which is usually read from the renewal file), selectivity (from fleet files), biomass, catch and fishing mortality. This is a combination of reading in and evaluating gadget formulae.

gadget.fit and gadget.iterative currently begin by reading and writing model files, so if we could associate files with different main files this would allow what is done currently.

@bthe On more practical matters, what are the paths generally relative to? The gadget user guide states that file paths in the mainfile are relative to the location of the mainfile, but there's no hint about files beyond that.

I have a gadget-fit-example-02-mfdbcod model that you gave me at some point, with PRE and YPR subdirectories that look like variants. However, PRE/main.pre has fleetfiles Modelfiles/fleets ./PRE/fleet (I would have expected ../Modelfiles/fleets and fleet). ./PRE/fleet also contains amount ./PRE/fleet.pre. Is this a bug, or is gadget interpreting these paths relative to what I presume is the current directory, instead of relative to the mainfile location?

bthe commented

Actually there are two ways of setting the path origin. The default is the directory where gadget is called, but you can also set the environment variable "GADGET_WORKING_DIR". Interestingly, there is also something called "GADGET_DATA_DIR" (see https://github.com/Hafro/gadget/blob/d9c7a209c2c0504e9e31bbe4bb47ee3eb569b5c2/src/gadget.cc lines 27 and 38).
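From the R side, that rule could be mirrored roughly as follows. This is only a sketch of the behaviour described in this comment, not something checked against gadget itself: resolve_gadget_path is a hypothetical helper, and GADGET_DATA_DIR is left out because its effect isn't described here.

```r
# Hypothetical helper: resolve a path from a gadget file the way the comment
# above describes, i.e. against GADGET_WORKING_DIR if it is set, otherwise
# against the directory gadget is called from (assumed to be 'cwd').
resolve_gadget_path <- function(path, cwd = getwd()) {
    origin <- Sys.getenv("GADGET_WORKING_DIR", unset = "")
    if (!nzchar(origin)) origin <- cwd
    file.path(origin, path)
}
```

Under this reading, the PRE/fleet entry in PRE/main.pre points at the PRE directory below the model root, as long as gadget is started from the model root or GADGET_WORKING_DIR points there.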