gadget-framework/mfdb

NA's in `mfdb_sample_meanlength_stddev`

Closed this issue · 4 comments

bthe commented

After spending half an hour debugging a Gadget model I realised that mfdb_sample_meanlength_stddev returns NA in the stddev column when the only one observation of length for a particular age group is found (line 163 in the datafile). This subsequently wreaks havoc in Gadget as it doesn't know what to do with the NA.

To deal with this it would probably be best to issue a warning to the user (preferably in all caps :) ) when this occurs so the user can take appropriate action (e.g. na.omit or overwrite the sd column with a default value) before writing it to the likelihood file.

Is this better filed as an rgadget problem, i.e. we should he throwing an outright error that the output files are malformed? Presumably gadget won't cope with NA in any numeric column, not just stddev.

Of course, we can do both.

bthe commented

When you mention it I think it would be worthwhile to do both, as mfdb might see increased with other modeling platforms.

Okay, I've added gadget-framework/rgadget@8b52804 which seems simple and decisive. Obviously we can back it out and be more gentle if it's introducing too many errors (but they'd be errors that ought be fixed). This should mean you don't lose that half-hour again :)

As for the MFDB half, I'm still not convinced. NA values are the "correct" thing to do in R, and something anything downstream of MFDB should handle properly (or at least error nicely). Assuming it does, the warning will just be extra noise on top of an error.

If you just fetch some data, adding a warning kinda feels like we're teaching the user to suck eggs---"this is a data.frame, it has NAs in it, are you sure you know what you're doing?".

bthe commented

Thanks, this sounds reasonable:)