Survey indices with CVs
Opened this issue · 4 comments
When importing survey indices, you would often like to store information on uncertainty alongside the index, e.g. when an index value is supplied one would also like to have the CV (i.e. std. dev. / mean). What is the best way of adding that information to mfdb? Should I import the CV as an additional time-series?
I think it is fairly obvious I've never used this data type :) But continuing with the question, I assume one would deal with indices of biomass by species in a similar fashion?
Should I import the CV as an additional time-series?
Yes, but having more "first-class" support for CV with an extra column to store it / support in the querying functions would make sense I think.
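For illustration, here is a minimal sketch of the "additional time-series" workaround, written in plain Python rather than against mfdb's R API. Each estimate becomes two rows that share year and area but carry different series names, and the two are paired up again at query time. The `index_type` / `value` layout and the names `miw_abundance` and `miw_abundance_cv` are assumptions made for this example, not confirmed mfdb columns; the numbers are taken from the example dataset posted further down the thread.

```python
# (year, division, count, cv) rows, as in the example dataset in this thread
estimates = [
    (2005, "WG", 10792, 0.59),
    (2007, "WG", 9853, 0.43),
    (2015, "WG", 5241, 0.49),
]


def split_into_series(rows, name):
    """Return long-format rows: one series for the estimate, one for its CV."""
    out = []
    for year, division, count, cv in rows:
        out.append({"index_type": name, "year": year, "area": division, "value": count})
        out.append({"index_type": name + "_cv", "year": year, "area": division, "value": cv})
    return out


# Hypothetical series name; the CV series reuses it with a "_cv" suffix
series = split_into_series(estimates, "miw_abundance")

# At query time, pair each estimate with its CV again by (year, area)
paired = {}
for row in series:
    key = (row["year"], row["area"])
    field = "cv" if row["index_type"].endswith("_cv") else "value"
    paired.setdefault(key, {})[field] = row["value"]
```

The drawback is visible in the last loop: nothing in the storage ties a value to its CV except a naming convention, which is the gap an extra column would close.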
indices of biomass by species
These could be stored in sample.weight with count being NA / NULL. Then you have species and all the other metadata fields handy.
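As a rough sketch of what that implies for anyone querying the data (plain Python, not mfdb code): rows holding a biomass index have a weight but no count, so they have to be separated from ordinary samples before counts are summed. The field names, the species code `COD`, the area `divA`, and all values below are hypothetical.

```python
# Hypothetical sample rows; count is None for the biomass-index row
rows = [
    {"year": 2007, "area": "divA", "species": "COD", "count": None, "weight": 1250.0},
    {"year": 2007, "area": "divA", "species": "COD", "count": 12, "weight": 3.4},
    {"year": 2007, "area": "divA", "species": "COD", "count": 7, "weight": 2.1},
]

# Biomass-index rows are the ones with no count
index_rows = [r for r in rows if r["count"] is None]

# Ordinary samples: skip the index rows so None never reaches sum()
total_count = sum(r["count"] for r in rows if r["count"] is not None)

# Total index biomass per (year, area, species)
biomass = {}
for r in index_rows:
    key = (r["year"], r["area"], r["species"])
    biomass[key] = biomass.get(key, 0.0) + r["weight"]
```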
Yes, but having more "first-class" support for CV with an extra column to store it / support in the querying functions would make sense I think.
Sounds good.
These could be stored in sample.weight with count being NA / NULL. Then you have species and all the other metadata fields handy.
Yes, this is exactly what I have leaned towards in the past. To give you a bit of background for this question, I'm thinking about abundance estimates that arise from sighting surveys, which are only available by division and species. Other attributes are not available. So a typical dataset looks like:
year division count cv species
2005 WG 10792 0.59 MIW
2007 WG 9853 0.43 MIW
2015 WG 5241 0.49 MIW
2007 WC 20741 0.3 MIW
2007 CIP 1350 0.38 MIW
...
I've been picking an areacell at random from the division and assigning the abundance estimates to that. I can sort of squeeze this information into either table, but in both cases you will need to be careful when querying the data.
I think this is why we made count NULLable in the first place. Which option makes more sense, I'm not sure.
One of the reasons that the survey index exists is so that it can be applied as an abundance scaling factor to other queries, in which case you choose it by the name you gave it. I'm not sure that join still makes sense if we add a species column as well.
I've been picking an areacell at random from the division and assigning the abundance estimates to that.
I think I'd add an areacell with the same name as the division to the division (if that makes sense).
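To make that suggestion concrete, here is a small sketch in plain Python (not mfdb's R API), assuming each division simply gains one synthetic areacell that reuses the division's own name. The helper `division_areacells` and the row layout are hypothetical; the division codes and numbers come from the example dataset above.

```python
def division_areacells(division_names):
    """Map each division to a single synthetic areacell named after it."""
    return {name: [name] for name in division_names}


divisions = division_areacells(["WG", "WC", "CIP"])

# (year, division, count, cv) rows from the example dataset
estimates = [
    (2007, "WC", 20741, 0.3),
    (2007, "CIP", 1350, 0.38),
]

# Assign each estimate to the areacell named after its division,
# instead of an arbitrary real areacell inside that division
rows = [
    {"year": year, "areacell": division, "count": count, "cv": cv}
    for year, division, count, cv in estimates
]

# Every row's areacell belongs to exactly the division it was reported for
in_division = all(r["areacell"] in divisions[r["areacell"]] for r in rows)
```

Compared with picking a cell at random, this keeps division-level estimates from looking like observations at a real location.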