flr/FLa4a

convert data.frame to FLStock

GeoTuxMan opened this issue · 24 comments

Hi,

I try to do a stock assement with sca() method.
After I convert my data.frame to FLStock object :
mystock<-as.FLStock(land_t)
the command:
summary(mystock)
give me only values of one (1).

My all code:
`#stock assesment

library(FLCore)
library(ggplotFL)
library(FLa4a)
library(xlsx)
library(plotly)

dat=read.xlsx("Sprot_FLR.xlsx", sheetIndex = 1)

dat<-data.frame(dat$slot,dat$age, dat$year, dat$data, dat$units)
dat2<-na.omit(dat)

select only landings in tone

land_t <- na.omit(subset(dat, dat2$dat.slot=="landings", select=-dat.slot))
colnames(land_t)<- c("age","year","data","units")
landst <- as.FLQuant(land_t)
summary(landst)
plot(landst)

plot1=data.frame(land_t)

#qplot(x=year, y=data, data=plot1, color=factor(age),geom=c("point","line"),ylab="Catch (t)", xlab="")
p1<-ggplot(plot1, aes(x=year, y=data, group = age)) +
geom_line(aes(colour=as.factor(age))) + ylab("Catch (t)") + xlab("") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p1)

select only landings in numbers

land_n <- na.omit(subset(dat, dat2$dat.slot=="landings.n", select=-dat.slot))
colnames(land_n)<- c("age","year","data","units")
landsn <- as.FLQuant(land_n)
summary(landsn)
plot(landsn)
plot2=data.frame(land_n)

#qplot(x=year, y=data, data=plot2, color=factor(age),geom=c("point","line"),ylab="Catch (1000^3)", xlab="")
p2<-ggplot(plot2, aes(x=year, y=data, group = age)) +
geom_line(aes(colour=as.factor(age))) + ylab("Catch (1000^3)") + xlab("") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p2)

convert dataframe to FLStock

mystock<-as.FLStock(land_t)
summary(mystock) # empty ?!

add an index of abundance : cpue

#the statistical catch-at-age model
#fit <- sca(mystock, cpue)

update FLStock object

#stk<-mystock+fit
#plot(stk)

plot fitted vs observed catch-at-age

#plot(fit, ple4)

`
Sprot_FLR.xlsx
I look forward to some tips ...
Cheers !

The table in the XLSX file has data labelled as "landings" and "landings.n". But landings in an FLStock refers to total landings in weight, and not by age. We usually have landings.n, in number, and landings.wt, the mean weight at age in the landings, usually in kg.

From simply the landings.n data, you could create an FLStock using

library(FLCore)
library(xlsx)

dat <- na.omit(read.xlsx("Sprot_FLR.xlsx", sheetIndex = 1, stringsAsFactors=FALSE))
stk <- as(dat[dat$slot == "landings.n",], "FLStock")

If you add other data to the same table, you can add it to the FLStock using the same code.

Maybe take a look at the example dataset for the class, data(ple4) to see what needs to go into each slot

Also, to use FLa4a::sca() you need to provide at least:

  • catch.n, usually a sum of landings.n and discards.n
  • catch.wt
  • m
  • mat
  • stock.wt
  • harvest.spwn and m.spwn

And one or more indices of abundance, as objects of class FLIndex or FLIndices

You should do

discards.n(stk) <- 0
discards.wt(stk) <- landings.wt(stk)
catch(stk) <- computeCatch(stk, "all")

or simply load the data into the catch.n slot. And then add all the other biological information.

Is this a biomass index? Or is it by age?

Generally, take a look at

http://www.flr-project.org/doc/Loading_your_data_into_FLR.html

for how to load data into FLR.

What are the dimensions of the result of as.matrix(indices)?

dim(as.matrix(indices))

should be c(5, 27). First dimension is age, second year.

With regards the data, a biomass index won't have enough information to fit sca(). The method assumes an index at age is available, covering at least some of those ages, like from a survey.

It all depends on what information you have. sca() does better when the index of abundance has information on the cohorts' changes in abundance, say from survey at age. If you only have a biomass index, like commercial CPUE, then

  • Pass to sca an object of class FLIndexBiomass, created using FLIndexBiomass(index=FLQuant()) where that FLQuant was created as above.
  • You might need to set the n1model, the initial abundances at age, as sca will have little information to estimate it.

The a4a model is built around a series of submodels, see http://www.flr-project.org/doc/Statistical_catch_at_age_models_in_FLa4a.html foir some explanation and examples.

A call top sca() builds a default model according to the dimensions of the data provided, but it is never fully appropriate. You will need to look at each of the submodels and constrcut those that are right for your data. For example, trhe default srmodel is a factor per year, but you could assume a given SRR, say bevholt.

We are releasing a new version of FLa4a this week, and I will compile a 32bit version. The current one is a bit outdated.

The message about the Hessian indicates that the model did not fit properly. So I assume the fit object has not updated the stock as required by the plot method. Take a look at the residuals and the results of the fit.

If the model has not fit properly, there will be no residuals. Look at the convergence row of fitSumm(fit) to see the convergence flag returned by ADMB. Was a log-likelihood value returned (logLik(fit))?

If it does not fit you will need to play with the model. Maybe try to make it simpler, for example using splines rather than factors.

That's it, the model is not fitting. This is really a modelling issue rather than a software one, but you need to think carefully about the submodels. The defaults tend to work when data series are long and fairly informative, like the ple4 example in the tutorials. For short series, or when the catch and CPUE data are in disagreement, there is work to be done to try to find a suitable model.

OK. The model should be able to deal with the CPUE being shorter, but it is usually at the beginning of the series. Having catch but no index between 2008 and 2014 is a problem.