convert data.frame to FLStock
GeoTuxMan opened this issue · 24 comments
Hi,
I try to do a stock assement with sca() method.
After I convert my data.frame to FLStock object :
mystock<-as.FLStock(land_t)
the command:
summary(mystock)
give me only values of one (1).
My all code:
`#stock assesment
library(FLCore)
library(ggplotFL)
library(FLa4a)
library(xlsx)
library(plotly)
dat=read.xlsx("Sprot_FLR.xlsx", sheetIndex = 1)
dat<-data.frame(dat$slot,dat$age, dat$year, dat$data, dat$units)
dat2<-na.omit(dat)
select only landings in tone
land_t <- na.omit(subset(dat, dat2$dat.slot=="landings", select=-dat.slot))
colnames(land_t)<- c("age","year","data","units")
landst <- as.FLQuant(land_t)
summary(landst)
plot(landst)
plot1=data.frame(land_t)
#qplot(x=year, y=data, data=plot1, color=factor(age),geom=c("point","line"),ylab="Catch (t)", xlab="")
p1<-ggplot(plot1, aes(x=year, y=data, group = age)) +
geom_line(aes(colour=as.factor(age))) + ylab("Catch (t)") + xlab("") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p1)
select only landings in numbers
land_n <- na.omit(subset(dat, dat2$dat.slot=="landings.n", select=-dat.slot))
colnames(land_n)<- c("age","year","data","units")
landsn <- as.FLQuant(land_n)
summary(landsn)
plot(landsn)
plot2=data.frame(land_n)
#qplot(x=year, y=data, data=plot2, color=factor(age),geom=c("point","line"),ylab="Catch (1000^3)", xlab="")
p2<-ggplot(plot2, aes(x=year, y=data, group = age)) +
geom_line(aes(colour=as.factor(age))) + ylab("Catch (1000^3)") + xlab("") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p2)
convert dataframe to FLStock
mystock<-as.FLStock(land_t)
summary(mystock) # empty ?!
add an index of abundance : cpue
#the statistical catch-at-age model
#fit <- sca(mystock, cpue)
update FLStock object
#stk<-mystock+fit
#plot(stk)
plot fitted vs observed catch-at-age
#plot(fit, ple4)
`
Sprot_FLR.xlsx
I look forward to some tips ...
Cheers !
The table in the XLSX file has data labelled as "landings" and "landings.n". But landings in an FLStock refers to total landings in weight, and not by age. We usually have landings.n, in number, and landings.wt, the mean weight at age in the landings, usually in kg.
From simply the landings.n data, you could create an FLStock using
library(FLCore)
library(xlsx)
dat <- na.omit(read.xlsx("Sprot_FLR.xlsx", sheetIndex = 1, stringsAsFactors=FALSE))
stk <- as(dat[dat$slot == "landings.n",], "FLStock")
If you add other data to the same table, you can add it to the FLStock using the same code.
Maybe take a look at the example dataset for the class, data(ple4)
to see what needs to go into each slot
Also, to use FLa4a::sca() you need to provide at least:
- catch.n, usually a sum of landings.n and discards.n
- catch.wt
- m
- mat
- stock.wt
- harvest.spwn and m.spwn
And one or more indices of abundance, as objects of class FLIndex or FLIndices
You should do
discards.n(stk) <- 0
discards.wt(stk) <- landings.wt(stk)
catch(stk) <- computeCatch(stk, "all")
or simply load the data into the catch.n slot. And then add all the other biological information.
Is this a biomass index? Or is it by age?
Generally, take a look at
http://www.flr-project.org/doc/Loading_your_data_into_FLR.html
for how to load data into FLR.
What are the dimensions of the result of as.matrix(indices)
?
dim(as.matrix(indices))
should be c(5, 27). First dimension is age, second year.
With regards the data, a biomass index won't have enough information to fit sca(). The method assumes an index at age is available, covering at least some of those ages, like from a survey.
It all depends on what information you have. sca() does better when the index of abundance has information on the cohorts' changes in abundance, say from survey at age. If you only have a biomass index, like commercial CPUE, then
- Pass to sca an object of class
FLIndexBiomass
, created usingFLIndexBiomass(index=FLQuant())
where thatFLQuant
was created as above. - You might need to set the n1model, the initial abundances at age, as sca will have little information to estimate it.
The a4a model is built around a series of submodels, see http://www.flr-project.org/doc/Statistical_catch_at_age_models_in_FLa4a.html foir some explanation and examples.
A call top sca() builds a default model according to the dimensions of the data provided, but it is never fully appropriate. You will need to look at each of the submodels and constrcut those that are right for your data. For example, trhe default srmodel is a factor per year, but you could assume a given SRR, say bevholt.
We are releasing a new version of FLa4a this week, and I will compile a 32bit version. The current one is a bit outdated.
The message about the Hessian indicates that the model did not fit properly. So I assume the fit object has not updated the stock as required by the plot method. Take a look at the residuals and the results of the fit.
If the model has not fit properly, there will be no residuals. Look at the convergence row of fitSumm(fit)
to see the convergence flag returned by ADMB. Was a log-likelihood value returned (logLik(fit)
)?
If it does not fit you will need to play with the model. Maybe try to make it simpler, for example using splines rather than factors.
That's it, the model is not fitting. This is really a modelling issue rather than a software one, but you need to think carefully about the submodels. The defaults tend to work when data series are long and fairly informative, like the ple4 example in the tutorials. For short series, or when the catch and CPUE data are in disagreement, there is work to be done to try to find a suitable model.
OK. The model should be able to deal with the CPUE being shorter, but it is usually at the beginning of the series. Having catch but no index between 2008 and 2014 is a problem.