Detection of diversity of coastal avian and mammalian fauna

Oleksii Dubovyk, Ella DiPetto, Chi Wei, Iroshmal Peiris, Eric L. Walters

Data and analyses for the VAS 2024 annual meeting presentation (and subsequent pubs, hopefully) on observed avian and mammalian diversity of shorelines in coastal Virginia. Data provided by Ella DiPetto.

Data source

Wildlife observations

The data were collected during the dissertation research of Ella DiPetto.

detections.csv

The main dataset containing the wildlife observations. Available on request from Ella (gdipe001@odu.edu).

deployments.csv

Supplementary dataset on duration of each deployment of trail cameras.

Functional traits

BirdFuncDat.txt and MamFuncDat.txt

The dataset used was EltonTraits 1.0¹.

Tides

tides.csv

The data on tides from the NOAA.

main.R

The main script; all analyses are here.

suntime.R

A function to find a local time of sunrise and sunset for a given date and coordinates. Based on the USGS calculator.

Arguments

date - chr, date of interest in "yyyy-mm-dd" format
lat - num, decimal latitude
lon - num, decimal longitude
utc_offset - num, time zone offset relative to the UTC: e.g., EDT is -4, EST is -5, PST is -8.

Usage

To calculate the sunrise and sunset time for Norfolk on Apr 2nd 2024, we call

suntime(date = "2024-04-02", lat = 36.8794, lon = -76.2892, utc_offset = -4)
## [1] "06:48:22" "19:28:38"
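Because utc_offset differs between standard and daylight saving time, it can be convenient to derive it from the date rather than hard-code it. The following is a minimal sketch, not part of the repository: the helper name utc_offset_for is hypothetical, and it assumes that the IANA time zone name for the site is known (America/New_York for Norfolk) and that the system time zone database is available to R.

```r
# Hypothetical helper (not in suntime.R): UTC offset in hours for a local date.
utc_offset_for <- function(date, tz) {
  # Use local noon so the result reflects the offset in force during that day
  noon <- as.POSIXct(paste(date, "12:00:00"), tz = tz)
  z <- format(noon, "%z")                      # e.g. "-0400"
  sgn <- if (substr(z, 1, 1) == "-") -1 else 1
  hours <- as.integer(substr(z, 2, 3))
  mins <- as.integer(substr(z, 4, 5))
  sgn * (hours + mins / 60)
}

off_spring <- utc_offset_for("2024-04-02", "America/New_York")  # EDT
off_winter <- utc_offset_for("2024-01-15", "America/New_York")  # EST

# With suntime.R sourced, the result could be passed on, e.g.:
# suntime(date = "2024-04-02", lat = 36.8794, lon = -76.2892,
#         utc_offset = off_spring)
```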

bigrarefaction.R

A group of functions to build rarefaction curves through interpolation or extrapolation procedures. interpolation(..., mode = "l") is built to handle large numbers.

It is assumed that for a community with $S$ species and $N$ individuals, such that there are $N_i$ individuals of species $i$ and, therefore, $\sum \limits_{i=1}^{S} N_i = N$, when $n$ individuals are drawn, the interpolated species richness ($S(x)$ represents the species richness observed when $x$ individuals are drawn) can be estimated as:

$S(n) = S(N) - {\binom{N}{n}}^{-1} \times \sum \limits_{i=1}^{S(N)} \binom{N-N_i}{n}$

and extrapolated values are estimated through the Chao1 estimator², $\hat{f_0} = f_1^2 / (2 f_2)$, where $f_x$ represents the number of species for which $N_i = x$,

$S(N+m) = S(N) + \hat{f_0}\left[ 1 - \left( 1 - \frac{f_1}{N \hat{f_0} + f_1} \right)^m \right]$.
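The two formulas above can be sketched in a few lines of R. This is an illustration only, not the code in bigrarefaction.R: the function names interp_richness and extrap_richness and the example abundance vector are hypothetical. The interpolation works on the log scale through lchoose(), which is one way to avoid overflow of the binomial coefficients for large $N$; whether mode = "l" does the same is an assumption.

```r
# Hypothetical sketch of the interpolation formula S(n)
interp_richness <- function(abund, n) {
  abund <- abund[abund > 0]
  N <- sum(abund)
  stopifnot(n >= 0, n <= N)
  # log of choose(N - N_i, n) / choose(N, n); -Inf when N - N_i < n
  log_ratio <- lchoose(N - abund, n) - lchoose(N, n)
  length(abund) - sum(exp(log_ratio))
}

# Hypothetical sketch of the extrapolation formula S(N + m)
extrap_richness <- function(abund, m) {
  abund <- abund[abund > 0]
  N <- sum(abund)
  S <- length(abund)
  f1 <- sum(abund == 1)          # singletons
  f2 <- sum(abund == 2)          # doubletons
  stopifnot(f2 > 0)              # the formula as written needs doubletons
  if (f1 == 0) return(S)         # no singletons: no unseen species estimated
  f0 <- f1^2 / (2 * f2)
  S + f0 * (1 - (1 - f1 / (N * f0 + f1))^m)
}

abund <- c(10, 5, 2, 2, 1, 1, 1, 1)  # made-up community, N = 23, S = 8
s_half <- interp_richness(abund, 12)
s_full <- interp_richness(abund, sum(abund))
s_more <- extrap_richness(abund, 50)
```

In this made-up community $f_1 = 4$ and $f_2 = 2$, so $\hat{f_0} = 4$ and the extrapolated curve levels off at $8 + 4 = 12$ species.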

probrar.R

All questions regarding this section should be addressed to Oleksii, oadubovyk@gmail.com

Desperate attempts to move away from the singleton/doubleton-based Chao² approximations of extrapolated rarefaction curves. Use at your own risk: the approach has not been peer reviewed and mostly relies on thoughts and prayers.

The notation is the following: N denotes a vector of values representing abundances of different species within a community.

prob_same(N, m) estimates the expected probability that the mth individual drawn from community N represents a previously unobserved species, assuming sampling with replacement (therefore, inaccurate). If we let $S$ be the observed species richness, $J$ the overall number of individuals, and $p_i = \frac{N_i}{J}$ the proportions of species within a community, then the probability that the $m$th individual drawn represents a new species is

$P(m) = \sum \limits_{i = 1}^{S} \prod \limits_{k = 1}^m \frac{N_i - p_i (k-1)}{J - (k-1)}$

Again, this estimation is inaccurate.

probs_roll(N) estimates the expected probabilities when drawing a sequence of individuals from a community by simply applying prob_same(N, m) to the sequence of individuals $m = \{1, 2, 3, \dotsb, J-2, J-1, J\}$.

probs_combin(n, k = 1) uses combinatorics to estimate said probabilities. I honestly don't remember how I came up with that four years ago; I might have looked it up in some of the Chao papers. All I remember is that it is somehow related to the hypergeometric distribution. If the $m$th individual is drawn from a community, then the probability that this individual represents a new species not previously observed among individuals $\{1, \dotsb, m-1\}$ is (here $N$ without a subscript denotes the total number of individuals, i.e., $J$ above)

$\sum \limits_{i = 1}^{S} \left[ \frac{\binom{N - N_i}{m-1}}{\binom{N}{m-1}} - \frac{\binom{N - N_i}{m}}{\binom{N}{m}} \right]$

The k parameter specifies the step size in number of individuals: the formula can be extended to estimate the probability of getting a new species when $k$ individuals are drawn at step $m$.
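For k = 1, the combinatorial expression above is the difference between the interpolated richness at $m$ and at $m-1$ individuals, so the probabilities summed over $m = 1, \dotsb, N$ should recover the observed richness $S$. A minimal sketch of that check follows; the function name p_new and the abundance vector are hypothetical, and this is not the implementation in probrar.R.

```r
# Hypothetical sketch: probability that the m-th individual drawn
# (without replacement) belongs to a species not seen among the first m - 1.
p_new <- function(abund, m) {
  abund <- abund[abund > 0]
  N <- sum(abund)
  # Expected number of species still unobserved after n draws
  unseen <- function(n) sum(exp(lchoose(N - abund, n) - lchoose(N, n)))
  vapply(m, function(mm) unseen(mm - 1) - unseen(mm), numeric(1))
}

abund <- c(5, 3, 2)                  # made-up community, N = 10, S = 3
probs <- p_new(abund, seq_len(sum(abund)))
total <- sum(probs)                  # should equal the observed richness
```

The first draw always yields a new species, so probs[1] is 1, and the values decrease as more individuals are drawn.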

beyond_combin(N, ceil, burnin) tries to estimate how the sequence of probs_combin(N) would behave once $m$ goes past $N$ and $\binom{N}{m-1}$ becomes zero, which makes the ratio undefined. It is a very brute-force approach where I try to predict how the probabilities decrease at each step of the sequence of probs_roll(N) (ignoring the first burnin values and continuing all the way until ceil). We can then use beyond_combin(N, ceil, burnin) %>% sum() to estimate the species richness including the unobserved part without relying on the singleton-based Chao1 estimator (hint: the absence of singletons still creates problems since the last values of probs_combin() are zeros; zeros are ignored in beyond_combin()).

aictoolbox.R

Useful functions to compare models within the AIC framework.

metaAIC(model) extracts some AIC information from a model and returns a named vector: formula - model formula, AIC - AIC value, logLik - log-likelihood, df - degrees of freedom.

rankAIC(list(model1, model2, ...)) returns a list of input models ranked by $\Delta$AIC with the following columns: model - model formula, AIC - AIC value, delta_AIC - $\Delta$AIC, weight - AIC weight, logLik - log-likelihood, df - degrees of freedom.
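The quantities listed above can be reproduced with base R alone. The sketch below is an illustration under assumptions, not the code in aictoolbox.R: the function name rank_aic_sketch is hypothetical, the models are fitted to the built-in mtcars data purely as an example, and the AIC weight is computed with the usual formula $w_i = e^{-\Delta_i / 2} / \sum_j e^{-\Delta_j / 2}$.

```r
# Hypothetical sketch of an AIC ranking table (not the repository's rankAIC)
rank_aic_sketch <- function(models) {
  out <- data.frame(
    model  = vapply(models, function(m) paste(deparse(formula(m)), collapse = " "),
                    character(1)),
    AIC    = vapply(models, function(m) AIC(m), numeric(1)),
    logLik = vapply(models, function(m) as.numeric(logLik(m)), numeric(1)),
    df     = vapply(models, function(m) as.numeric(attr(logLik(m), "df")), numeric(1))
  )
  out$delta_AIC <- out$AIC - min(out$AIC)      # difference from the best model
  rel_lik <- exp(-0.5 * out$delta_AIC)         # relative likelihoods
  out$weight <- rel_lik / sum(rel_lik)         # AIC weights sum to 1
  out <- out[order(out$delta_AIC),
             c("model", "AIC", "delta_AIC", "weight", "logLik", "df")]
  rownames(out) <- NULL
  out
}

# Example models on a built-in dataset (illustration only)
fits <- list(
  lm(mpg ~ 1, data = mtcars),
  lm(mpg ~ wt, data = mtcars),
  lm(mpg ~ wt + hp, data = mtcars)
)
tab <- rank_aic_sketch(fits)
```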

Prerequisites

R stuff

The latest R version
Posit/RStudio
Install tidyverse, lubridate, data.table, caret:

packages <- c("tidyverse", "lubridate", "data.table", "caret")
# Install only the packages that are not installed yet
installed_packages <- packages %in% rownames(installed.packages())
if (any(!installed_packages)) {
  install.packages(packages[!installed_packages])
}
# Attach all of them
invisible(lapply(packages, library, character.only = TRUE))

Git Bash (if you only want to get the newest code)

Install Git Bash
Open the folder you want our repository to be copied to, e.g.,
```
cd /c/Users/username/CoastalDiversity
```

Let Git know who you are

git config --global user.name "Your Name"
git config --global user.email "youremail@domain.com"

Type

git clone https://github.com/OleksiiDubovyk/CoastalDiversity

Whenever you want to get the newest code, type
```
git pull
```

Set up your GitHub account (if you plan to contribute to coding)

Create a GitHub account
Install GitHub CLI
- Open Windows PowerShell and run
```
winget install --id GitHub.cli
```
- Restart Git Bash, navigate to the working directory, and run
```
gh auth login
```
- Follow the prompts: GitHub.com -> HTTPS -> Y -> Login with a web browser

Open Git Bash and run (replacing {TOKEN} with your GitHub personal access token)

git remote set-url origin https://{TOKEN}@github.com/OleksiiDubovyk/CoastalDiversity.git/

Whenever you want to edit the code, type

git pull
git add filename.extension # specify the file you have just changed
git commit -m "Your comments on what you've added"
git push

References

1. Wilman, H., J. Belmaker, J. Simpson, C. de la Rosa, M. M. Rivadeneira, and W. Jetz. 2014. EltonTraits 1.0: species-level foraging attributes of the world’s birds and mammals. Ecology 95:2027–2027. https://doi.org/10.1890/13-1917.1
2. Chao, A., N. J. Gotelli, T. C. Hsieh, E. L. Sander, K. H. Ma, R. K. Colwell, and A. M. Ellison. 2014. Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecological Monographs 84:45–67. https://doi.org/10.1890/13-0133.1
