The goal of smarterapi is to collect data from the SMARTER REST API and provide it to the user as a data frame. More information is available in the online vignette.
The smarterapi package is only available from GitHub. It can be installed as a source package or, alternatively, with the devtools package, for example:
# install devtools if needed
# install.packages("devtools")
devtools::install_github("cnr-ibba/r-smarter-api")
After the installation, you can load the package in your R session:
# import this library to deal with the SMARTER API
library(smarterapi)
Since the public release of SMARTER data, credentials are no longer required to access it. If you previously used credentials to access the data, you need to install the latest version of the smarterapi package.
smarterapi provides a set of functions that fetch data from the SMARTER-backend endpoints described in the api documentation and return them as a data.frame object. For example, get_smarter_datasets queries the Datasets endpoint, while get_smarter_samples queries the Sample endpoints and parses their response. Each smarterapi function is documented in R, and you can get help on it as you would for any other R function.
Two kinds of parameters are used to fetch data from the SMARTER-backend: the species parameter, which can be Goat or Sheep and selects the goat or sheep Samples or Variants endpoints, and the query parameter, which can be provided to any get_smarter_ function. The species parameter is mandatory when querying an endpoint specific to Goat or Sheep, while the query parameter is optional and needs to be specified as a list() object in order to limit your query to a subset of the data. For example, if you need all the foreground genotypes datasets, you can collect data like this:
datasets <- get_smarter_datasets(
  query = list(type = "genotypes", type = "foreground")
)
If you need only the background goat samples, you can do it like this:
goat_samples <- get_smarter_samples(
  species = "Goat",
  query = list(type = "background")
)
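The datasets call above relies on the fact that an R list may hold several elements with the same name, which is how a filter such as type is given more than one value. The base R sketch below shows that such a list keeps every entry, and how repeated names could be rendered as repeated URL parameters. The to_query_string helper is hypothetical and written only for illustration: it is not part of smarterapi and may not match how the package builds its requests.

```r
query <- list(type = "genotypes", type = "foreground")

# both entries are kept, in the order they were given
length(query)
names(query)

# `$` only returns the first match, so select by name to see every value
query$type
unlist(query[names(query) == "type"], use.names = FALSE)

# hypothetical helper (not part of smarterapi): render a named list
# as a URL query string, repeating the key once per value
to_query_string <- function(params) {
  values <- vapply(
    params,
    function(value) utils::URLencode(as.character(value), reserved = TRUE),
    character(1)
  )
  paste0(names(params), "=", values, collapse = "&")
}

query_string <- to_query_string(query)
query_string
```

With the list above, the helper yields type=genotypes&type=foreground, so both values reach the endpoint as separate parameters.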
The full list of options available for each endpoint is given in the SMARTER-backend swagger documentation: the option name to use is the same as the name listed under parameters, and the description and parameter type give hints on how to use each endpoint properly. For instance, parameters described as array of objects can be specified multiple times:
> goat_breeds <- get_smarter_breeds(query = list(species = "Goat", search = "land"))
> goat_breeds[c("name", "code")]
            name code
1      Rangeland  RAN
2       Landrace  LNR
3         Landin  LND
4 Icelandic goat  ICL
> goat_samples <- get_smarter_samples(
    species = "Goat",
    query = list(
      breed_code = "RAN",
      breed_code = "LNR",
      breed_code = "LND",
      breed_code = "ICL"
    )
)
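When the list of breed codes is long, or comes from a previous query, typing each breed_code entry by hand becomes tedious. A minimal base R sketch, assuming the codes are already available as a character vector (here copied from the output above), builds the same repeated-name list programmatically:

```r
# breed codes taken from the goat_breeds output shown above
codes <- c("RAN", "LNR", "LND", "ICL")

# one list element per code, all named "breed_code"
query <- setNames(as.list(codes), rep("breed_code", length(codes)))
str(query)

# the list can then be passed as the query argument, for example:
# goat_samples <- get_smarter_samples(species = "Goat", query = query)
```

In a live session the codes vector could be taken from the code column of goat_breeds instead of being written out.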
This is a basic example which shows how to collect data from the SMARTER REST API for the Italian goats belonging to the Adaptmap dataset:
# find the dataset by passing part of its name to the search option;
# since the query is restricted to background genotypes, only one dataset
# will be returned
datasets <- get_smarter_datasets(
  query = list(
    species = "Goat",
    search = "adaptmap",
    type = "genotypes",
    type = "background"
  )
)
# get the dataset id as a character value
adaptmap_id <- datasets[["_id.$oid"]][1]
# collect all the Italian goats from the adaptmap dataset: use the dataset id
# to select the samples belonging to this dataset, and the country option
# to keep only the Italian samples
adaptmap_goats <- get_smarter_samples(
  species = "Goat",
  query = list(
    dataset = adaptmap_id,
    country = "Italy"
  )
)
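A note on extracting a single value such as the dataset id: in base R, single-bracket indexing on a data.frame returns another data.frame, while double-bracket indexing returns the column as a plain vector. The sketch below illustrates the difference on a toy data frame. The id and name values are made up for illustration and do not come from the SMARTER API.

```r
# toy stand-in for the data.frame returned by get_smarter_datasets();
# the values below are hypothetical
toy_datasets <- data.frame(
  `_id.$oid` = "0123456789abcdef",
  name = "toy dataset",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# single brackets keep the data.frame structure
as_frame <- toy_datasets["_id.$oid"][1]
# double brackets extract the column as a character vector
as_value <- toy_datasets[["_id.$oid"]][1]

class(as_frame)
class(as_value)
```

A plain character value is usually the safer thing to place inside a query list, since it does not depend on how a one-cell data.frame gets converted to text.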
The vignettes contain further examples, such as how to collect data from the Variants endpoints or how to work with geographic coordinates.