R code to query SMARTER-backend

Primary language: R. License: GNU General Public License v3.0 (GPL-3.0)

smarterapi

The goal of smarterapi is to collect data from the SMARTER REST API and return it to the user as a data frame. More information is available in the online vignette.

Installation

The smarterapi package is available only from GitHub. It can be installed as a source package or, alternatively, with the devtools package, for example:

# install devtools if needed
# install.packages("devtools")
devtools::install_github("cnr-ibba/r-smarter-api")

After the installation, you can load the package in your R session:

# load the package to work with the SMARTER API
library(smarterapi)

SMARTER credentials

Since the public release of SMARTER data, credentials are no longer needed to access the data. If you previously used credentials, install the latest version of the smarterapi package.

Querying the SMARTER backend

smarterapi provides a set of functions that fetch data from the SMARTER-backend endpoints described in the API documentation and return them as a data.frame object. For example, get_smarter_datasets queries the Datasets endpoint, while get_smarter_samples queries the Samples endpoints and parses their response. Each smarterapi function is documented in R, and you can get help on it as for any other R function.

Two kinds of parameters control what is fetched from the SMARTER-backend: the species parameter, which can be Goat or Sheep and selects the goat or sheep Samples or Variants endpoints, and the query parameter, which can be passed to any get_smarter_ function. The species parameter is mandatory when querying an endpoint specific to Goat or Sheep, while the query parameter is optional and must be given as a list() object that restricts the query to the data you are interested in. For example, if you need all the foreground genotypes datasets, you can collect them like this:

datasets <- get_smarter_datasets(
  query = list(type = "genotypes", type = "foreground"))

If you need only the background goat samples, you can do this instead:

goat_samples <- get_smarter_samples(
  species = "Goat", query = list(type = "background"))
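The first query above repeats the type name, which works because an R list may hold several elements with the same name. The following base R sketch does not call smarterapi; the query string it builds is only an assumption about how such a list is usually sent to a REST endpoint, not a description of the package internals:

```r
# A list can hold several elements that share the same name
query <- list(type = "genotypes", type = "foreground")

# Both values are kept, in the order given
print(names(query))
print(unlist(query, use.names = FALSE))

# Assumption: repeated names end up as repeated URL parameters
query_string <- paste0(names(query), "=", unlist(query), collapse = "&")
print(query_string)
```

Note that query$type returns only the first matching element, so use positional access (for example query[[2]]) to inspect every value.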

The full list of options for each endpoint is available in the SMARTER-backend Swagger documentation: the option name to use is the name listed under parameters, and the description and parameter type give hints on how to use the endpoint properly. For instance, parameters described as array of objects can be specified multiple times:

> goat_breeds <- get_smarter_breeds(query = list(species="Goat", search="land"))
> goat_breeds[c("name","code")]
            name code
1      Rangeland  RAN
2       Landrace  LNR
3         Landin  LND
4 Icelandic goat  ICL

> goat_samples <- get_smarter_samples(
    species = "Goat", 
    query = list(
      breed_code="RAN", 
      breed_code="LNR", 
      breed_code="LND", 
      breed_code="ICL"
    )
  )
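Typing each breed code by hand is error-prone when the search returns many breeds. As a minimal sketch, the same repeated-name list can be built from the code column with base R. The goat_breeds data frame below is a hand-written mock of the output printed above, so the snippet runs without network access; the final call is left commented out for the same reason:

```r
# Mock of the breeds table printed above (only the two columns shown there)
goat_breeds <- data.frame(
  name = c("Rangeland", "Landrace", "Landin", "Icelandic goat"),
  code = c("RAN", "LNR", "LND", "ICL"),
  stringsAsFactors = FALSE
)

# One list element per breed code, every element named "breed_code"
query <- setNames(
  as.list(goat_breeds$code),
  rep("breed_code", nrow(goat_breeds))
)
str(query)

# The list has the same structure as the hand-written one, so it could be
# passed on unchanged (not run here, it needs the package and network access):
# goat_samples <- get_smarter_samples(species = "Goat", query = query)
```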

Examples

This basic example shows how to collect data from the SMARTER REST API for the Italian goats belonging to the Adaptmap dataset:

# collect the dataset by providing part of its name with the search option:
# since we restrict the query to background genotypes, only one dataset
# will be returned
datasets <- get_smarter_datasets(
  query = list(
    species = "Goat", 
    search = "adaptmap", 
    type = "genotypes", 
    type = "background"
    )
)

# get the dataset id
adaptmap_id <- datasets[["_id.$oid"]][1]

# collect all the Italian goats from the adaptmap dataset: use the dataset id
# to select only the samples belonging to this dataset and the country option
# to keep only the Italian samples
adaptmap_goats <- get_smarter_samples(
  species = "Goat",
  query = list(
    dataset = adaptmap_id,
    country = "Italy"
  )
)
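When pulling a single value such as the dataset id out of a data frame, keep in mind that single-bracket indexing returns a data frame, while double brackets return the column as a plain vector. The sketch below uses a mock one-row result; the id value is made up for illustration and only the column name comes from the example above:

```r
# Mock result with a single row; the id value is hypothetical
datasets <- data.frame(
  `_id.$oid` = "0123456789abcdef",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Make sure the search matched exactly one dataset before using its id
stopifnot(nrow(datasets) == 1)

id_df <- datasets["_id.$oid"][1]     # still a one-column data frame
id_chr <- datasets[["_id.$oid"]][1]  # a character scalar

print(class(id_df))   # "data.frame"
print(class(id_chr))  # "character"
```

Checking the number of rows first guards against a search term that matches more than one dataset.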

More examples are available in the vignettes, for instance how to collect data from the Variants endpoints or how to work with geographic coordinates.