R Rules for Bazel

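Rules: r_pkg, r_library, r_unit_test, r_pkg_test, r_binary, r_test

Workspace Rules: r_repository, r_repository_list

Convenience Macros: r_package, r_package_with_test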

Overview

These rules are used for building R packages with Bazel. Although R has an excellent package management system, there is no continuous build and integration system for entire R package repositories. An advantage of using Bazel, over a custom solution of tracking the package dependency graph and triggering builds accordingly on each commit, is that R packages can be built and tested as part of one build system in multi-language monorepos.

Getting started

The following assumes that you are familiar with how to use Bazel in general.

To use the rules, you need Bazel 0.10.0 or later. Add the following to your WORKSPACE file:

# Change master to the git tag you want.
http_archive(
    name = "com_grail_rules_r",
    strip_prefix = "rules_r-master",
    urls = ["https://github.com/grailbio/rules_r/archive/master.tar.gz"],
)

load("@com_grail_rules_r//R:dependencies.bzl", "r_rules_dependencies")

r_rules_dependencies()

You can load the rules in your BUILD file like so:

load("@com_grail_rules_r//R:defs.bzl",
     "r_pkg", "r_library", "r_unit_test", "r_pkg_test")

There are also convenience macros available which create multiple implicit targets:

load("@com_grail_rules_r//R:defs.bzl", "r_package")

and

load("@com_grail_rules_r//R:defs.bzl", "r_package_with_test")

Configuration

These rules assume that you have R installed on your system (we recommend 3.4.3 or above), and that it can be located using the PATH environment variable.

For each package, you can also specify a different Makevars file that can be used to have finer control over native code compilation. For macOS, the Makevars file used as default helps find gfortran. To change the defaults for your repository, you can provide arguments makevars_darwin and/or makevars_linux to r_rules_dependencies.
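As a rough sketch of changing the repository-wide defaults, the call in your WORKSPACE file could pass both arguments. The labels below are hypothetical, and the sketch assumes the arguments accept labels of Makevars files; substitute files that exist in your own repository.

```starlark
load("@com_grail_rules_r//R:dependencies.bzl", "r_rules_dependencies")

# Hypothetical labels: point these at Makevars files in your own repository.
r_rules_dependencies(
    makevars_darwin = "@myrepo//third-party/R:Makevars.darwin",
    makevars_linux = "@myrepo//third-party/R:Makevars.linux",
)
```

For a single package, the makevars_user attribute of r_pkg serves the same purpose.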

For macOS, this setup will help you cover the requirements for a large number of packages:

brew install gcc pkg-config icu4c openssl

For Ubuntu, this (or equivalent for other Unix systems) helps:

apt-get install pkgconf libssl-dev libxml2-dev libcurl4-openssl-dev

Note

To avoid interference from other packages during the build (for example, other versions installed manually by the user), we recommend installing all packages other than those with recommended priority in the directory pointed to by R_LIBS_USER. The Bazel build process can then hide all the other packages from R by setting a different value for R_LIBS_USER.

When moving to Bazel for installing R packages on your system, we recommend cleaning up existing machines:

Rscript \
  -e 'options("repos"="https://cloud.r-project.org")' \
  -e 'non_base_pkgs <- installed.packages(priority=c("recommended", "NA"))[, "Package"]' \
  -e 'remove.packages(non_base_pkgs, lib=.Library)'

# If not set up already, create the directory for R_LIBS_USER.
Rscript \
  -e 'dir.create(Sys.getenv("R_LIBS_USER"), recursive=TRUE, showWarnings=FALSE)'

For more details on how R searches different paths for packages, see libPaths.

External packages

To depend on external packages from CRAN and other remote repos, you can define the packages in a CSV file with three columns: Package, Version, and sha256. Then use the r_repository_list rule to define R repositories for each package. For packages not in a CRAN-like repo (e.g. GitHub), you can use the r_repository rule directly. For packages on your local system but outside your main repository, you will have to use local_repository with a saved BUILD file. The same applies to VCS repositories.

load("@com_grail_rules_r//R:repositories.bzl", "r_repository", "r_repository_list")

# R packages with non-standard sources.
r_repository(
    name = "R_plotly",
    sha256 = "24c848fa2cbb6aed6a59fa94f8c9b917de5b777d14919268e88bff6c4562ed29",
    strip_prefix = "plotly-a60510e4bbce5c6bed34ef6439d7a48cb54cad0a",
    urls = [
        "https://github.com/ropensci/plotly/archive/a60510e4bbce5c6bed34ef6439d7a48cb54cad0a.tar.gz",
    ],
)

# R packages with standard sources.
r_repository_list(
    name = "r_repositories_bzl",
    build_file_overrides = "@myrepo//third-party/R:build_file_overrides.csv",
    package_list = "@myrepo//third-party/R:packages.csv",
    remote_repos = {
        "BioCsoft": "https://bioconductor.org/packages/3.6/bioc",
        "BioCann": "https://bioconductor.org/packages/3.6/data/annotation",
        "BioCexp": "https://bioconductor.org/packages/3.6/data/experiment",
        "CRAN": "https://cloud.r-project.org",
    },
)

load("@r_repositories_bzl//:r_repositories.bzl", "r_repositories")

r_repositories()

The list of all external R packages configured this way can be obtained from your shell with

$ bazel query 'filter(":R_", //external:*)'

NOTE: Periods ('.') in the package names are replaced with underscores ('_') because Bazel does not allow periods in repository names.
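For illustration, a package such as data.table would end up under the repository name R_data_table. A dependency on it might then look like the sketch below. The target name mypkg and the short label form are assumptions, so confirm the exact label with the query above before relying on it.

```starlark
load("@com_grail_rules_r//R:defs.bzl", "r_pkg")

r_pkg(
    name = "mypkg",  # hypothetical local package
    srcs = glob(["**"], exclude = ["BUILD"]),
    # The period in data.table is replaced with an underscore in the repository name.
    # Assumption: the generated BUILD file exposes a default target, so the
    # short label resolves; otherwise spell out the full target label.
    deps = ["@R_data_table"],
)
```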

Examples

Some examples are available in the tests directory of this repo.

Also see Razel scripts that provide utility functions to generate BUILD files and WORKSPACE rules for external packages.

Docker

See container support.

r_pkg

r_pkg(srcs, pkg_name, deps, cc_deps, build_args, install_args, config_override, roclets,
      roclets_deps, makevars_user, env_vars, tools, build_tools)

Rule to install the package and its transitive dependencies in the Bazel sandbox, so it can be depended upon by other package builds.

Implicit output targets
name.bin.tar.gz: Binary archive of the package.
name.tar.gz: Source archive of the package.
Attributes
srcs

List of files; required

Source files to be included for building the package.

pkg_name

String; optional

Name of the package if different from the target name.

deps

List of labels; optional

R package dependencies of type `r_pkg`.

cc_deps

List of labels; optional

cc_library dependencies for this package.

build_args

List of strings; default ["--no-build-vignettes", "--no-manual"]

Additional arguments to supply to R CMD build.

install_args

List of strings; optional

Additional arguments to supply to R CMD INSTALL.

config_override

File; optional

Replace the package configure script with this file.

roclets

List of strings; optional

Roclets to run before installing the package. If this is non-empty, you must also set roclets_deps to the R package you want to use for running roclets. At run time, the rule uses `devtools::document` if devtools is available; failing that, it uses `roxygen2::roxygenize` if roxygen2 is available.

roclets_deps

List of labels; optional

roxygen2 or devtools dependency for running roclets.

makevars_user

File; default @com_grail_rules_r_makevars//:Makevars

User level Makevars file.

env_vars

Dictionary; optional

Extra environment variables to define for building the package.

tools

List of labels; optional

Executables that code in this package will try to find in the system.

build_tools

List of labels; optional

Executables that native code compilation will try to find in the system.
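A minimal sketch of a BUILD file that puts several of these attributes together is shown below. All target names, labels, and the environment variable are hypothetical; only the attribute names come from the list above.

```starlark
load("@com_grail_rules_r//R:defs.bzl", "r_pkg")

r_pkg(
    name = "mypkg",
    # Every file in the package directory except the BUILD file itself.
    srcs = glob(["**"], exclude = ["BUILD"]),
    # Other r_pkg targets this package imports (hypothetical label).
    deps = ["//packages/utilspkg"],
    # A cc_library that the package's native code links against (hypothetical label).
    cc_deps = ["//native:fastmath"],
    # The documented default, written out explicitly.
    build_args = ["--no-build-vignettes", "--no-manual"],
    # Hypothetical variable made visible while the package is built.
    env_vars = {"MYPKG_BUILD_MODE": "bazel"},
)
```

Building this target should also make the implicit outputs mypkg.bin.tar.gz and mypkg.tar.gz available.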

r_library

r_library(pkgs, library_path)

Executable rule to install the given packages and all dependencies to a user provided or system default R library. Run the target with --help for usage information.

This rule used to provide a tar archive of the library as an implicit output. That feature is now its own rule, r_library_tar. See the documentation for the r_library_tar rule and the example usage for the container_image rule.

Attributes
pkgs

List of labels; required

Package (and dependencies) to install.

library_path

String; optional

Default library location for installation, if different from the system default. For runtime overrides, use bazel run [target] -- -l [path].
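As a sketch, an r_library target that installs one package and its dependencies could be declared as follows. The target names and the path in the comment are hypothetical.

```starlark
load("@com_grail_rules_r//R:defs.bzl", "r_library")

r_library(
    name = "library",
    # The package to install; its dependencies are installed as well.
    pkgs = ["//packages/mypkg"],
)

# Install into the default library location:
#   bazel run //packages/mypkg:library
# Override the location at run time (hypothetical path):
#   bazel run //packages/mypkg:library -- -l /path/to/library
```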

r_unit_test

r_unit_test(pkg, suggested_deps, env_vars, tools)

Rule to keep all deps in the sandbox, and run the provided R test scripts.

Attributes
pkg

Label; required

R package (of type r_pkg) to test.

suggested_deps

List of labels; optional

R package dependencies of type `r_pkg`.

env_vars

Dictionary; optional

Extra environment variables to define before running the test.

tools

List of labels; optional

Executables to be made available to the test.
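A possible r_unit_test declaration is sketched below. It assumes an r_pkg target named mypkg in the same BUILD file and an external repository for testthat set up as described under External packages; both names and the environment variable are illustrative.

```starlark
load("@com_grail_rules_r//R:defs.bzl", "r_unit_test")

r_unit_test(
    name = "mypkg_unit_test",
    # The r_pkg target whose tests should run (hypothetical).
    pkg = ":mypkg",
    # Packages needed only by the tests (assumed external repository).
    suggested_deps = ["@R_testthat"],
    # Hypothetical variable read by the test scripts.
    env_vars = {"MYPKG_TEST_MODE": "unit"},
)
```

The test can then be run with bazel test like any other test target.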

r_pkg_test

r_pkg_test(pkg, suggested_deps, check_args, env_vars, tools)

Rule to keep all deps of the package in the sandbox, build a source archive of this package, and run R CMD check on the package source archive in the sandbox.

Attributes
pkg

Label; required

R package (of type r_pkg) to test.

suggested_deps

List of labels; optional

R package dependencies of type `r_pkg`.

check_args

List of strings; default ["--no-build-vignettes", "--no-manual"]

Additional arguments to supply to R CMD check.

env_vars

Dictionary; optional

Extra environment variables to define before running the test.

tools

List of labels; optional

Executables to be made available to the test.
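A corresponding r_pkg_test target might look like the sketch below. The target names and the testthat repository are assumptions; check_args repeats the documented default to show where custom R CMD check arguments would go.

```starlark
load("@com_grail_rules_r//R:defs.bzl", "r_pkg_test")

r_pkg_test(
    name = "mypkg_check",
    # The r_pkg target to run R CMD check on (hypothetical).
    pkg = ":mypkg",
    # Packages listed under Suggests that the check needs (assumed external repository).
    suggested_deps = ["@R_testthat"],
    # The documented default, written out explicitly.
    check_args = ["--no-build-vignettes", "--no-manual"],
)
```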

r_binary

r_binary(name, src, deps, data, env_vars, tools, rscript_args)

Build a wrapper shell script for running an executable which will have all the specified R packages available.

The target can be executed standalone, with bazel run, or called from other executables if RUNFILES_DIR is exported in the environment with the runfiles of the root executable.

Attributes
src

File; required

An Rscript interpreted file, or file with executable permissions.

deps

List of labels; optional

Dependencies of type r_binary, r_pkg, or r_library.

data

List of labels; optional

Files needed by this rule at runtime.

env_vars

Dictionary; optional

Extra environment variables to define before running the binary.

tools

List of labels; optional

Executables to be made available to the binary.

rscript_args

String; optional

Arguments for the Rscript interpreter, used if the src file does not have executable permissions. We recommend using the shebang line and giving your script execute permissions instead of using this.
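As a sketch, an r_binary target wrapping a script could be declared like this. The file names and labels are hypothetical, and loading r_binary from R:defs.bzl is an assumption based on where the other rules are loaded from.

```starlark
# Assumption: r_binary is exported from the same .bzl file as the other rules.
load("@com_grail_rules_r//R:defs.bzl", "r_binary")

r_binary(
    name = "report",
    # An executable R script with a shebang line (hypothetical file).
    src = "report.R",
    # R packages the script loads at run time (hypothetical label).
    deps = ["//packages/mypkg"],
    # Files the script reads at run time (hypothetical file).
    data = ["data/input.csv"],
)

# Run it with:
#   bazel run //scripts:report
```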

r_test

r_test(name, src, deps, data, env_vars, tools, rscript_args)

This is identical to r_binary but is run as a test.
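Since the attributes match those of r_binary, a test target could be sketched as follows. The script and labels are hypothetical, and loading r_test from R:defs.bzl is an assumption. As with other Bazel tests, the script is expected to signal failure through a non-zero exit status.

```starlark
# Assumption: r_test is exported from the same .bzl file as the other rules.
load("@com_grail_rules_r//R:defs.bzl", "r_test")

r_test(
    name = "report_smoke_test",
    # An executable R script that exits with a non-zero status on failure (hypothetical file).
    src = "report_smoke_test.R",
    # R packages the script loads (hypothetical label).
    deps = ["//packages/mypkg"],
)
```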

r_repository

r_repository(urls, strip_prefix, type, sha256, build_file, razel_args)

Repository rule in place of new_http_archive that can run razel to generate the BUILD file automatically.

Attributes
urls

List of strings; required

URLs from which the package source archive can be fetched.

strip_prefix

String; optional

The prefix to strip from all file paths in the archive.

type

String; optional

Type of the archive file (zip, tgz, etc.).

sha256

String; optional

sha256 checksum of the archive to verify.

build_file

File; optional

Optional BUILD file for this repo. If not provided, one will be generated.

razel_args

Dictionary; optional

Other arguments to supply to buildify function in razel.

r_repository_list

r_repository_list(package_list, build_file_overrides, remote_repos, other_args)

Repository rule that will generate a bzl file containing a macro, to be called as r_repositories(), for r_repository definitions for packages in package_list CSV.

Attributes
package_list

File; required

CSV containing packages with name, version and sha256; with a header.

build_file_overrides

File; optional

CSV containing package name and BUILD file path; with a header.

remote_repos

Dictionary; optional

Repos to use for fetching the archives.

other_args

Dictionary; optional

Other arguments to supply to generateWorkspaceMacro function in razel.

r_package

r_package(pkg_name, pkg_srcs, pkg_deps, pkg_suggested_deps=[])

Convenience macro to generate the r_pkg and r_library targets.

r_package_with_test

r_package_with_test(pkg_name, pkg_srcs, pkg_deps, pkg_suggested_deps=[], test_timeout="short")

Convenience macro to generate the r_pkg, r_library, r_unit_test, and r_pkg_test targets.
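A package BUILD file using the macro could be as short as the sketch below. The package name and dependency labels are hypothetical. r_package takes the same arguments, minus test_timeout.

```starlark
load("@com_grail_rules_r//R:defs.bzl", "r_package_with_test")

r_package_with_test(
    pkg_name = "mypkg",
    # Every file in the package directory except the BUILD file itself.
    pkg_srcs = glob(["**"], exclude = ["BUILD"]),
    # Other r_pkg targets this package imports (hypothetical label).
    pkg_deps = ["//packages/utilspkg"],
    # Packages needed only for tests and checks (assumed external repository).
    pkg_suggested_deps = ["@R_testthat"],
    # The default from the signature above, written out explicitly.
    test_timeout = "short",
)
```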

Contributing

Contributions are most welcome. Please submit a pull request, and give the owners of this GitHub repo access to your branch for minor style-related edits.

Known Issues

Please check the open issues on the GitHub repo.

We have tested only on macOS and Ubuntu (VM and Docker).