Code-generation wrapping C++ classes as R6 classes

RcppR6

What is this thing?

This package aims to provide a simple way of generating boilerplate code for exposing C++ classes to R. It is similar in many ways to Rcpp "modules".

The vignettes listed under Documentation, below, explain the idea more fully, but here is the basic idea. Suppose we have a class like this:

class circle {
public:
  double radius;
  circle(double r) : radius(r) {}
  double area() const {
    return M_PI * radius * radius;
  }
  double circumference() const {
    return M_PI * 2 * radius;
  }
  void set_circumference(double c) {
    if (c < 0) {
      Rcpp::stop("Circumference must be non-negative");
    }
    radius = c / (2 * M_PI);
  }
};

This simple class represents a circle. It has one data member (radius) and methods to compute the area and circumference. The method set_circumference sets the radius so that the circle has the required circumference. (Yes, this is very silly. It would also be trivial to write using R6 or reference classes directly, but perhaps it is needed as part of a larger body of C++ code.)

To expose the class, we write a small piece of yaml:

circle:
  constructor:
    args: [radius: double]
  methods:
    area:
      return_type: double
  active:
    circumference:
      name_cpp: circumference
      name_cpp_set: set_circumference
      access: member
      type: double
    radius: {access: field, type: double}

After running RcppR6 on this, we can interact with objects of this type from R:

obj <- circle(1)
obj$radius # 1
obj$radius <- 2
obj$radius # 2
obj$area() # 12.56637
obj$circumference <- 1
obj$circumference # 1
obj$radius # 0.1591549
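
The session above can be replayed in plain C++ to see what the two kinds of active binding amount to. This is only a sketch, not the code that RcppR6 generates: the accessor names (radius_get, circumference_set and so on) and the helpers close_to and check_readme_values are hypothetical, std::invalid_argument stands in for Rcpp::stop, and kPi stands in for M_PI so that the example compiles without Rcpp.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// pi spelled out so the sketch does not rely on the non-standard M_PI macro
const double kPi = 3.14159265358979323846;

// Same class as above, except that std::invalid_argument replaces Rcpp::stop
// (an assumption, so that this compiles without Rcpp).
class circle {
public:
  double radius;
  explicit circle(double r) : radius(r) {}
  double area() const { return kPi * radius * radius; }
  double circumference() const { return kPi * 2 * radius; }
  void set_circumference(double c) {
    if (c < 0) {
      throw std::invalid_argument("Circumference must be non-negative");
    }
    radius = c / (2 * kPi);
  }
};

// Hypothetical accessors, one pair per entry under `active:` in the yaml.
// access: field  -> read and write the data member directly
double radius_get(const circle& obj) { return obj.radius; }
void radius_set(circle& obj, double value) { obj.radius = value; }
// access: member -> read via name_cpp, write via name_cpp_set
double circumference_get(const circle& obj) { return obj.circumference(); }
void circumference_set(circle& obj, double value) { obj.set_circumference(value); }

// Small helper for comparing against the rounded values printed by R.
bool close_to(double x, double y, double tol) { return std::fabs(x - y) < tol; }

// Replays the R session and checks the values shown in its comments.
void check_readme_values() {
  circle obj(1);
  radius_set(obj, 2);                                  // obj$radius <- 2
  assert(close_to(obj.area(), 12.56637, 1e-5));        // obj$area()
  circumference_set(obj, 1);                           // obj$circumference <- 1
  assert(close_to(circumference_get(obj), 1, 1e-12));  // obj$circumference
  assert(close_to(radius_get(obj), 0.1591549, 1e-7));  // obj$radius
}
```

Calling check_readme_values() from a main function runs the same steps as the R session. The point is that assigning to obj$circumference ends up calling set_circumference, while assigning to obj$radius writes the data member directly.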

A couple of notes here:

  • The name of the class is the top-most yaml key - in this case 'circle'. This will generate an R function circle that generates R6 objects.
  • There are three types of entities exported: the constructor (in contrast with Rcpp modules there can be only one), methods, and active-bound fields (which look like data members of the R6 object, but reading or writing them calls functions behind the scenes).
  • In contrast with Rcpp modules we must be explicit about types, and about where methods are found. C++ is notoriously difficult to parse, and I've avoided trying to infer these from signatures (in contrast with Rcpp attributes). This leads to some undesirable doubling up of effort. Eventually I plan on using libclang to infer types when they are omitted, but this will be optional.
  • This is yaml, so the format is very flexible: the last active member could be equivalently written as:
    radius:
      access: field
      type: double

A full working version of this is available here; see in particular the class definition and the yaml.

Documentation

A vignette showing how the above example works is included in the package (vignette("introduction", package="RcppR6")) and rendered here

Slightly more useful examples are included in the examples vignette (vignette("examples", package="RcppR6")) and rendered here

See how to generate interfaces to templated types (vignette("templates", package="RcppR6"))

Still to come: generating big ugly parameter lists.

How is this run?

RcppR6 assumes you are building a package. There is currently no support for inline use. A file inst/RcppR6_classes.yml needs to exist with class definitions (though see Configuration, below). Running RcppR6::RcppR6() will generate a bunch of code, and re-run Rcpp attributes. The package can then be built as usual. Importantly, your package does not need to depend on RcppR6 at all -- once the code has been generated your package is independent of RcppR6.

When is this the right sort of thing to use?

  • You want reference semantics
  • You have existing C++ code to wrap, especially templated classes
  • You have time consuming code that you want to expose
  • You don't want to write lots of boilerplate glue, and Rcpp modules won't work for you

Why not use Rcpp modules?

  • Modules can be slow to load (on a complicated project we have load times of ~5s for a package that uses modules)
  • The compile times using modules can be slow, and the compiler error messages are inscrutable
  • Support for templated classes is patchy
  • There is some sort of garbage collection issue, at least on OSX, that prints warnings that seem to be harmless.
  • It is not currently under active development, with the author apparently having left Rcpp to work on Rcpp11, and removing modules from that version!

Requirements

Class definitions are written in YAML, and parsed using the yaml package, from CRAN.

The Rcpp R package is of course needed. RcppR6 generates code that is then run through Rcpp's "attributes" facilities to build the actual R/C++ glue.

The R6 R package provides the reference class system that we use for wrapping the generated class. It's available on CRAN. It's in a state of flux though, so things may break.

Roxygen comments are propagated from the class definition into the created R files: to do anything with these you need the devtools and roxygen2 packages and their dependencies.

Nothing is really documented about these yet, but see the example packages in tests/testthat.

Preparation

There are many requirements here, but almost all are the same as those for using Rcpp attributes. If you can use Rcpp attributes in your project, you're probably OK.

  1. DESCRIPTION: The package must have "Rcpp" listed under LinkingTo and under Imports. R6 must be listed under Imports. The Rcpp requirements here are standard for packages using Rcpp attributes. These will be set up automatically using RcppR6::install().

  2. NAMESPACE: Three requirements here:

  • Must import something from Rcpp. The Rcpp mailing list suggests importing evalCpp.
  • Must import something from R6. I suggest R6::R6Class.
  • Must load the package's dynamic library (of course).

  If you use roxygen these will be set up automatically for you by leaving the appropriate @importFrom and @useDynLib directives in an R file.

  3. A file inst/include/<package_name>.h must exist ("main package header file"). This is the convention used by Rcpp attributes and is required for use by the LinkingTo convention. This file must include the definitions of classes that you want to export. It also needs to include two files:
  • inst/include/<package_name>/RcppR6_pre.hpp must be included after classes have been declared, but before Rcpp.h has been included. This is often a pain, especially if you want to use Rcpp types within the class. It may be sufficient to forward declare the classes that you export, but this may work badly with templated classes (e.g., you can write class foo; but not class foo<bar>). The reason for this load order is outlined in the "Extending Rcpp" manual -- this file contains the prototypes for "non-intrusive extension".
  • inst/include/<package_name>/RcppR6_post.hpp, which may be included last in the main package header file (but must be included). Rcpp.h can be safely loaded before this file, and this file will itself include Rcpp.h if it has not been loaded.
  4. src/Makevars must be set up to add ../inst/include/ to the header search path so that we can find the main package header. This will be automatically added by RcppR6::install(), but the file can simply contain a line saying PKG_CPPFLAGS += -I../inst/include/
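
The remark about forward declarations in the RcppR6_pre.hpp item can be illustrated with a small standalone sketch. It shows only the general C++ rule that a prototype taking a class by reference needs nothing more than a forward declaration. The function describe is a hypothetical stand-in for the generated prototypes, and check_describe is a hypothetical helper. Whether forward declaration is enough in a given package depends on what the generated files need, which this sketch does not establish.

```cpp
#include <cassert>
#include <string>

// Forward declaration only: the name is known, the type is still incomplete.
class circle;

// A prototype that takes the class by reference needs nothing more than the
// name. `describe` is a hypothetical stand-in for a generated prototype.
std::string describe(const circle& obj);

// The full definition can follow later. In a real package this gap is where
// Rcpp.h could be included, so that the class itself may use Rcpp types.
class circle {
public:
  double radius;
  explicit circle(double r) : radius(r) {}
};

// With the type complete, the prototype can be implemented.
std::string describe(const circle& obj) {
  return obj.radius > 0 ? "circle with positive radius" : "degenerate circle";
}

// Exercises the function once the class definition is visible.
void check_describe() {
  assert(describe(circle(1.0)) == "circle with positive radius");
  assert(describe(circle(0.0)) == "degenerate circle");
}
```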

Installation/updating

We look after a bunch of files.

  • inst/include/<package_name>/RcppR6_pre.hpp
  • inst/include/<package_name>/RcppR6_post.hpp
  • inst/include/<package_name>/RcppR6_support.hpp
  • src/RcppR6.cpp
  • R/RcppR6.R

These files are entirely RcppR6's - don't add anything to them. There is a little warning at the top of each that indicates this! Their contents will morph and change: upgrades might totally alter them at any point, and running install() / RcppR6() may rewrite them. This is similar to the strategy used by Rcpp attributes.

Configuration

A package may have a file inst/RcppR6.yml containing overall configuration information. If this file is absent, a default configuration is used. This is always available from RcppR6 (as.yaml(RcppR6:::config_default())) and is currently:

classes: inst/RcppR6_classes.yml

This indicates the files to search through. Multiple files can be given:

classes:
  - inst/part1.yml
  - inst/part2.yml

These will be read together before any processing happens, so the order does not matter. They are interpreted relative to the package root.

It's not totally clear that keeping these files in inst/ is the best bet, but it seems preferable to many options. Having the file in inst means that it may be possible in future to define concrete versions of template classes defined in another package. If the file moves anywhere it will probably be into the root as .RcppR6.yml, which means that the file would need adding to .Rbuildignore.