/rojure

Use Rserve to call R from Clojure

Primary LanguageClojure

Welcome to rojure

About

Main difference from original rincanter is that this project does not require playing with native libraries as it uses a socket connection to the R interpreter.

Rincanter was offering translation between the incanter 1.5.6 data types (matrix,dataset). These datatype have now been moved to clojure.core.matrix (and incanter > 1.9.0 uses them internally now) and hence this project could be made independent from incanter, so I changed the name from rincanter to rojure.

Rojure can be used from the incanter irepl, GorillaRepl or the normal clojure repl. It only has depedencies towards clojure.core.matrix. (apart from JRI for R interoperability)

As the original version, it also offers translation between Clojure and R data types such as R dataframe to core.matrix.matrix.dataset.

Installation

Install R for your platform

The directions for installing R are outside the scope of this document, but R is well supported on most platforms, and has great documentation: R Project Page

Install and launch Rserve

From R execute following lines:

install.packages("Rserve")
library(Rserve)
Rserve()

Add rojure dependency to project.clj

[rojure "0.2.0"]

Example Usage

The main entry points are the functions:

For the higher level use case of dataset->dataset transformation in R, I added a function ‘r-transform-ds’, which takes a clojure dataset and an R file as input, executes the R file and returns a new dataset

r-eval

You can play around with Clojure and R in the same REPL session:

 (use '(rojure core))

;; define connection to R (needs running R with RServe started)
 (def r (get-r))

 (r-eval r "data(iris)")

 ;;eval's the iris dataframe object, converts into
 ;;incanter dataset
 (r-eval r "iris")
 
 ;;create vector on R side
 (r-eval r "vec_in_r = c(1,2,3)")
 
 ;;now retrieve it, converting to Clojure vector
 (r-get r "vec_in_r")

plotting:

(use '(rojure core))

 ;; define connection to R (needs running R with RServe started)
(def r (get-r))

(r-eval r "data(iris)")

;;initialize the R graphics device for your system:
;;For Mac OS X
(r-eval "quartz()")
;;windows: 
(r-eval "windows()")
;;unix/linux
(r-eval r "x11()")

;;create the plot using values from the iris dataset
(r-eval r "plot(Sepal.Length ~ Sepal.Width, data = iris)")
;;alter this existing plot
(r-eval r "title(main = \"Iris Sepal Measurements\")")

;; close graphic device
 (r-eval r "dev.off()")

with-r-eval

Using with-r-eval, it is even easier. Within this form, all forms enclosed in parenthesis are evaluated as normal Clojure forms, strings are evaluated in R using r-eval:

(use '(rojure core))

(with-r-eval 
  "data(iris)"

  ;;eval's the iris dataframe object, converts into
  ;;incanter dataset
  "iris"
 
  ;;create vector on R side
  "vec_in_r = c(1,2,3)"

  ;;now retrieve it, converting to Clojure vector
  (r-get "vec_in_r"))

r-transform-ds

This use-case has in mind to allow seamlessly editing of Clojure code side-by-side with R code. As the R code is in it’s own .R file, it can be edited by whatever R IDE (Emacs, Rstudio)

I assume that a lots of uses cases of integrating R into Clojure can be expressed as dataframe->dataframe transformations executed in R. I believe this is general enough to do arbitrary computations in R, the result needs just to be transformed to a data.frame at the end.

In the future version I might add a similar function for matrix->matrix transformations.

The R script executed by ‘r-transfrom-ds’ just needs to follow this conventions:

  • It need to be able to run standalone
  • It assumes that a variable in_ is present in R session (and nothing else)
  • It needs to set an variable out_ into the R session (probably at the end)

When working with the R script standalone, the user just needs to make sure that ‘in_’ is present in his development R session.

To ease debugging, the r-transform-ds function writes both R variables (“in_” and “out_”) to disk in rds format, so they can be read in the development R session easily with “readRDS(‘in_.rds’)” for inspection. This allows to keep a rather smooth work flow for working in Clojure and R together.

(use '(rojure core))
(use '(clojure.core.matrix dataset))

;; define connection to R (needs running R with RServe started)
(def r (get-r))

;; define the input ds to transform
(def ds (dataset [[1 2 3][4 5 6]]))
 
;; sent input ds to R and execute R script 
;; (R script receives ds in variable "in_" and needs to produce a variable "out_")
;; both in_ and out_ are serialised to disc, to ease debuging
(def out-ds (r-transform-ds r ds "./count.R"))

;;out-ds is an core.matrix dataset
out-ds

;;count.R looks like this:
library(tidyverse)

out_ <- in_ %>%
  count

 ;; in an separate R session the user could now test / develop the R code, by executing
 in_ <- readRDS("in_.rds")
 source("./count.R")   ;; or step interactively over the lines of the R script
 

For matrices it work in the same way:

(use '(rojure core))
(use '(clojure.core.matrix dataset))

;; define connection to R (needs running R with RServe started)
(def r (get-r))

;;define matrix to transform 
(def m (clojure.core.matrix/matrix [[1 2] [3 4]]))

;; transform matrix with R
(def eig (r-transform-ds r m "./eigen.R"))
eig 

;;eigen.R looks like this:
out_ <- eigen(in_)$vectors

Documentation

API Documentation

API Documentation for rincanter is located at: Rincanter API