Batch Connect - OSC RStudio Server + Spark

Primary language: Shell. License: MIT.

An interactive app designed for OSC OnDemand that launches an RStudio Server and an Apache Spark cluster within an Owens batch job.

Prerequisites

This Batch Connect app requires the following software to be installed on the compute nodes where the batch job will run (NOT on the OnDemand node):

  • R 3.3.2+ (earlier versions are untested but may work for you)
  • RStudio Server 1.0.136+ (earlier versions are untested but may work for you)
  • PRoot 5.1.0+ (used to set up a fake bind mount)
  • Apache Spark 2.1.0+
  • sparklyr 0.6.4+
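As a rough aid for checking the list above on a compute node, the sketch below reports which commands are on the PATH and provides a small version comparison helper. The command names (R, rserver, proot, spark-submit) are assumptions about how the software is exposed at your site, and version_ge is a hypothetical helper, not part of this app. It relies on GNU sort -V. sparklyr is an R package, so it is not covered by the PATH check.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Uses GNU sort -V (version sort) to find the smaller of the two.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed command names; adjust to match your installation.
for cmd in R rserver proot spark-submit; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "found:   $cmd"
  else
    echo "missing: $cmd"
  fi
done
```

For example, version_ge 3.4.0 3.3.2 succeeds, while version_ge 3.3.1 3.3.2 fails. Some sites load this software through environment modules, in which case a command may be reported as missing until the relevant module is loaded.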

Optional software:

  • Lmod 6.0.1+, or any other CLI that provides module purge and module load <modules>, used to load the appropriate environments within the batch job before launching the RStudio Server and Apache Spark cluster.
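The module-based setup described above might look like the following minimal sketch. The function name load_env and the module names in the usage comment are hypothetical; the modules your site provides will differ, and this is not necessarily how the app itself loads its environment.

```shell
#!/bin/bash
# Hypothetical helper: reset the environment, then load the given modules.
# Returns non-zero if no `module` command is available in this shell.
load_env() {
  if ! type module >/dev/null 2>&1; then
    echo "module command not found; skipping environment setup" >&2
    return 1
  fi
  module purge        # start from a clean module environment
  module load "$@"    # load the modules passed as arguments
}

# Usage (module names are placeholders, not real site modules):
#   load_env R/3.3.2 spark/2.1.0
```

Purging first keeps modules inherited from the login environment from leaking into the batch job.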

Install

Use Git to clone this app and check out the branch or version you want to use:

scl enable git19 -- git clone <repo>
cd <dir>
scl enable git19 -- git checkout <tag/branch>

No further steps are needed, as all necessary assets are installed. You also do not need to restart the app, as it isn't a Passenger app.

To update the app:

cd <dir>
scl enable git19 -- git fetch
scl enable git19 -- git checkout <tag/branch>

Again, you do not need to restart the app as it isn't a Passenger app.

Contributing

  1. Fork it ( https://github.com/OSC/bc_osc_rstudio_spark/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request