Dockerized RStudio and MongoDB integration

Primary language: Python. License: GNU General Public License v2.0 (GPL-2.0).

docker-rstudio-mongodb

Dockerized RStudio and MongoDB integration to play around.

About

This project aims to automatically set up a dockerized, multi-application data science platform with RStudio and MongoDB.

Prerequisites

  • Docker Engine and Docker Compose
  • Python and pip (only needed to load the sample data into MongoDB)
  • the pandas and pymongo packages (only needed to load the sample data into MongoDB)

Getting Started

$ git clone https://github.com/wipatrick/docker-rstudio-mongodb.git
$ cd docker-rstudio-mongodb/
$ docker-compose up -d

By default, RStudio is initialized with the following credentials, as specified in docker-compose.yml:

  • username: 'testuser'
  • password: 'testpassword'

Check that the RStudio and MongoDB containers are running.

$ docker ps                                                                   
CONTAINER ID        IMAGE                          COMMAND                  CREATED             STATUS              PORTS                      NAMES
2399140168c6        dockerrstudiomongodb_rstudio   "/usr/bin/supervisord"   13 minutes ago      Up 13 minutes       0.0.0.0:80->8787/tcp       rstudio
c14ac842ab18        dockerrstudiomongodb_mongodb   "/entrypoint.sh mongo"   13 minutes ago      Up 13 minutes       0.0.0.0:27017->27017/tcp   mongodb
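Beyond docker ps, the published ports can be probed directly. The following is a minimal sketch, not part of this repository: it assumes the port mappings shown above (80 for RStudio, 27017 for MongoDB), and the helper name port_open and the docker_host value are hypothetical placeholders to adapt to your setup.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    docker_host = "127.0.0.1"  # assumption: replace with your Docker host IP
    # Ports taken from the `docker ps` output above.
    for name, port in (("rstudio", 80), ("mongodb", 27017)):
        state = "reachable" if port_open(docker_host, port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A reachable port only shows that something is listening; it does not confirm that RStudio or MongoDB has finished starting up.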

Before loading the sample data into the dockerized MongoDB, edit data2mongo.py to match your setup (change the IP address to that of your Docker host), then run it.

$ python data2mongo.py                                                        
{'obs': 1.0, 'satv': 600.0, 'hse': 10.0, 'gpa': 3.3199999999999998, 'hsm': 10.0, 'hss': 10.0, 'sex': 1.0, 'satm': 670.0}
{'obs': 2.0, 'satv': 640.0, 'hse': 5.0, 'gpa': 2.2599999999999998, 'hsm': 6.0, 'hss': 8.0, 'sex': 1.0, 'satm': 700.0}
{'obs': 3.0, 'satv': 530.0, 'hse': 8.0, 'gpa': 2.3500000000000001, 'hsm': 8.0, 'hss': 6.0, 'sex': 1.0, 'satm': 640.0}
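For orientation, the loading step can be sketched roughly as follows. This is not the contents of data2mongo.py: the real script relies on pandas, whereas this sketch uses the standard-library csv module for the conversion, and the CSV layout, the function names, and the db/test database and collection defaults are assumptions chosen to match the output above and the R session later in this README.

```python
import csv
import io


def rows_to_documents(csv_text):
    """Convert CSV text with a header row into a list of dicts of floats."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def load_into_mongo(documents, host, port=27017, db_name="db", collection="test"):
    """Insert the documents into MongoDB and return how many were written."""
    # Imported lazily so the conversion above works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(host, port)
    result = client[db_name][collection].insert_many(documents)
    return len(result.inserted_ids)


if __name__ == "__main__":
    # Two rows mirroring the first records printed above (assumed column order).
    sample = (
        "obs,gpa,hsm,hss,hse,satm,satv,sex\n"
        "1,3.32,10,10,10,670,600,1\n"
        "2,2.26,6,8,5,700,640,1\n"
    )
    for doc in rows_to_documents(sample):
        print(doc)
```

To write to the database, call load_into_mongo(documents, host) with the IP address of your Docker host.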

Open http://<DockerHost-IP> in your browser and log in with the credentials above. Once logged in, you can connect to MongoDB with the pre-installed RMongo package for R as follows.

> library(RMongo)
Loading required package: rJava
> mongo <- mongoDbConnect("db", "mongodb")
> print(dbShowCollections(mongo))
[1] "system.indexes" "test"          
> query <- dbGetQuery(mongo, "test", "{'satv': {'$lt': 500}}")
> summary(query)
      hse              obs             sex            X_id                hss              gpa       
 Min.   : 3.000   Min.   :  7.0   Min.   :1.000   Length:113         Min.   : 4.000   Min.   :0.120  
 1st Qu.: 7.000   1st Qu.: 74.0   1st Qu.:1.000   Class :character   1st Qu.: 7.000   1st Qu.:2.140  
 Median : 8.000   Median :125.0   Median :1.000   Mode  :character   Median : 8.000   Median :2.620  
 Mean   : 7.735   Mean   :124.8   Mean   :1.407                      Mean   : 7.628   Mean   :2.564  
 3rd Qu.: 9.000   3rd Qu.:185.0   3rd Qu.:2.000                      3rd Qu.: 9.000   3rd Qu.:3.070  
 Max.   :10.000   Max.   :224.0   Max.   :2.000                      Max.   :10.000   Max.   :4.000  
      satv            satm            hsm        
 Min.   :285.0   Min.   :300.0   Min.   : 2.000  
 1st Qu.:400.0   1st Qu.:505.0   1st Qu.: 7.000  
 Median :440.0   Median :570.0   Median : 8.000  
 Mean   :432.9   Mean   :566.7   Mean   : 8.027  
 3rd Qu.:470.0   3rd Qu.:630.0   3rd Qu.: 9.000  
 Max.   :490.0   Max.   :740.0   Max.   :10.000

As you can see, RStudio connects to the MongoDB container simply by passing the name defined for it in docker-compose.yml (mongodb) as the host, thanks to Docker's container links feature.
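The same name-based lookup would apply to any client running in a linked container. As a hedged illustration, here is how the query above might look from Python with pymongo; the helper names are hypothetical, and the host, database, and collection values are taken from the R session.

```python
def mongo_uri(host="mongodb", port=27017):
    """Build a connection URI; 'mongodb' is the service name from docker-compose.yml."""
    return f"mongodb://{host}:{port}/"


def satv_below(threshold):
    """Build the same filter used in the R example: satv < threshold."""
    return {"satv": {"$lt": threshold}}


def count_matches(threshold, host="mongodb", db_name="db", collection="test"):
    """Count documents whose satv is below threshold (needs pymongo and a reachable server)."""
    # Imported lazily so the helpers above work without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(mongo_uri(host))
    return client[db_name][collection].count_documents(satv_below(threshold))
```

The service name only resolves inside linked containers; from the Docker host itself, pass the host's IP address instead, for example count_matches(500, host="<DockerHost-IP>").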

Credits

Credit goes to rocker-org and to the docker-library team working on MongoDB for their groundwork.