saspyrilia sandbox

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

This repository contains code examples used for testing content at saspyrilia.com.

In addition, the repository contains a Dockerfile for building a single Docker image with Jupyter Lab and kernels for all four principal programming languages featured on saspyrilia.com.

  • SAS
  • Python
  • R
  • Julia

Kernels for other languages may be added over time.

Code Examples

For those interested in the code examples, the directory structure should mirror the docs section of the saspyrilia repo.

📝 NOTE: Transition from individual *.sas, *.py, *.r, and *.jl files to Jupyter notebooks is not yet complete! Over time all individual files will transition to notebooks.

Sandbox

For those interested in setting up a similar sandbox, the sections below describe the requirements for using SAS, how to build the Docker image, how to run the Docker container, and a final caveat for working with SAS ODA within the container.

SAS Requirements

Running SAS as configured in this repository requires an account on SAS On Demand for Academics.

All SAS code is evaluated within SAS On Demand for Academics (SAS ODA). Within a local Jupyter notebook, code is sent to SAS ODA via the saspy library, a set of Python APIs for working with SAS. The steps for getting set up to use Jupyter with SAS ODA can be summarized as follows:

  • Install and set up Java
  • Install the saspy and sas_kernel Python packages
  • Create a sascfg_personal.py file and copy it to the appropriate location
  • Create an authinfo file containing the username and password for SAS ODA
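The last two steps can be sketched in shell as below. This is a minimal sketch, not the repository's actual setup: the function names, the Java path, and the `ODA_HOST_PLACEHOLDER` value are assumptions, and the host must be replaced with the value SAS ODA lists for your region. The target locations are left as parameters because the appropriate location depends on the installation. The packages from the second step would typically be installed beforehand with `pip install saspy sas_kernel`.

```shell
#!/bin/sh
# Sketch only: file contents are assumptions, not copied from the repository.

# Write a minimal saspy configuration into the directory given as $1.
write_sascfg() {
  mkdir -p "$1" || return 1
  cat > "$1/sascfg_personal.py" <<'EOF'
SAS_config_names = ['oda']
oda = {
    'java': '/usr/bin/java',              # assumed Java path
    'iomhost': ['ODA_HOST_PLACEHOLDER'],  # replace with the host(s) for your ODA region
    'iomport': 8591,
    'authkey': 'oda',                     # must match the key in the authinfo file
    'encoding': 'utf-8',
}
EOF
}

# Write an authinfo file at path $1 for user $2 with password $3,
# readable and writable by the owner only.
write_authinfo() {
  printf 'oda user %s password %s\n' "$2" "$3" > "$1" || return 1
  chmod 600 "$1"
}
```

The `authkey` value in the configuration is what ties it to the matching `oda` entry in the authinfo file.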

To avoid baking secrets into the Docker image, the username and password are passed into the container via the ODA_USER and ODA_PASSWORD environment variables when the container is started. The environment variables may be set globally or on a per-run basis (my preference).
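Since the credentials arrive only through the environment, a small guard can confirm that both variables are present before anything is started. The function below is a hypothetical helper, not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: fail early if either credential variable is missing or empty.
require_oda_credentials() {
  if [ -z "${ODA_USER:-}" ] || [ -z "${ODA_PASSWORD:-}" ]; then
    echo "ODA_USER and ODA_PASSWORD must both be set" >&2
    return 1
  fi
}
```

It could be called at the top of a startup script, for example `require_oda_credentials || exit 1`, when SAS is going to be used.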

Docker Build

To build the Docker image, use the Dockerfile in this repo. From the saspyrilia-sandbox directory, run the following.

docker build --tag saspyrilia-sandbox .

The Dockerfile is based on the datascience-notebook image maintained by the Jupyter team. See the Jupyter Docker Stacks page for other available images and for detailed instructions on using them.

⚠️ WARNING: This creates a rather large image (4 GB+)!

Docker Startup

Python, R, and Julia

Starting the Docker container is as simple as running the shell script sandbox-startup.sh in the root of this repository.

./sandbox-startup.sh

This will start up a Jupyter notebook server that can be reached in a web browser at https://127.0.0.1:8888 plus an ephemeral token.

SAS, Python, R, and Julia

To make use of SAS, one needs to pass in the username and password for SAS ODA. As noted in SAS Requirements, these are passed in via the ODA_USER and ODA_PASSWORD environment variables. If these variables are set globally, then startup is the same as above.

./sandbox-startup.sh

If these variables are set per run (my preference), they may be passed in as part of startup.

NOTE: There is a leading space in front of the command example below, and this is intentional! It keeps the command from being saved in one's shell history. This works in both bash and fish; in bash it relies on HISTCONTROL including ignorespace or ignoreboth, which many but not all default configurations set.

# v--- There is a leading space here!
>  ODA_USER=myusername ODA_PASSWORD=mypassword ./sandbox-startup.sh

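Because the leading-space trick in bash depends on the HISTCONTROL setting, it may be worth checking that setting before typing a password on the command line. The helper below is a hypothetical sketch that inspects a HISTCONTROL value, which bash treats as a colon-separated list.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the HISTCONTROL value in $1 makes bash
# skip commands that start with a space.
ignores_leading_space() {
  case ":$1:" in
    *:ignorespace:*|*:ignoreboth:*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In an interactive bash session one might run `ignores_leading_space "$HISTCONTROL" || HISTCONTROL=ignorespace` before passing credentials per run.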
File Availability

All files in the repository directory are bind-mounted inside the container at ~/work. Thus any notebooks created or modified under ~/work persist after the container is stopped.
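The contents of sandbox-startup.sh are not reproduced here, but a startup command with this behavior could look roughly like the sketch below. It assumes the Docker Stacks convention that the container user's home is /home/jovyan, the image tag from the build step, and port 8888. The function name and the `DOCKER` override variable are hypothetical, the latter added only so the command can be previewed without running a container.

```shell
#!/bin/sh
# Sketch of a startup command; not the repository's actual script.
# $1: directory to bind-mount into the container (defaults to the current directory).
start_sandbox() {
  repo_dir="${1:-$PWD}"
  # "-e NAME" with no value forwards the variable from the host environment,
  # so the credentials are never written into the image.
  "${DOCKER:-docker}" run --rm -it \
    -p 8888:8888 \
    -e ODA_USER -e ODA_PASSWORD \
    -v "$repo_dir":/home/jovyan/work \
    saspyrilia-sandbox
}
```

Setting `DOCKER=echo` before calling `start_sandbox` prints the arguments that would be passed to docker instead of starting a container.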

Usage

Python, R, and Julia

Using Python, R, and Julia is as simple as opening a notebook with the appropriate kernel! All computation is performed locally.

SAS

As noted in SAS Requirements, all computation actually takes place at SAS ODA. Thus, one will need to set up the ~/.authinfo file to make the needed connection to SAS ODA.

The ~/.authinfo file within the container at startup contains the following.

oda user ODA_USER password ODA_PASSWORD

This is intentional! To update the file, one may use the script update-authinfo.sh, which is copied to ~/bin/update-authinfo.sh, where ~/bin is on the user's PATH. The script uses sed to replace ODA_USER and ODA_PASSWORD with the values of the environment variables of the same name.
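The actual update-authinfo.sh is not shown here. As a rough sketch of the same sed-based substitution, the hypothetical functions below replace the two placeholders in a given file. Escaping the values first is an addition of this sketch, so that passwords containing characters such as `&` or `|` survive the substitution.

```shell
#!/bin/sh
# Sketch of a placeholder substitution; not the repository's actual script.

# Escape characters that are special in a sed replacement when "|" is the
# delimiter: backslash, ampersand, and the delimiter itself.
escape_sed() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Replace the ODA_USER and ODA_PASSWORD placeholders in the file given as $1
# (default: ~/.authinfo) with the values of the environment variables.
update_authinfo() {
  authinfo_file="${1:-$HOME/.authinfo}"
  esc_user=$(escape_sed "$ODA_USER")
  esc_password=$(escape_sed "$ODA_PASSWORD")
  scratch=$(mktemp) || return 1
  status=0
  # Write to a scratch file, then copy back so the original file's
  # permissions are preserved.
  sed -e "s|ODA_USER|$esc_user|" -e "s|ODA_PASSWORD|$esc_password|" \
    "$authinfo_file" > "$scratch" && cat "$scratch" > "$authinfo_file" || status=1
  rm -f "$scratch"
  return "$status"
}
```

The sketch assumes both environment variables are set. If either is empty, the corresponding placeholder is replaced with an empty string.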

Usage of update-authinfo.sh is as simple as running the following in the first cell within a notebook with the SAS kernel.

!~/bin/update-authinfo.sh