The docker-jupyter repository holds example Senzing Jupyter notebooks in the notebooks subdirectory.
The senzing/jupyter docker image is a Senzing-ready image hosting the example Senzing notebooks.
These notebooks are built upon the DockerHub Jupyter organization docker images. The default base image is jupyter/minimal-notebook. More information is available at Jupyter Docker Stacks.
In addition, the Jupyter notebooks can be viewed on nbviewer.jupyter.org. For example, visit Senzing examples on NbViewer.
- 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps you'll need to make some choices. Perhaps it's an optional step.
- ✏️ - A "pencil" icon means that the instructions may need modification before performing.
- ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.
This repository and demonstration require 9 GB free disk space.
Budget 40 minutes to get the demonstration up and running, depending on CPU and network speeds.
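The disk space requirement above can be checked before starting. The following is a minimal sketch, assuming a POSIX `df` and assuming that the current directory sits on the filesystem that will hold the Docker images and Senzing files (this may not be true on every host). The variable names `required_gb`, `required_kb`, and `available_kb` are illustrative and not part of this repository.

```shell
# Illustrative check: compare free space on the current filesystem
# against the 9 GB stated in this document.
required_gb=9
required_kb=$((required_gb * 1024 * 1024))

# "df -Pk ." prints one header line and one data line in POSIX format;
# the fourth column of the data line is the available space in 1024-byte blocks.
available_kb=$(df -Pk . | awk 'NR==2 {print $4}')

if [ "${available_kb}" -ge "${required_kb}" ]; then
  echo "OK: ${available_kb} KB available, ${required_kb} KB required."
else
  echo "WARNING: only ${available_kb} KB available, ${required_kb} KB required."
fi
```

To check a different location, replace `.` with the directory of interest.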
This repository assumes a working knowledge of:
- If Senzing has not been initialized, visit "How to initialize Senzing with Docker".
Configuration values are specified by environment variable or command line parameter.
Non-Senzing configuration can be seen at Jupyter Docker Stacks.
- JUPYTER_NOTEBOOKS_SHARED_DIR
- SENZING_DATA_VERSION_DIR
- SENZING_ETC_DIR
- SENZING_G2_DIR
- SENZING_NETWORK
- SENZING_RUNAS_USER
- SENZING_VAR_DIR
🤔 "How to initialize Senzing with Docker" places files in different directories. The following examples show how to identify each output directory.
- Example #1: To mimic an actual RPM installation, identify directories for RPM output in this manner:
    export SENZING_DATA_VERSION_DIR=/opt/senzing/data/1.0.0
    export SENZING_ETC_DIR=/etc/opt/senzing
    export SENZING_G2_DIR=/opt/senzing/g2
    export SENZING_VAR_DIR=/var/opt/senzing
- ✏️ Example #2: If Senzing directories were put in alternative directories, set environment variables to reflect where the directories were placed. Example:
    export SENZING_VOLUME=/opt/my-senzing
    export SENZING_DATA_VERSION_DIR=${SENZING_VOLUME}/data/1.0.0
    export SENZING_ETC_DIR=${SENZING_VOLUME}/etc
    export SENZING_G2_DIR=${SENZING_VOLUME}/g2
    export SENZING_VAR_DIR=${SENZING_VOLUME}/var
- 🤔 If an internal database is used, permissions may need to be changed in /var/opt/senzing. Example:
    sudo chown $(id -u):$(id -g) -R ${SENZING_VAR_DIR}
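Because a mistyped directory variable only shows up later as a confusing volume-mount problem, it may help to verify the four directory variables first. The following is a minimal sketch; the function name `check_senzing_dirs` is hypothetical and not part of this repository. It only checks that each variable is set and names an existing directory, not that the contents are a valid Senzing installation.

```shell
# Hypothetical helper: report each Senzing directory variable that is
# unset or does not point to an existing directory.
# The return status is the number of problems found (0 means all good).
check_senzing_dirs() {
  missing=0
  for name in SENZING_DATA_VERSION_DIR SENZING_ETC_DIR SENZING_G2_DIR SENZING_VAR_DIR; do
    # Look up the value of the variable whose name is stored in ${name}.
    eval "dir=\${${name}:-}"
    if [ -z "${dir}" ] || [ ! -d "${dir}" ]; then
      echo "${name} is unset or not a directory: '${dir}'"
      missing=$((missing + 1))
    fi
  done
  return "${missing}"
}

check_senzing_dirs || echo "Fix the variables listed above before running the container."
```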
🤔 Optional: Use these steps if the docker container is part of a docker network.
- List docker networks. Example:
    sudo docker network ls
- ✏️ Specify docker network. Choose value from NAME column of docker network ls. Example:
    export SENZING_NETWORK=*name_of_the_network*
- Construct parameter for docker run. Example:
    export SENZING_NETWORK_PARAMETER="--net ${SENZING_NETWORK}"
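Since the network steps are optional, the parameter can also be built conditionally so that it stays empty when no network was chosen. This is a minimal sketch; the function name `build_network_parameter` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: print "--net <name>" when a network name is given,
# and print nothing when the name is empty.
build_network_parameter() {
  if [ -n "${1:-}" ]; then
    printf '%s' "--net $1"
  fi
}

# Empty when SENZING_NETWORK is unset, so "docker run" uses its default network.
SENZING_NETWORK_PARAMETER="$(build_network_parameter "${SENZING_NETWORK:-}")"
export SENZING_NETWORK_PARAMETER
echo "Network parameter: '${SENZING_NETWORK_PARAMETER}'"
```

An empty value expands to nothing when used unquoted in the docker run command, so the same command line works with or without a network.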
🤔 Optional: Some databases need additional support. For other databases, these steps may be skipped.
- Db2: See Support Db2 instructions to set SENZING_OPT_IBM_DIR_PARAMETER.
- MS SQL: See Support MS SQL instructions to set SENZING_OPT_MICROSOFT_DIR_PARAMETER.
- ✏️ Set environment variables. Example:
    export JUPYTER_NOTEBOOKS_SHARED_DIR=$(pwd)
    export WEBAPP_PORT=8888
- 🤔 Optional: Run Jupyter without token authentication. Example:
    export JUPYTER_PARAMETERS="start.sh jupyter notebook --NotebookApp.token=''"
- Run docker container. Example:
    sudo docker run \
      --interactive \
      --name test-senzing-jupyter \
      --publish ${WEBAPP_PORT}:8888 \
      --rm \
      --tty \
      --volume ${JUPYTER_NOTEBOOKS_SHARED_DIR}:/notebooks/shared \
      --volume ${SENZING_DATA_VERSION_DIR}:/opt/senzing/data \
      --volume ${SENZING_ETC_DIR}:/etc/opt/senzing \
      --volume ${SENZING_G2_DIR}:/opt/senzing/g2 \
      --volume ${SENZING_VAR_DIR}:/var/opt/senzing \
      ${SENZING_NETWORK_PARAMETER} \
      ${SENZING_OPT_IBM_DIR_PARAMETER} \
      ${SENZING_OPT_MICROSOFT_DIR_PARAMETER} \
      senzing/jupyter ${JUPYTER_PARAMETERS}
- If token authentication is disabled, access the Jupyter notebooks at: http://127.0.0.1:8888/
- If token authentication is enabled, locate the URL in the Docker log. Example:
    Copy/paste this URL into your browser when you connect for the first time, to login with a token:
        http://(a152e5586fdc or 127.0.0.1):8888/?token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Adjust the URL. Example:
    http://127.0.0.1:8888/?token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Paste the URL into a web browser.
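The URL adjustment described above can be done with a small text substitution instead of by hand. This is a minimal sketch, assuming the log line has the `(container-id or 127.0.0.1)` form shown in the example and that `sed` supports the `-E` option; the function name `adjust_jupyter_url` is hypothetical.

```shell
# Hypothetical helper: replace "(<container-id> or 127.0.0.1)" in a
# Jupyter log URL with plain "127.0.0.1". Other text is left unchanged.
adjust_jupyter_url() {
  printf '%s\n' "$1" | sed -E 's#\([^ )]+ or 127\.0\.0\.1\)#127.0.0.1#'
}

# Example using the placeholder URL from the log excerpt above.
adjust_jupyter_url "http://(a152e5586fdc or 127.0.0.1):8888/?token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```

A URL that does not contain the parenthesized host is printed as-is, so the helper is safe to apply to an already-adjusted URL.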
The Jupyter notebooks in notebooks/senzing-examples are of two types:
- References - Information on specific method invocations and their parameters. Examples:
- Guides - Illustrations of how to use methods to accomplish tasks. Often points to appropriate "Reference" entries for specific method invocations. Examples:
The following software programs need to be installed:
For more information on environment variables, see Environment Variables.
- Set these environment variable values:
    export GIT_ACCOUNT=senzing
    export GIT_REPOSITORY=docker-jupyter
    export GIT_ACCOUNT_DIR=~/${GIT_ACCOUNT}.git
    export GIT_REPOSITORY_DIR="${GIT_ACCOUNT_DIR}/${GIT_REPOSITORY}"
- Follow steps in clone-repository to install the Git repository.
- Set environment variables for Senzing directories. See Volumes. Example:
    export SENZING_VOLUME=/opt/my-senzing
    export SENZING_DATA_DIR=${SENZING_VOLUME}/data
    export SENZING_DATA_VERSION_DIR=${SENZING_DATA_DIR}/1.0.0
    export SENZING_ETC_DIR=${SENZING_VOLUME}/etc
    export SENZING_G2_DIR=${SENZING_VOLUME}/g2
    export SENZING_VAR_DIR=${SENZING_VOLUME}/var
- Set environment variables. Example:
    export PYTHONPATH=${SENZING_G2_DIR}/python
    export LD_LIBRARY_PATH=${SENZING_G2_DIR}/lib:${SENZING_G2_DIR}/lib/debian
    export SENZING_SQL_CONNECTION="sqlite3://na:na@${SENZING_VAR_DIR}/sqlite/G2C.db"
- Start Jupyter notebook. Example:
    cd ${GIT_REPOSITORY_DIR}
    jupyter notebook
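When the SQLite connection string above is used, a quick check that the database file exists can save a confusing notebook error later. This is a minimal sketch, assuming the `sqlite3://user:password@/path` form shown in the example; the function name `sqlite_path_from_connection` is hypothetical and not part of Senzing.

```shell
# Hypothetical helper: print the file path portion of a
# "sqlite3://user:password@/path/to/file" connection string.
# Returns non-zero for any string that does not have that shape.
sqlite_path_from_connection() {
  case "$1" in
    sqlite3://*@*) printf '%s\n' "${1#*@}" ;;
    *) return 1 ;;
  esac
}

# Warn if the connection string names a SQLite file that is not present.
db_path=$(sqlite_path_from_connection "${SENZING_SQL_CONNECTION:-}") || db_path=""
if [ -n "${db_path}" ] && [ ! -f "${db_path}" ]; then
  echo "SQLite database not found: ${db_path}"
fi
```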
- Option #1: Using docker command and GitHub.
    sudo docker build --tag senzing/jupyter https://github.com/senzing/docker-jupyter.git
- Option #2: Using docker command and local repository.
    cd ${GIT_REPOSITORY_DIR}
    sudo docker build --tag senzing/jupyter .
- Option #3: Using make command.
    cd ${GIT_REPOSITORY_DIR}
    sudo make docker-build
  Note: sudo make docker-build-development-cache can be used to create cached docker layers.
- See docs/errors.md.