This is a fork of Dani's work (please see below for how to cite it). It removes R, which we don't need for teaching, and adds a few more Python packages that we do use. We've also added some JupyterLab extensions to make interacting with the Lab server a bit easier.
We previously experimented with four approaches to installation: VirtualBox, Vagrant, Docker, and Anaconda Python directly. Each of these has pros and cons, but after careful consideration we have come to the conclusion that Docker is the most robust way to ensure a consistent experience in which all students end up with the same versions of each library, difficult-to-diagnose hardware/OS issues are minimised, and running/recovery is the most straightforward.
A more detailed set of instructions can also be found in Dani's Repo. Read this if you have trouble!
You are strongly encouraged to use the Docker image instead of installing Anaconda Python directly. The basic reason for this is that you may encounter installation errors or version differences that mean your experience of running the Spatial Data Science environment is seriously impaired. Furthermore, we are not in a position to provide support for the wide variety of platforms (hardware and software) that students may present.
If you really want to install natively, despite everything we said above, then you will need Anaconda Python (Python 3 64-bit) to be able to install the programming environment.
If you are using Mac OS, you can download Anaconda directly from here and then install it.
If you are using Windows 10 or 11, things are a bit trickier. Using Anaconda on Windows is not pleasant, as many packages are only available for Unix/Linux, which makes it hard to configure the Anaconda environment. Moreover, using the Windows CMD, PowerShell, or the Anaconda prompt is unpleasant. Therefore, we recommend using Miniconda for WSL (Windows Subsystem for Linux) on your Windows machine.
The installation of Miniconda for WSL consists of the following steps:
- Install WSL (Ubuntu for Windows) and Windows Terminal on Windows, following this link.
- Start a Windows Terminal of Ubuntu. Note that all following steps are on Windows Terminal instead of CMD or Anaconda prompt.
- Go to https://repo.anaconda.com/miniconda/ to find the list of Miniconda releases. Select the latest release for your machine. I have a 64-bit x86 computer, so I choose Miniconda3-latest-Linux-x86_64.sh. If you have a 32-bit computer, you would select Miniconda3-latest-Linux-x86.sh.
- From the terminal run `wget https://repo.anaconda.com/miniconda/[YOUR VERSION]`. Example: `wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`.
- Run the installation script: `bash Miniconda3-latest-Linux-x86_64.sh`. Replace the name of the `.sh` file if needed.
- Read the license agreement and follow the prompts to accept. When it asks whether you'd like the installer to prepend it to the path, say yes.
- Reload the .bash configs so WSL knows where conda is installed: `source ~/.bashrc`.
- Check which Python is now found on your path. Mine is `/home/user_name/anaconda3/bin/python`. If it doesn’t have anaconda in the path, do the next step. Otherwise, skip the next step.
- Manually add the Anaconda bin folder to your PATH. To do this, I added `export PATH=/home/user_name/anaconda3/bin:$PATH` to the bottom of my `~/.bashrc` file. Do replace user_name with your username.
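The download and PATH steps above can be sketched as a pair of small shell helpers. This is only an illustration: `installer_for_arch` and `on_path` are hypothetical names (not part of conda or Miniconda), the two installer file names are the ones mentioned in the steps above, and `$HOME/anaconda3/bin` is assumed to be where the bin folder ended up.

```shell
# Sketch only: map a machine architecture (as printed by `uname -m`)
# to one of the Miniconda installer file names mentioned above.
# `installer_for_arch` is a hypothetical helper, not a conda command.
installer_for_arch() {
  case "$1" in
    x86_64)    echo "Miniconda3-latest-Linux-x86_64.sh" ;;
    i386|i686) echo "Miniconda3-latest-Linux-x86.sh" ;;
    *)         echo "unrecognised architecture: $1" >&2; return 1 ;;
  esac
}

# Succeed if directory $1 is one of the colon-separated entries in $PATH.
# `on_path` is also a hypothetical helper.
on_path() {
  case ":$PATH:" in
    *":${1%/}:"*) return 0 ;;
    *)            return 1 ;;
  esac
}

# Print (but do not run) the download command for this machine.
installer="$(installer_for_arch "$(uname -m)")" \
  && echo "wget https://repo.anaconda.com/miniconda/${installer}"

# Assumed install location; adjust it to match your own setup.
on_path "$HOME/anaconda3/bin" || echo "anaconda3/bin is not on PATH yet"
```

If the last line prints its message, that is the case where the manual PATH step applies.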
Note: Miniconda is a free minimal installer for conda. It is a lighter version of Anaconda and makes configuring a new environment easy.
Note: if you have installed Anaconda for Windows, you don’t need to uninstall it before installing conda for WSL. These two pieces are separate.
After downloading and installing Anaconda Python you will also need to download the environment's [configuration file](https://raw.githubusercontent.com/jreades/sds_env/master/conda/environment_py.yml). This file (known as a 'YAML file') tells Anaconda Python what versions of what libraries to install on your computer. The idea is that all users end up with the same versions of the key programming libraries.
You will then need to work out how to use the Terminal (Mac) or Windows Terminal (Windows) in order to navigate to the folder holding the downloaded configuration file. It will be something like `cd ~/Downloads/` to reach your downloads folder.
At this point you may start the installation by typing:
conda env create -n sds2021 -f environment_py.yml
And then hit the return key to run the command.
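Once the command finishes, it can be worth confirming that the environment really exists before going further. A minimal sketch, assuming the usual `conda env list` layout with the environment name in the first column; `has_env` is a hypothetical helper name, not a conda command.

```shell
# Sketch only: succeed if an environment called $1 appears in the first
# column of `conda env list` output supplied on standard input.
# `has_env` is a hypothetical helper name.
has_env() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Typical use once conda is installed (skipped quietly if it is not).
if command -v conda >/dev/null 2>&1; then
  if conda env list | has_env sds2021; then
    echo "sds2021 is installed"
  else
    echo "sds2021 not found; try the create command again"
  fi
fi
```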
To make this new 'configuration' visible in JupyterLab you then need to run the following two commands...
conda activate sds2021
python -m ipykernel install --name sds2021 --display-name "CASA2021"
Note: when you connect to Jupyter, you should see a second tile called `CASA2021`. Users of Docker will see only `Python3`. You should always use the `CASA2021` tile (which represents a separate computing environment) in Anaconda instead of the default `Python3` tile.
Note: if you get a ‘No permission’ warning from the commands above, add `sudo` to the start of the failing command and run it again. You will need to enter your WSL or Mac password.
Still using the Terminal (Windows users, please use Windows Terminal), type:
conda activate sds2021
jupyter lab
Do not run Jupyter Lab from the Anaconda Navigator since it does not configure the spatial analysis libraries correctly.
- Start up the UCL VPN.
- Connect to JupyterHub
- Authenticate using UCL credentials.
- Create a new terminal: File > New > Terminal
I now think that these instructions are not correct (see below for the alternative), in the sense that the use of a symlink can cause problems and duplicated environments down the line. Anyway, type the following, but note that you need to replace `...` with the appropriate path (this will be obvious once logged in):
course_name="casa0013"
ln -s /shared/.../casa/${course_name} $HOME/${course_name}
conda config --add envs_dirs /shared/groups/.../casa/${course_name}/envs
curl -o /tmp/casa0013.yml https://raw.githubusercontent.com/jreades/sds_env/master/conda/environment_py.yml
conda env create -n casa0013 -f /tmp/casa0013.yml
I now think that the correct way to do this is:
course_name="casa0013"
conda config --add envs_dirs /shared/groups/.../casa/envs
curl -o /tmp/casa0013.yml https://raw.githubusercontent.com/jreades/sds_env/master/conda/environment_py.yml
conda env create -p /shared/groups/.../casa/envs/${course_name} -f /tmp/casa0013.yml
However, note that this now means you have `.../casa/casa0013/envs/casa0013...`, so it might be more sensible to set `envs_dirs` to just `...casa/envs` and then have per-module environments underneath that.
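The per-module layout suggested here can be sketched as follows. This is an illustration only: `env_prefix`, the `/tmp/casa` base directory, and `another_module` are hypothetical stand-ins, since the real shared path is elided above.

```shell
# Sketch only: build the prefix for a per-module environment that sits
# under one shared envs directory. `env_prefix` is a hypothetical helper.
env_prefix() {
  # ${1%/} drops a trailing slash from the base directory, if present.
  printf '%s/envs/%s\n' "${1%/}" "$2"
}

# /tmp/casa stands in for the real (elided) shared path, and
# another_module for any second module that might be added later.
for module in casa0013 another_module; do
  env_prefix /tmp/casa "$module"
done
```

Each printed path is the sort of value that could be passed to `conda env create -p`, with `envs_dirs` pointing at the shared `envs` folder so the environments can be found by name.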
Several shortcomings in the existing approach of generating `environment_py.yml` were identified and need to be tweaked in the Makefile:
- Remove anything with ‘linux’ in it.
- Remove SOMPY and mrmr.
- Remove version from gitpython.
- Remove python-graphviz entirely.
Additional issues may exist with replication to non-Linux systems.
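The tweaks listed above could be sketched as a small filter over the YAML file. This is not the Makefile's actual rule, only one possible way to express the same clean-up: `clean_env_yml` is a hypothetical name, and the package versions in the demonstration are made up for illustration.

```shell
# Sketch only: filter an environment YAML file on standard input.
#  - drop any line mentioning 'linux', SOMPY, mrmr or python-graphviz
#  - strip the version pin from the gitpython entry
# `clean_env_yml` is a hypothetical helper name.
clean_env_yml() {
  grep -v -i -e 'linux' -e 'sompy' -e 'mrmr' -e 'python-graphviz' \
    | sed -E 's/^([[:space:]]*-[[:space:]]*gitpython)[=<>!~].*$/\1/'
}

# Demonstration on a made-up fragment (versions are illustrative only).
clean_env_yml <<'EOF'
dependencies:
  - gitpython=3.1.0
  - python-graphviz=0.16
  - numpy=1.19.0
EOF
```

Against the real file this would be used as `clean_env_yml < environment_py.yml`, redirecting the result to whatever file name suits.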
To connect to JupyterHub:
- Start up the UCL VPN.
- Connect to JupyterHub
- Authenticate using UCL credentials.
- If you see a URL that ends in `tree?`, please replace this with `lab?` to get the JupyterLab interface and not the original Jupyter Notebook interface.
- Create a new terminal: File > New > Terminal
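The `tree?` to `lab?` swap in the steps above is easy to do by hand, but it can also be sketched in the shell. The host name below is a placeholder rather than the real JupyterHub address, and `to_lab_url` is a hypothetical helper.

```shell
# Sketch only: swap a trailing "tree" or "tree?..." for "lab" or
# "lab?..." in a Jupyter URL. `to_lab_url` is a hypothetical helper.
to_lab_url() {
  printf '%s\n' "$1" | sed -E 's#/tree(\?.*)?$#/lab\1#'
}

# Placeholder host; substitute the address shown in your browser.
to_lab_url 'https://jupyterhub.example.org/user/someone/tree?'
```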
Note that you need to replace `...` with the appropriate path (this will be obvious once logged in):
course_name="casa0013"
conda config --append envs_dirs /shared/groups/.../casa/envs
jupyter contrib nbextension install --user
- Add mgwr to image?
This draws heavily on Dani Arribas-Bel's work for Liverpool. If you use this, you should cite him.
@software{hadoop,
author = {{Dani Arribas-Bel}},
title = {\texttt{gds\_env}: A containerised platform for Geographic Data Science},
url = {https://github.com/darribas/gds_env},
version = {3.0},
date = {2019-08-06},
}