Template for Data Science, Machine Learning and Deep Learning projects.
Clone the repository into the desired folder:
- With SSH

  ```bash
  git clone git@github.com:LucasVmigotto/.datascience.git
  ```

- With HTTPS

  ```bash
  git clone https://github.com/LucasVmigotto/.datascience.git
  ```

- With GitHub CLI

  ```bash
  gh repo clone LucasVmigotto/.datascience
  ```
This template project aims to help bootstrap the development of data science projects by creating an environment with the tools commonly required, such as a Linux operating system, Jupyter Notebooks, and LLM models.
Although you can clone the repository and get started with Docker alone, it is highly recommended that you take advantage of Visual Studio Code's excellent support for container-based development through the Dev Containers extension.
Inside the `.devcontainer` folder there is a `devcontainer.json` specification file that takes care of providing all the tools listed above for a data science project. If needed, you can disable individual services that are not relevant to your scenario: just comment out, under the `runServices` key, the services you do not want to be started with the development container, as in the sketch below.
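For illustration only, a trimmed `runServices` list might look like the following sketch; the service names here are assumptions and should match the ones defined in this repository's Docker Compose file:

```jsonc
{
  // Other devcontainer.json keys omitted for brevity
  "runServices": [
    "jupyter"          // keep the Jupyter service running
    // "ollama",       // commented out: Ollama will not be started
    // "open-webui"    // commented out: Open WebUI will not be started
  ]
}
```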
Copy and rename the `.env.example` file to `.env`. Customize the values if necessary.
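A minimal way to do that from the repository root:

```bash
# Create your local environment file from the provided example
cp .env.example .env
```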
- Create a Docker volume named `ollama`:

  ```bash
  docker volume create ollama
  ```

  By creating a main volume, you will be able to share the downloaded models among other Ollama containers.
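As a sketch of how that sharing works, assuming the official `ollama/ollama` image (which stores models under `/root/.ollama` and serves on port 11434), another container could mount the same volume:

```bash
# Hypothetical extra Ollama container reusing the shared "ollama" volume;
# it sees the same set of pulled models as the Compose service
docker run -d --name ollama-extra \
  -v ollama:/root/.ollama \
  -p 11435:11434 \
  ollama/ollama
```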
This is a basic environment prepared for starting application development. It comes with Python, `git`, and `zsh` with Oh My Zsh!
Jupyter
With this service, you can connect to a Jupyter environment and use it to test ideas in Jupyter Notebooks. When editing an `.ipynb` file inside Visual Studio Code, you can connect to the Jupyter server simply by informing the connection URL `http://jupyter:8888/tree`.
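If you want to confirm that the Jupyter server is reachable from inside the development container before connecting, a quick probe against the same URL (hostname and port taken from above) could be:

```bash
# Should print 200 if the Jupyter server is up (the tree or login page responds)
curl -sL -o /dev/null -w "%{http_code}\n" http://jupyter:8888/tree
```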
This service has little direct use on its own, aside from being consumed from a notebook (for example, a notebook querying a model). However, it is possible to interact with the service directly through its CLI with the following command:
```bash
# Or any other model that has already been pulled
docker compose exec ollama ollama run llama3
```
The first time it starts, the service has no model downloaded. To download one, you can make a request to Ollama's API to pull the desired model. The following example shows how to pull Llama 3 8B:
```bash
curl http://localhost:11434/api/pull \
  -d '{ "model": "llama3" }'
```
This example assumes that the command is executed in a terminal on the host. If you want to execute it in a terminal inside Visual Studio Code (that is, inside the development container), change the request URL to `http://ollama:11434/api/pull`; in this case, it is necessary to use the service's hostname inside the Docker network that binds all the services together, as sketched below.
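From a terminal inside the development container, the same pull would look like this (only the hostname changes; the port stays Ollama's default, 11434):

```bash
# Run from inside the dev container, where "ollama" resolves on the Compose network
curl http://ollama:11434/api/pull \
  -d '{ "model": "llama3" }'
```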
You can access `localhost:8080` to open the Open WebUI visual interface and test the models pulled with Ollama.
You can access `localhost:7474` to use the graphical interface and try some queries with the Cypher query language.
You can access `localhost:9200` to verify that the service is running.
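Assuming the service listening on port 9200 is Elasticsearch (9200 is its default REST port; adjust if this project maps it differently), a quick check from the host could be:

```bash
# Should return a small JSON document with the cluster name and version
curl http://localhost:9200
```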
List all Docker containers:

```bash
docker ps -a
```
Remove Docker Compose containers:

```bash
docker compose rm --stop -f
```
Prune containers:

```bash
docker container prune --force
```
List all Docker images:

```bash
docker image ls -a
```
Remove dangling Docker images:

```bash
docker image rm -f $(docker image ls --filter "dangling=true" -aq)
```
List all Docker volumes:

```bash
docker volume ls
```
Prune Docker volumes:

```bash
docker volume prune --force
```
WARNING: If you want to remove ALL Docker images, just remove the `--filter` flag and its argument:

```bash
docker image rm -f $(docker image ls -aq)
```