A devcontainer for Data Engineers that sets up a development environment with tools such as Python, Terraform, dbt, Jupyter Notebooks for VS Code, the Snowflake extension for VS Code, and the SQL Server extension. This can be used with both VS Code and GitHub Codespaces.
To continue, make sure you have Docker Desktop installed OR use GitHub Codespaces.
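Before starting the local option, it can help to confirm the prerequisites are on your PATH. The snippet below is a minimal sketch, not part of this repo: `have_cmd` is a hypothetical helper name, and it assumes the VS Code command-line launcher is installed as `code`. It only reports what it finds and changes nothing.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: succeeds if the command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Tools used by the local option. "code" assumes the VS Code CLI launcher
# has been added to PATH, which is not the case on every install.
for tool in git docker code; do
    if have_cmd "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

If `docker` is reported missing and you would rather not install it, the Codespaces option does not need it.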
Option 1: Local VS Code
- Clone the repo and connect to it in VS Code:
$ cd your/desired/repo/location
$ git clone https://github.com/MartyC-137/DataEng_devcontainer
- Download the Dev Containers extension from the VS Code marketplace
- Press Cmd + Shift + P (Mac) or Ctrl + Shift + P (Windows) to open the Command Palette. Type in Dev Containers: Open Folder in Container and select the repo directory.
- Wait for the container to build and the dependencies to install
Option 2: GitHub Codespaces
- Fork this repo
- From the repo page in GitHub, select the green <> Code button and choose Codespaces
- Click Create Codespace on Main, or check out a branch if you prefer
- Wait for the container to build and the dependencies to install
- Start developing!
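The Codespaces steps above can also be approximated from a terminal with the GitHub CLI. This is a sketch under assumptions: `gh` is installed and authenticated, the default branch is `main`, and `codespace_from_fork` is a hypothetical function name, not something this repo provides. A brand-new fork may take a moment to become available, so the second command can fail if run too quickly.

```shell
#!/bin/sh
# Upstream repository named in the clone step of this README.
UPSTREAM="MartyC-137/DataEng_devcontainer"

# codespace_from_fork GITHUB_USER (hypothetical helper):
# fork the upstream repo without cloning it locally, then create a
# codespace on the fork's main branch.
codespace_from_fork() {
    user=$1
    gh repo fork "$UPSTREAM" --clone=false &&
    gh codespace create --repo "$user/${UPSTREAM#*/}" --branch main
}

# Example call (replace with your own GitHub username):
# codespace_from_fork your-username
```

The web UI remains the simpler route; the CLI form is mainly useful if you create codespaces often.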
- Python 3.9
- Pandas
- SQLAlchemy
- PySpark
- PyArrow
- Polars
- Prefect and all required Python dependencies
- confluent-kafka
- scikit-learn
- Snowpark
- ipykernel
- Google Cloud SDK
- Azure CLI
- GitHub CLI
- GitLens
- GitHub Pull Requests
- dbt-core
- dbt-postgres
- dbt-bigquery
- dbt extensions for VS Code
- Snowflake for VS Code
- MS SQL Server for VS Code
- Terraform
- Jupyter Notebooks for VS Code
- Docker
- Spark
- JDK version 11
- XML tools
- YAML tools
- Oh-My-Posh PowerShell themes
- Popular VS Code themes (GitHub, Atom One, Material Icons etc.)
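Once the container has built, a quick way to confirm that some of the command-line tools listed above are present is to ask each for its version from the integrated terminal. The sketch below assumes the executables are named `python`, `terraform`, `java`, `gcloud`, and `gh` inside the container; `show_version` is a hypothetical helper, and a tool that is absent is reported rather than treated as an error.

```shell
#!/bin/sh
# show_version TOOL [ARGS...] (hypothetical helper):
# print the first line of the tool's version output, or note that it is absent.
show_version() {
    tool=$1
    if command -v "$tool" >/dev/null 2>&1; then
        # Some tools (java, for one) write version info to stderr, so merge streams.
        "$@" 2>&1 | head -n 1
    else
        echo "$tool: not found on PATH"
    fi
}

show_version python --version
show_version terraform version
show_version java -version
show_version gcloud --version
show_version gh --version
```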
Feel free to modify the Dockerfile, devcontainer.json, or requirements.txt file to include any other tools or packages that you need for your development environment.
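As one illustration of such a change, the sketch below appends a package to a requirements file only if it is not already listed. It assumes the file holds one package per line; `add_requirement` is a hypothetical helper and `duckdb` is just an example package, neither of which comes from this repo.

```shell
#!/bin/sh
# add_requirement FILE PACKAGE (hypothetical helper):
# append PACKAGE to FILE unless an identical line is already present.
add_requirement() {
    file=$1
    pkg=$2
    touch "$file"
    if grep -qxF "$pkg" "$file"; then
        echo "$pkg already listed in $file"
        return 0
    fi
    # If the file does not end with a newline, add one so the new
    # entry does not get glued onto the last existing line.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        echo >> "$file"
    fi
    echo "$pkg" >> "$file"
    echo "added $pkg to $file"
}

# Example call, run from the repo root:
# add_requirement requirements.txt duckdb
```

After editing the file, the container typically needs rebuilding for the change to take effect, for example with Dev Containers: Rebuild Container from the Command Palette.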
If you'd like to contribute to this project, please open a pull request and I'll review it as soon as possible.