AWS Glue tute
docker compose up -d
docker compose ps
docker compose logs -f
(ctrl+c)
- Go to http://localhost:8888
Makefile
Alternatively, use the make-based dev workflow:
make up
make ps
make logs
make start
make enter
make down
...
- See the Makefile targets for more dev routines.
Quickstart
make enter
cd jupyter_workspace
spark-submit quickstart.py
JupyterLab
- In the JupyterLab UI, go to File > New > Terminal
jupyter kernelspec list
pyspark
>>> exit()
Sample ETL
cd jupyter_workspace/sample_etl
spark-submit sample.py
pytest
AWS
- In the JupyterLab terminal, run the following to check that your AWS account setup works:
aws sts get-caller-identity
aws s3 ls s3://awsglue-datasets/examples/us-legislators/all/persons.json
- If either of the AWS commands above returns an error, revisit docker-compose.yml and check that you have an AWS_PROFILE called dev and that AWS_REGION is set correctly for your case.
- You can use docker-compose.override.arm64.yml to override settings specific to your case.
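The two settings mentioned above can also be sanity-checked from Python before retrying the AWS commands. The following is a minimal sketch, assuming the container receives AWS_PROFILE and AWS_REGION as environment variables. The helper name check_aws_env is hypothetical and is not part of this repo.

```python
import os


def check_aws_env(env=None, expected_profile="dev"):
    """Return a list of problems found; an empty list means the settings look fine.

    Assumption: AWS_PROFILE and AWS_REGION reach the container as plain
    environment variables (for example via docker-compose.yml).
    """
    env = os.environ if env is None else env
    problems = []

    # The README expects a profile called "dev".
    profile = env.get("AWS_PROFILE")
    if profile != expected_profile:
        problems.append(f"AWS_PROFILE is {profile!r}, expected {expected_profile!r}")

    # Any non-empty region passes here; whether it is the right one is up to you.
    if not env.get("AWS_REGION"):
        problems.append("AWS_REGION is not set")

    return problems


if __name__ == "__main__":
    for problem in check_aws_env():
        print(problem)
```

Run it from the JupyterLab terminal or a notebook cell. No output means both variables are present, although it does not confirm that the credentials themselves are valid; aws sts get-caller-identity remains the real test.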
Quickstart Notebook
- Menu: File > Open from Path... and enter quickstart.ipynb when prompted.
- Menu: Kernel > Change Kernel... and select the PySpark kernel from the dropdown list.