AWS_Redshift_Postgresql

Redshift cluster analysis with a PostgreSQL database

Primary language: Python

AWS Redshift Cluster

ZAGI retail company sales department database


STEPS

Note: Each step links to downloads or instructions. These vary by OS (mine is Windows). The setup uses a single ETL cluster.

  1. ZAGDB (.sql in this repo for reference)

  2. PostgreSQL (https://www.postgresql.org/download/)

  3. pgAdmin (https://www.pgadmin.org/download/pgadmin-4-windows/)

  4. Create the ZAGDB database in pgAdmin

    • Note down credentials
    • CREATE TABLE
    • INSERT VALUES
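
The CREATE TABLE and INSERT steps can also be run from Python rather than typed into pgAdmin. The following is a minimal sketch, assuming the third-party `psycopg2` package is installed. The `vendor` table, its columns, and the sample rows are hypothetical placeholders; the actual ZAGI schema is in the .sql file in this repo.

```python
# Hypothetical single-table example; the real ZAGDB schema is in the repo's .sql file.
CREATE_VENDOR = """
CREATE TABLE IF NOT EXISTS vendor (
    vendorid   CHAR(2)     PRIMARY KEY,
    vendorname VARCHAR(25) NOT NULL
);
"""

# %s placeholders let the driver quote values safely.
INSERT_VENDOR = "INSERT INTO vendor (vendorid, vendorname) VALUES (%s, %s);"

# Placeholder rows for illustration only.
SAMPLE_ROWS = [("V1", "Example Vendor A"), ("V2", "Example Vendor B")]


def create_and_insert(conn_params: dict) -> None:
    """Create the table and insert the sample rows in one transaction."""
    import psycopg2  # third-party; imported here so the module loads without it

    conn = psycopg2.connect(**conn_params)
    try:
        with conn:  # commits on success, rolls back on error
            with conn.cursor() as cur:
                cur.execute(CREATE_VENDOR)
                cur.executemany(INSERT_VENDOR, SAMPLE_ROWS)
    finally:
        conn.close()
```

A call might look like `create_and_insert({"host": "localhost", "dbname": "zagdb", "user": "postgres", "password": "..."})`, using the credentials noted above. The database name and user shown here are assumptions.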

  5. PyCharm (https://www.jetbrains.com/pycharm/download/)
    • Back-end coding

Result: data extracted from the ZAGDB database and saved
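
One way the extract-and-save step might look is sketched below. It assumes a CSV output format, which this README does not specify, and works with any DB-API cursor (for example one from `psycopg2`). The function names are illustrative, not taken from the repo.

```python
import csv
from typing import Any, Iterable, Sequence, TextIO


def write_rows_to_csv(header: Sequence[str], rows: Iterable[Sequence[Any]], out: TextIO) -> int:
    """Write a header line and all rows to `out`; return the number of data rows."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


def export_table(cursor, table: str, out: TextIO) -> int:
    """Dump one table through a DB-API cursor.

    `table` is interpolated into the SQL text, so it must come from a
    trusted, hard-coded list and never from user input.
    """
    cursor.execute(f"SELECT * FROM {table}")
    header = [col[0] for col in cursor.description]  # first field is the column name
    return write_rows_to_csv(header, cursor, out)
```

Passing an open file, as in `with open("vendor.csv", "w", newline="") as f: export_table(cur, "vendor", f)`, writes one CSV per table.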

  6. Amazon AWS (https://aws.amazon.com/)

  7. AWS S3 (https://s3.console.aws.amazon.com/s3/home?region=us-east-2)

Result: data loaded into the AWS S3 bucket
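
The upload can be done through the S3 console or from Python. Below is a minimal sketch using the third-party `boto3` package, assuming AWS credentials are already configured on the machine. The `zagi` key prefix and the bucket name in the usage note are hypothetical.

```python
from pathlib import Path


def s3_key_for(local_path: str, prefix: str = "zagi") -> str:
    """Build the object key: <prefix>/<file name>. The default prefix is a placeholder."""
    return f"{prefix.strip('/')}/{Path(local_path).name}"


def upload_csv(local_path: str, bucket: str, prefix: str = "zagi") -> str:
    """Upload one local file to the bucket and return the key it was stored under."""
    import boto3  # third-party; imported here so the module loads without it

    key = s3_key_for(local_path, prefix)
    boto3.client("s3").upload_file(local_path, bucket, key)
    return key
```

For example, `upload_csv("vendor.csv", "my-example-bucket")` would store the file as `zagi/vendor.csv`.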

  8. AWS Redshift cluster: follow steps 1-5 of the getting started guide (https://docs.aws.amazon.com/redshift/latest/gsg/getting-started.html)

Note down the cluster username and password
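
Once the cluster is running, data in S3 is typically loaded into a Redshift table with the `COPY` command. The sketch below only builds the SQL text; it assumes CSV files with a header row and an IAM role attached to the cluster that can read the bucket. The table, bucket, key, and role ARN are all placeholders.

```python
def build_copy_sql(table: str, bucket: str, key: str, iam_role_arn: str) -> str:
    """Return a Redshift COPY statement for one CSV object in S3.

    All arguments are interpolated into the SQL text, so they must be
    trusted, hard-coded values rather than user input.
    """
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{key}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"  # skip the header line written during extraction
    )
```

The resulting statement can be pasted into the Redshift query editor or executed over a database connection from Python.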

  9. AWS Redshift console (https://us-east-2.console.aws.amazon.com/redshift/home?region=us-east-2#)

Ways to query:

  • Redshift Query Editor

  • PyCharm Execute Query
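
For the PyCharm route, Redshift accepts PostgreSQL-protocol connections, so a query can be run with the third-party `psycopg2` package. This is a sketch under assumptions: the `REDSHIFT_*` environment variable names are made up for this example, and the port `5439` and database `dev` defaults should be checked against your own cluster settings.

```python
import os
from typing import Mapping, Optional, Sequence


def redshift_conn_params(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read connection settings from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    return {
        "host": env["REDSHIFT_HOST"],  # cluster endpoint, without the port
        "port": int(env.get("REDSHIFT_PORT", "5439")),
        "dbname": env.get("REDSHIFT_DB", "dev"),
        "user": env["REDSHIFT_USER"],
        "password": env["REDSHIFT_PASSWORD"],
    }


def run_query(sql: str, params: Optional[Sequence] = None) -> list:
    """Execute one SELECT statement and return all rows."""
    import psycopg2  # third-party; imported here so the module loads without it

    conn = psycopg2.connect(**redshift_conn_params())
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()
    finally:
        conn.close()
```

Keeping the username and password in environment variables avoids committing them to the repo.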

Virtual Environment:

  • Optional; useful when switching between different Python or package versions.
  • A Python virtual environment isolates a project's dependencies, so the same package and library versions are used every time the code runs. This also helps make results reproducible.
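
A small check along these lines can confirm that a script is running inside a virtual environment and that installed packages match the expected versions. It uses only the standard library (Python 3.8+). The package names and versions in the usage note are illustrative and not taken from this repo.

```python
import sys
from importlib import metadata
from typing import Dict, Optional, Tuple


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def check_pins(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Compare expected versions with installed ones.

    Returns {package: (expected, installed)} for every mismatch;
    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

For example, `check_pins({"psycopg2": "2.9.9", "boto3": "1.34.0"})` returns an empty dict when both packages are installed at exactly those versions.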


If you would like to discuss my project or any new opportunities, please email me at p.ankur.715@gmail.com or connect with me on LinkedIn at https://www.linkedin.com/in/ankurpatel715/.