retail_db

This repository contains data sets copied from the Cloudera QuickStart VM.

Here are the instructions to set up this repository.

  • Clone the repository: git clone https://github.com/dgadiraju/retail_db.git
  • This creates a folder called retail_db.
  • The folder contains 6 subfolders:
    • customers
    • departments
    • categories
    • products
    • orders
    • order_items
  • The files are plain text. Records are delimited by the newline character, and fields within each record are delimited by commas.
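
As a minimal sketch of reading this format (one record per line, fields separated by commas), the Python snippet below uses the standard csv module. The function name read_records and the sample rows are hypothetical and only illustrate the delimiters; they do not reflect the actual column layout of any table.

```python
import csv
from typing import Iterable, Iterator, List


def read_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each non-empty line as a list of comma-separated fields."""
    for fields in csv.reader(lines):
        if fields:  # csv.reader returns an empty list for a blank line
            yield fields


# Made-up rows for illustration only; real files have their own columns.
sample = ["1,alpha,beta\n", "\n", "2,gamma,delta\n"]
records = list(read_records(sample))

for record in records:
    print(record)
```

The same function accepts an open file object, for example one opened with open(path, newline="") where path points to a data file inside one of the subfolders (the exact file names are not listed here, so path is left as an assumption).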

You can also create the tables, with all their relationships, and load the data into them by using create_db.sql.

You can sign up for our courses to learn about Spark, Kafka, and other important technologies by clicking here.