/mbd-bidw

Business Intelligence and Data Warehousing

License: GNU General Public License v3.0 (GPL-3.0)

Business Intelligence and Data Warehouse Course

This repository contains everything needed to run the course's hands-on labs.

Repository contents (by session)

  • Additional articles and documents
  • MySQL Workbench Schemas
  • ETL processes
  • Datasets
  • Tableau files
  • Videos

Software Installation

  • Data Warehouse: MySQL (database) and MySQL Workbench (database modeling and SQL development)
  • ETL: Pentaho Data Integration (PDI)
  • Business Intelligence/Data Visualization: Tableau Desktop
  • Self-Service of Data Lakes: Dremio

Steps

Install Java
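A quick way to confirm that Java is available before installing the other tools is sketched below. The helper name `have_cmd` is hypothetical and only used for illustration; the sketch assumes a POSIX shell (Terminal on Mac, or Git Bash or similar on Windows).

```shell
# Return 0 if the given command is on PATH, non-zero otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
  # `java -version` writes to stderr, so redirect it to stdout.
  java -version 2>&1 || echo "java is on PATH but failed to run."
else
  echo "Java not found on PATH; install it before continuing."
fi
```

If the version is printed, PDI and Dremio should be able to find Java as well.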

Install MySQL

  • Download the right version of MySQL and MySQL Workbench for your OS (in our case: MySQL Community Server 8.0.13 and MySQL Workbench 8.0.13).
  • Install the programs, following the installer's instructions:
    • [Windows] During the installation you will set the password for the root user (choose IEMBD2018 or a password that you will remember). Consider a custom installation and select only MySQL Server and MySQL Workbench as the components to install.
    • [Mac] During the installation you will set the password for the root user (choose IEMBD2018 or a password that you will remember). If you forget the password, you can change it from System Preferences.
    • PDI and MySQL Workbench only support legacy password encryption, not the new strong password encryption available in MySQL 8, so choose the legacy authentication method if the installer asks.

Note: on Microsoft Windows there is a single installer; on Mac there are two files.

Remember to start the server before using the database. Then open MySQL Workbench and create a new connection with your user and password, leaving the other connection parameters at their default values.
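The server check and the legacy password setting can also be done from a terminal. The sketch below assumes the `mysql` command-line client that ships with MySQL Server is on your PATH, and that "legacy password encryption" refers to the `mysql_native_password` plugin of MySQL 8.0. The helper name `legacy_auth_sql` is hypothetical, and it does not escape quotes inside the password.

```shell
# Build the SQL statement that switches an account to the legacy
# authentication plugin. $1 = user, $2 = host, $3 = password.
# Assumption: the password contains no single quotes.
legacy_auth_sql() {
  printf "ALTER USER '%s'@'%s' IDENTIFIED WITH mysql_native_password BY '%s';" "$1" "$2" "$3"
}

# Example commands (not run automatically; each prompts for the root password):
#   mysql -u root -p -e "SELECT VERSION();"
#   mysql -u root -p -e "$(legacy_auth_sql root localhost IEMBD2018)"
```

The first command only confirms that the server is running and accepts your credentials. The second is needed only if the account was created with the new authentication method.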

Install PDI

We will use the community edition of Pentaho Data Integration (a.k.a. PDI). It can be downloaded from this link (in our case: pdi-ce-8.1.0.0-365.zip).

  • Download the file and unzip it.
    • [Mac] Move the data-integration folder into the Applications folder
    • [Windows] Move the data-integration folder into the C:\ folder
  • Open PDI
    • [Windows] Double-click spoon.bat inside the data-integration folder
    • [Mac] Open the terminal and execute:
cd /Applications/data-integration/
./spoon.sh
  • [Optional] [Mac] Make Data Integration.app launchable with a double-click by clearing its quarantine attribute from the terminal:
sudo xattr -dr com.apple.quarantine /Applications/data-integration/Data\ Integration.app
  • Install MySQL 5.X plugin for PDI:
    • Open PDI
    • Go to the Tools menu > Marketplace > MySQL Plugin and install it
    • Restart PDI
  • Configure a JDBC connection to MySQL 8.x in PDI:
    • Download the MySQL 8.x JDBC driver (platform independent, zip) to the computer running Pentaho from: https://dev.mysql.com/downloads/connector/j/
    • Unzip the file mysql-connector-java-8.0.13.zip
    • Copy mysql-connector-java-8.0.13.jar to the Pentaho lib folder. [Windows]: C:\data-integration\lib. [Mac OS]: …/Applications/data-integration/lib
    • Configure a Generic Database connection in Pentaho: (1) Connection URL: jdbc:mysql://localhost:3306/<database_name> (2) Driver Class Name: com.mysql.cj.jdbc.Driver (3) the same user and password as before
    • If the connection fails with an error saying that the server time zone value (for example 'AEST') is unrecognized or represents more than one time zone, use this URL instead: jdbc:mysql://localhost:3306/<database_name>?useLegacyDatetimeCode=false&serverTimezone=UTC
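The two connection URL variants above differ only in the time zone parameters. As a minimal sketch, a small shell helper can build either one. The function name `jdbc_url` and the database name `sales` are hypothetical; host and port are the defaults used in this guide.

```shell
# Build a MySQL JDBC URL for a local server.
# $1 = database name, $2 = optional server time zone (e.g. UTC).
jdbc_url() {
  url="jdbc:mysql://localhost:3306/$1"
  if [ -n "${2:-}" ]; then
    # Append the time zone workaround parameters only when requested.
    url="${url}?useLegacyDatetimeCode=false&serverTimezone=$2"
  fi
  printf '%s\n' "$url"
}

jdbc_url sales        # plain URL
jdbc_url sales UTC    # URL with the time zone workaround
```

Paste the printed value into the connection URL field of the Generic Database connection.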

Install Tableau Desktop

Student licenses are available through the Academic Partnership. Tableau has versions for Mac and Windows. Follow these instructions:

  • Download the latest version of Tableau Desktop here.
  • Copy the Tableau Desktop license from campus.
  • Install the software following the instructions on the screen.
  • Update your license in the application: Help menu -> Manage Product Keys

Install Dremio

We will use the community edition of Dremio Server. It can be downloaded from this link. Dremio Server requires Java, so install Java first. Then:

  • Install Dremio using the installer.
  • Start Dremio:
    • [Windows]: Start from the Start Menu.
    • [Mac]: Launch Dremio from Applications.
  • You can now navigate to the Dremio UI at http://localhost:9047.
  • Download and install the Dremio ODBC driver for your OS from https://docs.dremio.com/drivers/
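To check from a terminal that Dremio has started, you can query the UI address. The sketch below assumes `curl` is installed; the helper names `http_status` and `describe_status` are hypothetical and used only for illustration.

```shell
# Default Dremio UI address from the step above.
DREMIO_URL="http://localhost:9047"

# Print the HTTP status code for a URL; curl prints 000 when it cannot connect.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" || true
}

# Turn a status code into a short human-readable message.
describe_status() {
  case "$1" in
    2??|3??) echo "up" ;;
    000|"")  echo "unreachable" ;;
    *)       echo "responding with HTTP $1" ;;
  esac
}

# Usage: describe_status "$(http_status "$DREMIO_URL")"
```

Dremio can take a little while to start, so an "unreachable" result right after launch may only mean that it is still starting.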

FAQ

Is there a Pentaho Release Product Version Matrix?

Yes! You can find it here.

Any recommendation for MySQL SQL syntax?

Yes, check the MySQL™ Notes for Professionals book and the MySQL documentation.

How can I get a copy of this repository?

Fork it using GitHub and GitHub Desktop. Are you interested in how GitHub works? Start here.