/HadoopBI

BI solution based on Hadoop


BI on Hadoop

In this project, the WideWorldImporters sample data warehouse, including its SSIS package, was recreated and redesigned for processing with Hadoop (HDInsight). The corresponding Azure SQL sample database serves as the main data source. Also included is a Python script that enriches the solution with data from the Twitter API. The project contains Sqoop, Hive, Oozie, and Python source files. The XML workflow files can be imported into a new Oozie project.

WideWorldImporters example: https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0
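
As a rough illustration of the Sqoop step described above, the following Python sketch assembles a `sqoop import` command for a single table of the Azure SQL source database. It is not taken from the project files. The helper name `build_sqoop_import`, the server name, the credentials paths, and the HDFS target directory are all hypothetical placeholders; only the Sqoop flags themselves are standard import options.

```python
import shlex


def build_sqoop_import(server, database, schema, table,
                       username, password_file, target_dir, num_mappers=1):
    """Return the argument list for a `sqoop import` of one SQL Server table.

    Hypothetical helper for illustration; it only builds the command,
    it does not run it.
    """
    jdbc_url = f"jdbc:sqlserver://{server}:1433;database={database}"
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,   # file holding the password
        "--table", table,
        "--target-dir", target_dir,         # HDFS output directory
        "--num-mappers", str(num_mappers),
        # Connector-specific argument: non-default SQL Server schema.
        "--", "--schema", schema,
    ]


if __name__ == "__main__":
    # All values below are placeholders, not the project's real settings.
    args = build_sqoop_import(
        server="example.database.windows.net",
        database="WideWorldImporters",
        schema="Sales",
        table="Orders",
        username="etl_user",
        password_file="/user/etl/sqoop.pwd",
        target_dir="/data/staging/sales_orders",
    )
    # Print a shell-safe command line (shlex.join needs Python 3.8+).
    print(shlex.join(args))
```

Building the command as a list rather than a single string keeps the JDBC URL, which contains a semicolon, from being split by the shell. The same arguments could equally be placed in a Sqoop action of an Oozie workflow.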