- Java
- Python3
- Apache Hadoop
- Apache Hive
- Python development and PyQt packages (dependent on OS)
Recommended: follow the setup instructions at https://www.edureka.co/community/1828/installing-hive-hadoop-in-vm
Tested on Ubuntu 20.04 with Hadoop installed in single-node cluster mode and Hive installed on top.
If you are not already in the code directory:
cd code/
Then run:
pip install -r requirements.txt
Now you can run the system:
python3 main.py
Recursive queries (also called transitive closure queries) are a class of queries that first build an initial answer state (say R) and then use this answer state to execute more queries. The results of each execution are added to the answer state. Execution continues until an iteration adds no new records to the answer state. Here is an example of what a recursive query looks like.
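The evaluation loop described above can be sketched in plain Python, independent of Hive. This is a minimal illustration, not the project's code: the function name `transitive_closure` and the edge data are assumptions made for the example. The answer state R starts as the base edges and grows until an iteration adds nothing new.

```python
def transitive_closure(edges):
    """Return every (src, dst) pair reachable through one or more edges."""
    answer = set(edges)  # initial answer state R
    while True:
        # Join the current answer state with the base edges to derive candidate rows.
        derived = {(a, d) for (a, b) in answer for (c, d) in edges if b == c}
        new_rows = derived - answer
        if not new_rows:
            # Fixpoint reached: this iteration added no new records.
            return answer
        answer |= new_rows


# Hypothetical sample data: a chain a -> b -> c -> d.
edges = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(edges)
print(sorted(closure))
```

Because rows are only ever added and the set of possible pairs is finite, the loop also terminates on cyclic input.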
The goals of this project are to:
- Build a good understanding of the class of recursive queries and how they work.
- Implement a high-level interface for end users to enter and submit a recursive query.
- Have the interface parse the query, decide on the sequence of queries to generate, and then submit these queries (one at a time) to the Hive engine.
- Keep Hive unaware of the recursive query itself: it sees only one isolated query at a time, while the higher-level interface controls the execution of the entire recursive query.
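One way to picture the driver described in this list is the sketch below. It is an assumption-laden illustration rather than the project's implementation: Python's built-in sqlite3 stands in for the Hive engine so the example runs anywhere, and the names `run_recursive`, `seed_sql`, `step_sql`, and the `edges` and `answer` tables are all hypothetical. The SQL dialect used against Hive may differ. The point is that the engine only ever receives ordinary statements, while the loop around them lives in the higher-level interface.

```python
import sqlite3


def run_recursive(conn, seed_sql, step_sql, answer_table="answer", max_iterations=100):
    """Drive a recursive query as a sequence of plain, isolated statements.

    seed_sql builds the initial answer state; step_sql derives candidate rows
    from the current answer table. Returns the number of iterations executed.
    """
    cur = conn.cursor()
    cur.execute(f"CREATE TABLE {answer_table} AS {seed_sql}")
    for iteration in range(1, max_iterations + 1):
        before = cur.execute(f"SELECT COUNT(*) FROM {answer_table}").fetchone()[0]
        # Append only the rows that are not already part of the answer state.
        cur.execute(
            f"INSERT INTO {answer_table} "
            f"SELECT * FROM ({step_sql}) EXCEPT SELECT * FROM {answer_table}"
        )
        after = cur.execute(f"SELECT COUNT(*) FROM {answer_table}").fetchone()[0]
        if after == before:
            # No new records were added, so the recursion is complete.
            return iteration
    raise RuntimeError("no fixpoint reached within max_iterations")


# sqlite3 is only a stand-in engine for this sketch; it is not Hive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edges VALUES (?, ?)", [("a", "b"), ("b", "c"), ("c", "d")])

iterations = run_recursive(
    conn,
    seed_sql="SELECT src, dst FROM edges",
    step_sql="SELECT a.src, e.dst FROM answer a JOIN edges e ON a.dst = e.src",
)
rows = conn.execute("SELECT src, dst FROM answer ORDER BY src, dst").fetchall()
print(iterations, rows)
```

The `max_iterations` guard is a design choice for the sketch: it keeps a malformed step query from looping forever.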
The solution supports both command-line queries and a graphical user interface. This section details the graphical interface, since it is an example of how to leverage the functionality. The interface was built with PyQt5 following a Model-View-Controller design paradigm. The controller section depicts the interaction with the Hive query tool. The interface accepts both recursive and non-recursive commands entered into the multi-line text box. The user may then click Execute! to process the commands. The recursive results are saved in a .tmp/ directory; however, this can be easily modified.
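The controller's role can be sketched without any GUI code. The example below is hypothetical and not taken from the project: the class `QueryController`, its `run_plain` and `run_recursive` callables, the `WITH RECURSIVE` detection rule, and the CSV file naming are all assumptions. It shows one possible way to route each entered command and save recursive results under a configurable directory.

```python
import csv
import re
import tempfile
from pathlib import Path

# Assumption: recursive commands are recognised by a leading WITH RECURSIVE.
RECURSIVE_PATTERN = re.compile(r"^\s*WITH\s+RECURSIVE\b", re.IGNORECASE)


class QueryController:
    """Routes each entered command to a plain or a recursive executor."""

    def __init__(self, run_plain, run_recursive, output_dir=".tmp"):
        self.run_plain = run_plain
        self.run_recursive = run_recursive
        self.output_dir = Path(output_dir)

    def execute(self, text):
        outcome = []
        # Naive split on ';' (it would break on semicolons inside string literals).
        statements = [s.strip() for s in text.split(";") if s.strip()]
        for index, statement in enumerate(statements):
            if RECURSIVE_PATTERN.match(statement):
                rows = self.run_recursive(statement)
                outcome.append(("recursive", rows, self._save(rows, index)))
            else:
                outcome.append(("plain", self.run_plain(statement), None))
        return outcome

    def _save(self, rows, index):
        # Write the recursive result set as CSV under the output directory.
        self.output_dir.mkdir(parents=True, exist_ok=True)
        path = self.output_dir / f"recursive_{index}.csv"
        with path.open("w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        return path


# Stub executors stand in for the real query engine in this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    controller = QueryController(
        run_plain=lambda sql: [("plain", sql)],
        run_recursive=lambda sql: [("a", "b"), ("a", "c")],
        output_dir=tmp,
    )
    outcome = controller.execute("SELECT 1; WITH RECURSIVE r AS (...) SELECT * FROM r;")
    saved = [path.read_text() for _, _, path in outcome if path is not None]
    print([kind for kind, _, _ in outcome], saved)
```

In a PyQt5 view, a button's `clicked` signal could be connected to a slot that passes the text box contents to `execute`. Keeping the controller free of widget code makes it usable from the command line as well.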
The user interface of the application
The application processing a user-entered recursive query written in SQL
To view the final report for this project, please navigate to this page
Shijing Yang syang6@wpi.edu | Davis Catherman dscatherman@wpi.edu