This repository contains work done for the Databases class at Wrocław University of Science and Technology.
We were supposed to design and implement a simple database which would help SwimmingPools with data maintenance and information flow.
Repository contains full database dump which can be used with ORACLE DATA PUMP tool to create your own instance.
I have also included implemented data generators written during the process of implementation.
The last thing you can find here are some fancy triggers and PL/SQL procedures which were used to meet given project specification.
-
Oracle database 11G express edition
-
docker
-
Python - Faker, Numpy, Pandas
If you are a polish reader I highly recommend checking the full project documentation which was submitted to the lecturer.
First you need to make sure that you possess all of the required dependencies and plugins.
pip install -r requirements.txt
docker run --name oracleDb \
--shm-size=1g \
-p 1521:1521 -p 8080:8080 \
-e ORACLE_PWD=Test \
-v ~/docker/oracle-xe:/u01/app/oracle/oradata \
oracle/database:11.2.0.2-xe
docker start oracleDb
docker exec -it oracleDb bash
sqlplus username/password@SID
docker exec -it oracleDb bash
mkdir export && chmod a+rwx ./export
docker cp ./full_dump.dmp oracleDb:/export/full_dump.dmp
sqlplus / as sysdba
CREATE DIRECTORY export AS '/export/';
GRANT read, write ON DIRECTORY export TO username;
impdp username DUMPFILE=export:full_dump.dmp FULL=YES LOGFILE=export:full_imp.log
python CSVCreator.py
docker cp databaseData/. oracleDb:/databaseData
docker cp Ctls/. oracleDb:/Ctls
docker exec -it oracleDb bash
cd ./Ctls && chmod +x ./loadData.sh && ./loadData.sh
DataLoaders folder contains all of the generators which are able to create given number of rows for each table in database.
Data is returned as CSV files which are then loaded using the SQL*Loader tool provided by oracle. This solution ensures maximal speed and optimization.
Most of the data is pulled from faker library which posses language specific data sets. Some of the easier concepts are simply generated from math functions or calculated. Pandas was used to easily convert python lists into data frames and then write to csv files. Numpy was used to generate random float data used in currency fields and time.
SQL folder inside dataLoadersCtl folder includes some of the triggers and procedures which are responsible for ensuring data integrity in the database.