Candidate: Quang-Truong Nguyen
This is my submission for the Homebase Take-Home Assignment. There are 5 tasks in total.
The solutions were developed and tested with Python 3.10.12 and MariaDB 10.6.12.
The solution for Task 1 is in the task1/ directory.
- average_age.py: The script that calculates the average age of all users in the data.csv file.
- data*.csv: The data files used for testing the script.
To run the script:
- Change the variables INPUT_CSV_FILE and DELIMITER in average_age.py to the desired values.
- Run the script: python3 average_age.py
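As a rough illustration of what a script like average_age.py might do, here is a minimal sketch, not the submitted code. Only INPUT_CSV_FILE and DELIMITER come from the steps above; the "age" column name, the function names, and the choice to skip blank or non-numeric cells are assumptions.

```python
import csv
import io

# The two variables named in the steps above; the values here are placeholders.
INPUT_CSV_FILE = "data.csv"
DELIMITER = ","


def average_age(rows, age_column="age"):
    """Return the mean of the age column, skipping blank or non-numeric cells.

    The column name "age" is an assumption about the CSV header.
    """
    ages = []
    for row in rows:
        raw = (row.get(age_column) or "").strip()
        try:
            ages.append(float(raw))
        except ValueError:
            continue  # skip missing or malformed ages
    if not ages:
        raise ValueError("no valid ages found")
    return sum(ages) / len(ages)


def average_age_from_file(path=INPUT_CSV_FILE, delimiter=DELIMITER):
    """Read the CSV at path and return the average age."""
    with open(path, newline="", encoding="utf-8") as handle:
        return average_age(csv.DictReader(handle, delimiter=delimiter))


# Usage on an in-memory sample with a semicolon delimiter and one blank age.
sample = io.StringIO("name;age\nAn;30\nBinh;40\nChi;\n")
print(average_age(csv.DictReader(sample, delimiter=";")))
```

Keeping the averaging logic separate from file handling makes it easy to test against the different data*.csv files or an in-memory sample.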
The solution for Task 2 is in the task2/ directory.
- schema.sql: The SQL script that creates the tables and relationships for the schema.
- diagram.png: The ER diagram for the schema.
To create the schema:
- Create a database in MariaDB.
- Run the script: mysql -u <username> -p <database_name> < schema.sql
The schema should now exist in the database.
The solution for Task 3 is in the task3/ directory.
- main.py: The script that scrapes the data from the website.
- scraper.py: Scraper class for scraping the data from a URL that lists multiple products.
- product_parser.py: Parser class for parsing the data from the HTML of a product page.
- cache_schema.sql: The SQL script that creates the tables and relationships for the cache schema.
To run the script:
- Set up a virtual environment: python3 -m venv venv
- Activate the virtual environment: source venv/bin/activate
- Install the dependencies: pip3 install -r requirements.txt
- Install sqlite3.
- Copy the .env.example file to .env and modify the variables to the desired values.
- Modify the URL variable in main.py to the desired URL.
- Run the script: python3 main.py
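To show how an SQLite-backed cache can sit in front of a scraper, here is a minimal sketch. The contents of cache_schema.sql are not reproduced in this document, so the page_cache table, its columns, the get_page helper, and the one-hour expiry are all hypothetical and may differ from the submitted code.

```python
import sqlite3
import time

# Hypothetical cache table; the real cache_schema.sql may look different.
CACHE_DDL = """
CREATE TABLE IF NOT EXISTS page_cache (
    url        TEXT PRIMARY KEY,
    html       TEXT NOT NULL,
    fetched_at REAL NOT NULL
)
"""


def get_page(conn, url, fetch, max_age_seconds=3600.0):
    """Return the HTML for url, calling fetch(url) only on a miss or a stale row."""
    row = conn.execute(
        "SELECT html, fetched_at FROM page_cache WHERE url = ?", (url,)
    ).fetchone()
    if row is not None and time.time() - row[1] < max_age_seconds:
        return row[0]  # fresh cache hit
    html = fetch(url)
    conn.execute(
        "INSERT OR REPLACE INTO page_cache (url, html, fetched_at) VALUES (?, ?, ?)",
        (url, html, time.time()),
    )
    conn.commit()
    return html


# Usage with an in-memory database and a stub in place of a real HTTP request.
conn = sqlite3.connect(":memory:")
conn.execute(CACHE_DDL)
calls = []


def fake_fetch(url):
    calls.append(url)
    return "<html>stub</html>"


get_page(conn, "https://example.com/p/1", fake_fetch)
get_page(conn, "https://example.com/p/1", fake_fetch)
print(len(calls))  # the second call is served from the cache
```

Passing the fetch function in as a parameter keeps the cache independent of whichever HTTP library the scraper uses.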
The solution for Task 4 is in the task4/ directory.
- data_gen.py: The script that generates the data for the nested set model.
- hierarchical_to_nested_set.py: The script that converts the data from the hierarchical model to the nested set model.
- retrieve_parent_child_relationship.py: The script that retrieves the parent-child relationships from the nested set model.
- schema.sql: The SQL script that creates the tables and relationships for the schema.
- benchmark.txt: The output from benchmarking the above scripts.
There are also 2 example data files in the task4/ directory:
- small_example.csv: The data file with 14 nodes.
- large_example.csv: The data file with 5018 nodes.
To run the scripts:
- Set up a virtual environment: python3 -m venv venv
- Activate the virtual environment: source venv/bin/activate
- Install the dependencies: pip3 install -r requirements.txt
- Install sqlite3 (the command-line tool is used in the next step).
- Create the tables: sqlite3 data.sqlite < schema.sql
- Modify the MAX_DEPTH and MAX_CHILDREN variables in data_gen.py to the desired values.
- Populate the database: python3 data_gen.py
- Convert the data to the nested set model: python3 hierarchical_to_nested_set.py
- Retrieve the parent-child relationships from the nested set model: python3 retrieve_parent_child_relationship.py
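The conversion and retrieval steps can be sketched in plain Python as follows. This is a generic illustration of the nested set technique, not the submitted scripts: it works on an in-memory {node: parent} mapping rather than the SQLite tables, and the function names to_nested_set and direct_parent are hypothetical.

```python
from collections import defaultdict


def to_nested_set(parent_of):
    """Map each node to its (left, right) bounds from a {node: parent} mapping.

    Roots are the nodes whose parent is None.
    """
    children = defaultdict(list)
    roots = []
    for node, parent in parent_of.items():
        if parent is None:
            roots.append(node)
        else:
            children[parent].append(node)

    left, right = {}, {}
    counter = 1
    # Iterative depth-first walk so deep trees do not hit the recursion limit.
    for root in roots:
        stack = [(root, False)]
        while stack:
            node, leaving = stack.pop()
            if leaving:
                right[node] = counter  # all descendants have been numbered
            else:
                left[node] = counter
                stack.append((node, True))
                # reversed() keeps children in their original order
                for child in reversed(children.get(node, [])):
                    stack.append((child, False))
            counter += 1
    return {node: (left[node], right[node]) for node in left}


def direct_parent(bounds, node):
    """Return the closest enclosing node, or None for a root."""
    lft, rgt = bounds[node]
    ancestors = [n for n, (l, r) in bounds.items() if l < lft and r > rgt]
    # The direct parent is the ancestor with the largest left bound.
    return max(ancestors, key=lambda n: bounds[n][0], default=None)


# Usage on a four-node tree: A has children B and C, and B has child D.
tree = {"A": None, "B": "A", "C": "A", "D": "B"}
bounds = to_nested_set(tree)
print(bounds)
print(direct_parent(bounds, "D"))
```

In the nested set model a node is an ancestor of another exactly when its interval encloses the other's, which is why the parent lookup needs only comparisons on the two bounds.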
The solution for Task 5 is in the task5/ directory.
- schema.sql: The SQL script that creates the tables and relationships for the schema.
- procedure.sql: The SQL script that creates the stored procedure.
To create the schema:
- Create a database in MariaDB.
- Create the tables: mysql -u <username> -p <database_name> < schema.sql
- Create the stored procedure: mysql -u <username> -p <database_name> < procedure.sql
The schema and the stored procedure should now exist in the database.