- Python 2.7
- PostgreSQL 9.6.8
- Original, tiny sample data
- Sample data generator
- Sample generated data for 100 users to test queries
- "Setup" query to generate schema and populate with sample data
- Main query file with views to answer the posed questions
With a clean installation of PostgreSQL and a `~/.pgpass` file configured for a user with administrative privileges, one should be able to execute all of the queries via:

    psql < setup.sql
    psql < queries.sql
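For reference, each line of `~/.pgpass` uses the standard `hostname:port:database:username:password` format, and the file must be readable only by its owner (e.g. `chmod 600 ~/.pgpass`). The values below are placeholders for a local administrative user, not the actual credentials used here:

```
localhost:5432:*:postgres:your_password_here
```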
The answers to the specific questions can be found by running `SELECT * FROM` queries against the views `count_unique_device_ids`, `tpu_avg_purchase_amount`, and `tpu_average_time_delta`.
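For reference, those queries are simply:

```sql
SELECT * FROM count_unique_device_ids;
SELECT * FROM tpu_avg_purchase_amount;
SELECT * FROM tpu_average_time_delta;
```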
I did not have time to finish these exercises, but I did complete the cleaning tasks.
This is an exploratory exercise. To make this code more maintainable, I should at least:
- Add a requirements.txt/Pipfile or similar file stating dependencies (pandas, numpy, matplotlib, requests, and some base libraries; see the sketch after this list),
- Export the .ipynb notebooks to pure .py Python files, and get rid of most comments,
- Add complete documentation for most functions, and
- Add unit tests for key methods, such as the data download script from Alphadvantage.
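As a starting point for the first item, a minimal requirements.txt could list just the libraries mentioned above; version pins are omitted here and would still need to be chosen:

```
pandas
numpy
matplotlib
requests
```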