This is the repo with the solution to the challenge for the Business Intelligence Analyst role at door2door. You can check the accompanying presentation in the following link.
Follow these instructions if you use conda.
- Clone the repository on your local machine
https://github.com/jlcoto/door2door_bi.git
- Create the environment by running
conda env create -f environment.yml
- Activate the environment
source activate environment
- Run a Jupyter Notebook on shell
jupyter notebook
If using pip
- Activate a virtual environment
- Run
pip install -r requirements.txt
Please see the presentation in the following link.
One of the main KPIs of door2door's allygator shuttle performance is the relationship between estimated time of arrival (ETA) and actual time of arrival (ATA). Using the data provided, develop the following:
- The indicator that would give the best measurement of this relationship (measurement, what it says, how often it needs to be looked at etc.).
- Indicate which parts of the company are interested in this indicator and what information they can infer from it.
- Show what conclusions, remarks and patterns you could infer from the provided data.
- Create a presentation that communicates your approach and findings to non-technical stakeholders within the company.
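As a rough illustration of what such an indicator could look like, the sketch below computes the arrival delay (ATA minus ETA) per trip and summarizes it with the mean delay, the mean absolute delay, and the share of trips that arrive within a tolerance. The `eta_kpis` helper, the sample timestamps, and the 120-second tolerance are assumptions made for illustration; none of them come from the challenge data.

```python
from datetime import datetime
from statistics import mean


def eta_kpis(trips, tolerance_s=120):
    """Summarize ETA accuracy for a list of (eta, ata) datetime pairs.

    Delay is ATA minus ETA in seconds: a positive value means the
    shuttle arrived later than estimated.
    """
    delays = [(ata - eta).total_seconds() for eta, ata in trips]
    within = sum(abs(d) <= tolerance_s for d in delays)
    return {
        "mean_delay_s": mean(delays),
        "mean_abs_delay_s": mean(abs(d) for d in delays),
        "share_within_tolerance": within / len(delays),
    }


# Made-up sample data (hypothetical, not from the provided files).
fmt = "%Y-%m-%d %H:%M:%S"
raw = [
    ("2017-01-01 10:00:00", "2017-01-01 10:01:00"),  # 60 s late
    ("2017-01-01 10:10:00", "2017-01-01 10:09:00"),  # 60 s early
    ("2017-01-01 10:20:00", "2017-01-01 10:25:00"),  # 300 s late
    ("2017-01-01 10:30:00", "2017-01-01 10:30:00"),  # on time
]
sample = [(datetime.strptime(e, fmt), datetime.strptime(a, fmt)) for e, a in raw]

kpis = eta_kpis(sample)
print(kpis)
```

The signed mean shows whether estimates are biased early or late, while the absolute mean and the within-tolerance share describe how far off they typically are. Which of these is the most useful indicator is a judgment call for the analysis itself.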
The language of choice for data analysis at door2door is Python, though we don't mandate the tools for the solution. Choose the tools and language that you think are best to accomplish the task. It's important though that the presented analyses can be reliably reproduced, and that you're able to convey the results of your analysis. You can always choose more than one way to present those insights.
If questions arise during the task, please don't hesitate to ask us.
Two data sources are provided: the bookings from an event stream as a JSON file, and the tasks from a database as a CSV file.
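One possible way to bring the two sources together is sketched below using only the standard library. The in-memory stand-ins for the files, the field names (`booking_id`, `eta`, `ata`), and the assumption that the event stream holds one JSON object per line are all hypothetical. The real files may be structured differently, so the keys and parsing would need to be adapted.

```python
import csv
import io
import json

# Stand-ins for the two provided files; contents and field names are hypothetical.
bookings_file = io.StringIO(
    '{"booking_id": "b1", "eta": "2017-01-01T10:00:00"}\n'
    '{"booking_id": "b2", "eta": "2017-01-01T10:10:00"}\n'
)
tasks_file = io.StringIO(
    "booking_id,ata\n"
    "b1,2017-01-01T10:01:00\n"
    "b3,2017-01-01T10:40:00\n"
)

# Assumption: the event stream holds one JSON object per line.
bookings = [json.loads(line) for line in bookings_file if line.strip()]
tasks = list(csv.DictReader(tasks_file))

# Index tasks by the shared key, then keep only bookings with a matching task.
tasks_by_id = {row["booking_id"]: row for row in tasks}
joined = [
    {**booking, **tasks_by_id[booking["booking_id"]]}
    for booking in bookings
    if booking["booking_id"] in tasks_by_id
]
print(joined)
```

Records that appear in only one source (here `b2` and `b3`) are dropped by this inner join. Counting them separately would be a sensible data-quality check before computing any indicator.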
- Please clone the repository to your local machine via:
git clone https://github.com/door2door-io/bi-code-challenge.git
or download a zip archive of this repository here.
- Fulfill the given task.
- Submit your solution:
Please submit your solution in the form of a publicly accessible git repository. Please make sure to submit the full git commit history with the project and please include instructions for running your code where necessary.
Alternatively, you can submit a zip archive of your project via email.
It matters to us to learn how you interact with the data and how you explain it to other people in the company. It is more important to have understandable code and an understandable presentation than a complex algorithm. The same goes for the presentation itself: it is important that your points are conveyed clearly to the target audience (substance over style).
The criteria we are looking for are the following:
- Presentation: Are the conclusions clearly described? What discussion points are raised?
- Documentation: Are the project and the code properly documented?
- Correctness: Is the task solved? If anything is missing, is the reason documented?
- Code: What technologies were used, and do they fit the tasks? Is the code understandable and maintainable?