Doordash Data Science Project

Li Deng 08.08.2019

Github: https://github.com/dengl11/Doordash-DataSicence-Project

The repository shows my step-by-step implementation.
I will make it private after it has been reviewed by Doordash engineers.

Part 1

Model: A simple linear regression model in sklearn
Model Evaluation: Train-test split the historical data, and compute MSE on the test split
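
A minimal sketch of the model and evaluation steps above, assuming an 80/20 split and a fixed random seed (neither is stated in this write-up). The helper name `fit_and_evaluate` and the synthetic data are illustrative only; this is not the code from the repository.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def fit_and_evaluate(X, y, test_size=0.2, seed=0):
    """Fit a linear regression on the train split; return (model, test MSE).

    test_size and seed are assumed values, not taken from the project.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    return model, mse


# Synthetic stand-in for the historical data (not real Doordash data):
# a linear target with a small amount of Gaussian noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([15.0, -5.0, 2.0]) + rng.normal(scale=0.1, size=200)

model_demo, mse_demo = fit_and_evaluate(X_demo, y_demo)
print(f"test MSE: {mse_demo:.4f}")
```

Because the MSE is computed on rows the model never saw during fitting, it estimates error on new orders rather than on the training data.
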
Data Preprocessing:
- compute the generated column delivery_seconds from actual_delivery_time and created_at
- remove the entries where actual_delivery_time or created_at is nan
- label-encode the categorical column store_primary_category (a better way is to use one-hot encoding)
- choose the features used for training, and remove other columns
- fill nan with the mode in categorical columns, median in numerical columns
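
The preprocessing steps above could look roughly like the following pandas sketch. The function name `preprocess`, the toy rows, and the use of pandas category codes for label encoding are assumptions made for illustration. The timestamp column names are parameters because their exact spelling in the dataset is also assumed here.

```python
import pandas as pd


def preprocess(df, feature_cols, categorical_cols=("store_primary_category",),
               created_col="created_at", delivered_col="actual_delivery_time"):
    """Return a frame with the chosen features plus delivery_seconds."""
    # Drop rows where either timestamp is missing.
    df = df.dropna(subset=[created_col, delivered_col]).copy()

    # Target: seconds between order creation and delivery.
    elapsed = pd.to_datetime(df[delivered_col]) - pd.to_datetime(df[created_col])
    df["delivery_seconds"] = elapsed.dt.total_seconds()

    # Label-encode categorical columns; cat.codes uses -1 for missing values,
    # which is turned back into NaN so it can be filled below.
    for col in categorical_cols:
        codes = df[col].astype("category").cat.codes.astype("float64")
        df[col] = codes.where(codes >= 0)

    # Keep only the training features and the target.
    df = df[list(feature_cols) + ["delivery_seconds"]].copy()

    # Fill NaN: mode for categorical columns, median for numerical ones.
    for col in feature_cols:
        if col in categorical_cols:
            fill = df[col].mode().iloc[0]
        else:
            fill = df[col].median()
        df[col] = df[col].fillna(fill)
    return df


# Made-up toy rows, only to exercise the function.
raw = pd.DataFrame({
    "created_at": ["2019-01-01 12:00:00", "2019-01-01 12:00:00", None,
                   "2019-01-01 12:00:00", "2019-01-01 12:00:00"],
    "actual_delivery_time": ["2019-01-01 12:30:00", "2019-01-01 13:00:00",
                             "2019-01-01 12:10:00", "2019-01-01 12:45:00",
                             "2019-01-01 12:15:00"],
    "store_primary_category": ["american", None, "mexican", "mexican", "american"],
    "total_items": [4, 1, 2, None, 3],
})
clean = preprocess(raw, ["store_primary_category", "total_items"])
print(clean)
```

For one-hot encoding instead of label encoding, `pd.get_dummies(df, columns=["store_primary_category"])` is one option; it avoids implying an order among categories.
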
Finding: From the coefficients of the linear regression model:

num_distinct_items                                : 15.03
total_items                                       : -4.89
total_onshift_dashers                             : -2.47
total_outstanding_orders                          : 2.33
total_busy_dashers                                : -1.93
estimated_order_place_duration                    : 0.96
store_primary_category                            : 0.76
max_item_price                                    : 0.11
estimated_store_to_consumer_driving_duration      : 0.10
min_item_price                                    : 0.02
subtotal                                          : 0.01

- num_distinct_items is the dominant factor for delivery time; the more distinct items in an order, the longer the delivery time, which makes sense
- total_onshift_dashers is also important; the more dashers on shift, the shorter the delivery time, which also makes sense
- total_outstanding_orders is also important; the more outstanding orders to deal with, the longer the delivery time
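
The ordering in the coefficient table can be reproduced by sorting coefficients by absolute value. The sketch below uses three values from that table; the helper name `rank_coefficients` is hypothetical. With a fitted sklearn model, the inputs would typically be the feature column names and `model.coef_`.

```python
def rank_coefficients(feature_names, coefficients):
    """Pair names with coefficients, sorted by absolute value, largest first."""
    pairs = zip(feature_names, coefficients)
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)


# Three rows from the coefficient table, deliberately shuffled.
names = ["total_onshift_dashers", "num_distinct_items", "total_items"]
coefs = [-2.47, 15.03, -4.89]

ranked = rank_coefficients(names, coefs)
for name, coef in ranked:
    # Same layout as the table: name padded to 50 characters, two decimals.
    print(f"{name:<50}: {coef:.2f}")
```

Raw coefficients depend on each feature's units, so this ranking is only a rough guide to importance unless the features are on comparable scales.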

Part 2

Output dataframe to ./output/predictions.tsv
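
One way to write such a file with pandas is sketched below. The column names `delivery_id` and `predicted_delivery_seconds`, the helper `write_predictions`, and the demo values are assumptions for illustration; the demo writes into a temporary directory rather than `./output`.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_predictions(predictions, path="./output/predictions.tsv"):
    """Write a predictions dataframe as tab-separated text; return the path."""
    out_path = Path(path)
    # Create the output directory if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    predictions.to_csv(out_path, sep="\t", index=False)
    return out_path


# Made-up predictions, only to exercise the function.
demo = pd.DataFrame({
    "delivery_id": ["a1", "b2"],
    "predicted_delivery_seconds": [1800.5, 2650.0],
})

with tempfile.TemporaryDirectory() as tmp:
    written = write_predictions(demo, Path(tmp) / "output" / "predictions.tsv")
    round_trip = pd.read_csv(written, sep="\t")

print(round_trip)
```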

If I have more time…

Debug further to catch potential bugs
