- Language: Python 3
- Library: pandas
- I decided to use OOP because it lets me reuse methods that are already defined and implemented for one solution in the others
- Examples of such methods are reading the data and checking for empty values
- For working with the data I decided to use the pandas library because it keeps things simple when working with large datasets
- First, load the data and handle any input/output errors
- Next, check whether there are any missing values in the dataset
- Have a look at the dataset to understand its structure before sorting
- Since we only want data for Sundays, filter the dataset and drop the rows where the day is not Sunday
- In the filtered dataset, find the seven most traveled routes and rank them in descending order
- Finally, calculate the average of the top seven routes
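The steps above could be sketched roughly as follows. This is a minimal illustration, not the original implementation: the class name `DatasetLoader`, the function `top_sunday_routes`, and the columns `travel_date` and `travel_to` are assumptions (only `travel_from` appears in these notes), and "average of the top seven routes" is interpreted here as the mean number of trips across those routes.

```python
import io

import pandas as pd


class DatasetLoader:
    """Hypothetical reusable helper: read a CSV and check for missing values."""

    def __init__(self, source):
        self.source = source  # a file path or a file-like object

    def load(self):
        try:
            return pd.read_csv(self.source)
        except OSError as err:  # covers IOError and FileNotFoundError
            raise RuntimeError(f"Could not read {self.source!r}") from err

    @staticmethod
    def has_missing_values(df):
        # True if any cell in the dataset is empty (NaN)
        return bool(df.isnull().values.any())


def top_sunday_routes(df, n=7):
    """Count trips per route on Sundays and return the n busiest routes."""
    dates = pd.to_datetime(df["travel_date"])
    sundays = df[dates.dt.dayofweek == 6]  # Monday is 0, Sunday is 6
    counts = sundays.groupby(["travel_from", "travel_to"]).size()
    return counts.sort_values(ascending=False).head(n)


# Made-up sample data; 2018-04-15 is a Sunday, 2018-04-16 a Monday.
sample_csv = io.StringIO(
    "travel_date,travel_from,travel_to\n"
    "2018-04-15,Kijauri,Nairobi\n"
    "2018-04-15,Kijauri,Nairobi\n"
    "2018-04-15,Migori,Nairobi\n"
    "2018-04-16,Kisii,Nairobi\n"
)

df = DatasetLoader(sample_csv).load()
if DatasetLoader.has_missing_values(df):
    print("Warning: the dataset contains missing values")

top_routes = top_sunday_routes(df)
average_trips = top_routes.mean()  # mean trip count over the top routes
print(top_routes)
print(average_trips)
```

Keeping the loading and missing-value check in one class is what allows the later solutions to reuse them without repeating the code.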
- Reuse the methods defined in solution one to load the dataset and check for empty values
- If everything is OK, filter the dataset by route and drop all the rows where the travel_from column is not Kijauri
- Next, filter the remaining dataset and drop all the rows where the travel_time is after 07:30
- The resulting dataset contains only the trips whose travel time was before 07:30
- Calculate the probability that the means of travel is a shuttle, since the dataset contains both buses and shuttles
- Reuse the methods in solution one to load the dataset and check whether any of the fields are empty
- Get all transaction receipts that contain the letters MK, in that order
- Find the index of the letter M and add 2 to it to get the letter that follows K
- Store these letters in a separate list
- Loop through the list and keep the letter that appears the most times
- Because it appears most often, it is the most probable third letter of the tri-gram
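The tri-gram steps above might look something like this. The function name `most_likely_letter_after` and the sample receipts are made up for illustration, and `collections.Counter` stands in for the manual counting loop. The sketch looks only at the first occurrence of MK in each receipt.

```python
from collections import Counter


def most_likely_letter_after(receipts, prefix="MK"):
    """Return the character that most often follows `prefix`, or None."""
    followers = []
    for receipt in receipts:
        index_of_m = receipt.find(prefix)        # -1 if "MK" is absent
        next_index = index_of_m + len(prefix)    # index of M plus 2
        if index_of_m != -1 and next_index < len(receipt):
            followers.append(receipt[next_index])

    if not followers:
        return None
    # most_common(1) returns a list holding the (letter, count) pair
    # with the highest count.
    return Counter(followers).most_common(1)[0][0]


# Made-up receipts; in practice this would be the receipt column
# of the loaded dataset.
sample_receipts = ["UZUEHCBUSO", "AMKQ1ZX", "ZMKQ9", "MKR77", "XXMK"]

likely_letter = most_likely_letter_after(sample_receipts)
print(likely_letter)
```

Receipts that end in MK are skipped because there is no following character to record.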
- Reuse the methods in solution one to load the dataset and check whether any of the fields are empty
- To avoid repetition, also reuse the method from solution two to filter the dataset by the specified route
- Next, count the rows where the payment method is Mpesa (assumption: this is the mobile money means of payment)
- Next, get the total number of rows in the dataset
- To get the percentage, divide the number of Mpesa payments by the total number of rows and multiply the result by 100
- If the percentage is over 50 percent, recommend YES for having mobile payment; otherwise recommend NO
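As a rough sketch of the percentage calculation and recommendation: the column name `payment_method`, the function name `recommend_mobile_payment`, and the sample rows are assumptions for illustration, and the route filter is written inline rather than reusing an earlier method.

```python
import pandas as pd


def recommend_mobile_payment(df, origin="Kijauri", threshold=50.0):
    """Return (percentage of Mpesa payments, "YES"/"NO") for one route."""
    route = df[df["travel_from"] == origin]
    total_rows = len(route)
    if total_rows == 0:
        return None, "NO"  # nothing to base a recommendation on

    mpesa_rows = int((route["payment_method"] == "Mpesa").sum())
    percentage = mpesa_rows / total_rows * 100
    return percentage, ("YES" if percentage > threshold else "NO")


# Made-up sample data for illustration only.
payments = pd.DataFrame(
    {
        "travel_from": ["Kijauri", "Kijauri", "Kijauri", "Kijauri", "Migori"],
        "payment_method": ["Mpesa", "Mpesa", "Cash", "Mpesa", "Cash"],
    }
)

percentage, recommendation = recommend_mobile_payment(payments)
print(percentage, recommendation)
```

A share of exactly 50 percent yields NO here, because the rule asks for more than 50 percent.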