I am Saurabh Maheshwari, pursuing an MS in Statistics and an MS in Transportation Engineering at UC Davis. I completed my undergrad at the Indian Institute of Technology Bombay, where my passion for data science evolved. Currently, as a graduate researcher, I am working on parametric/non-parametric estimation of the travel demand probability density function based on observed flows in the network. Apart from this, I spend a lot of time taking online courses on machine learning, deep learning, and data analysis (certifications can be found on LinkedIn), along with some cool self-projects in my free time to develop hands-on data science skills. Here is the link to my LinkedIn profile: https://www.linkedin.com/in/saurabh-maheshwari-240396/

Some of the skills I have developed are:
Interests: Predictive Modeling, Machine Learning Algorithms, Data Mining and Visualization
Tools: Python, R, MATLAB, Microsoft Azure (Machine Learning Service Workspace, Databricks, Container Instances), SQLite, Keras, fast.ai, TensorFlow, Google Colab, scikit-learn, Shiny, Leaflet, folium, Bokeh, ggplot2

This repository contains my work at UC Davis and IIT Bombay, described as follows:

  1. Machine Learning - Contains code, reports, and publications for the machine learning projects I have completed so far.

  2. Computational Statistics - Coursework containing code and approaches to various statistical problems on topics such as genetic algorithms, the bootstrap, and sampling.

  3. Landslide data analysis (Heroku, Flask, Bokeh) - Contains interactive Bokeh graphics, rendered on Heroku using Flask. A fun part of my self-learning endeavor.

  4. Network graph automation - My work as a graduate researcher at UC Davis. I automated the process of network graph formation by coupling freeway data queried from the OpenStreetMap API with sensors from the PeMS database in R, using packages such as osmar, leaflet, and dplyr. I then employed the algorithm to generate the adjacency matrix, sensor-link relationships, and origin-destination pairs from raw data consisting of 12,351 nodes, 2,646 links, and 400 sensors, saving hours of manual effort.
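     To illustrate the graph-formation step, here is a minimal Python sketch of building an adjacency matrix from a link list (the project itself is written in R with osmar/leaflet/dplyr; the node and link data below are hypothetical toy values, not the PeMS network):

     ```python
     # Minimal sketch: derive a dense 0/1 adjacency matrix from a list of
     # directed links. Illustrative only; the real pipeline is in R and
     # operates on 12,351 nodes and 2,646 links queried from OpenStreetMap.

     def adjacency_matrix(nodes, links):
         """nodes: list of node ids; links: list of (from_id, to_id) pairs.

         Returns a list-of-lists matrix where entry [i][j] is 1 if a
         directed link runs from nodes[i] to nodes[j], else 0."""
         index = {n: i for i, n in enumerate(nodes)}  # node id -> row/col
         matrix = [[0] * len(nodes) for _ in nodes]
         for src, dst in links:
             matrix[index[src]][index[dst]] = 1  # directed link src -> dst
         return matrix

     # Toy freeway fragment: three nodes, two directed links.
     nodes = ["A", "B", "C"]
     links = [("A", "B"), ("B", "C")]
     print(adjacency_matrix(nodes, links))  # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
     ```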

  5. Shiny-Leaflet integration - Integrated the Shiny and Leaflet packages in R to create user-specific online queries for the Houston Ship Channel data. The app facilitates exploration of 117K records of 10 different emissions through mapping features such as locating the highest-emission points and tracking high-emission ships instantaneously.
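     The "highest-emission points" query can be sketched in a few lines of Python (the app itself uses R with Shiny and Leaflet; the field names and sample records below are hypothetical simplifications of the emissions data):

     ```python
     # Sketch of the "locate highest emission points" feature.
     # The actual app is built with Shiny and Leaflet in R; the record
     # layout and values here are made up for illustration.

     def top_emission_points(records, pollutant, n=3):
         """Return the n records with the largest emission of `pollutant`."""
         relevant = [r for r in records if r["pollutant"] == pollutant]
         return sorted(relevant, key=lambda r: r["amount"], reverse=True)[:n]

     # Hypothetical records: location plus pollutant type and amount.
     records = [
         {"lat": 29.73, "lon": -95.26, "pollutant": "NOx", "amount": 12.4},
         {"lat": 29.74, "lon": -95.28, "pollutant": "NOx", "amount": 31.9},
         {"lat": 29.72, "lon": -95.25, "pollutant": "SO2", "amount": 8.1},
     ]
     print(top_emission_points(records, "NOx", n=1))
     ```

     In the app, the points returned by a query like this are rendered as Leaflet map markers rather than printed.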

  6. Certificates - Certificates I earned by completing online courses on deep learning, machine learning, algorithms, and data analysis.