The objective of this project is to explore the bike sharing data and build a model that can accurately predict the number of bikes rented in a given hour.
This project uses several data analysis techniques such as exploratory data analysis, data preprocessing, feature engineering, and model selection. The project also uses several Python libraries such as NumPy, Pandas, Matplotlib, Seaborn, and Scikit-Learn.
The Jupyter Notebook contains detailed explanations and comments for every step of the analysis. Feel free to modify or extend the project as you wish.
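
As a rough illustration of the preprocessing, feature engineering, and modeling steps listed above, here is a minimal sketch of a baseline. It is not taken from the notebook: the function name `fit_baseline`, the choice of linear regression, and the 70/30 split are assumptions made for illustration. Column names follow the field list for day.csv.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def fit_baseline(df: pd.DataFrame, target: str = "cnt"):
    """Fit a simple linear baseline and return (model, test R^2).

    Hypothetical helper for illustration; the notebook may use a
    different model and different preprocessing.
    """
    # casual + registered add up to cnt, so keeping them would leak the target.
    # instant and dteday are identifiers rather than predictive features.
    drop_cols = [target, "casual", "registered", "instant", "dteday"]
    X = df.drop(columns=[c for c in drop_cols if c in df.columns])

    # One-hot encode the integer-coded categorical fields that are present.
    cat_cols = [c for c in ("season", "mnth", "weekday", "weathersit") if c in X.columns]
    X = pd.get_dummies(X, columns=cat_cols, drop_first=True, dtype=float)
    y = df[target]

    # Hold out 30% of the rows to estimate out-of-sample fit.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )
    model = LinearRegression().fit(X_train, y_train)
    return model, r2_score(y_test, model.predict(X_test))
```

Assuming day.csv sits in the working directory, a call would look like `model, r2 = fit_baseline(pd.read_csv("day.csv"))`.
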
day.csv has the following fields:
- instant: record index
- dteday : date
- season : season (1:spring, 2:summer, 3:fall, 4:winter)
- yr : year (0: 2018, 1:2019)
- mnth : month (1 to 12)
- holiday : whether the day is a holiday or not (extracted from http://dchr.dc.gov/page/holiday-schedule)
- weekday : day of the week
- workingday : 1 if the day is neither a weekend nor a holiday, otherwise 0
- weathersit : weather situation
    - 1: Clear, Few clouds, Partly cloudy
    - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
    - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
    - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
- temp : temperature in Celsius
- atemp: feeling temperature in Celsius
- hum: humidity
- windspeed: wind speed
- casual: count of casual users
- registered: count of registered users
- cnt: count of total rental bikes including both casual and registered
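
The coded fields above can be made readable with a small pandas helper. The sketch below decodes `season` and `yr` using the mappings from the field list and checks that `cnt` equals `casual + registered`. The helper names (`decode_fields`, `count_mismatches`) and the added columns (`season_label`, `year`) are hypothetical, and the demo rows are made-up values rather than rows from day.csv.

```python
import pandas as pd

# Mappings copied from the field descriptions above.
SEASON_LABELS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}
YEAR_LABELS = {0: 2018, 1: 2019}


def decode_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with human-readable season and year columns added."""
    out = df.copy()
    out["season_label"] = out["season"].map(SEASON_LABELS)
    out["year"] = out["yr"].map(YEAR_LABELS)
    return out


def count_mismatches(df: pd.DataFrame) -> int:
    """Count rows where casual + registered does not equal cnt."""
    return int((df["casual"] + df["registered"] != df["cnt"]).sum())


# Illustrative rows only (invented values, not taken from day.csv).
demo = pd.DataFrame(
    {
        "season": [1, 3],
        "yr": [0, 1],
        "casual": [100, 250],
        "registered": [700, 4000],
        "cnt": [800, 4250],
    }
)
decoded = decode_fields(demo)
print(decoded[["season_label", "year", "cnt"]])
print("rows with inconsistent totals:", count_mismatches(demo))
```

With the real file, the same helpers would be applied to `pd.read_csv("day.csv")`, assuming the file is in the working directory.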