Weather Data Classification using Decision Trees

Primary language: Jupyter Notebook

Humidity-Classification

Humidity using Decision Trees


Each row in daily_weather.csv captures weather data for a separate day.

Sensor measurements from the weather station were captured at one-minute intervals. These measurements were then processed to generate values that describe daily weather. Since this dataset was created to classify low-humidity days vs. non-low-humidity days (that is, days with normal or high humidity), the variables included are weather measurements in the morning, plus one measurement, namely relative humidity, in the afternoon. The idea is to use the morning weather values to predict whether the day will be low-humidity or not, based on the afternoon measurement of relative humidity.
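As a rough illustration of how the afternoon humidity reading could be turned into a binary target, here is a minimal sketch using only the Python standard library. The column name `relative_humidity_3pm`, the 25% cutoff, and the helper `label_low_humidity` are assumptions made for this example; the text above does not name the afternoon column or say what counts as "low". The sample rows are made up and only stand in for daily_weather.csv.

```python
import csv
import io

# Assumed for illustration only: the real column name and cutoff are not stated above.
HUMIDITY_COLUMN = "relative_humidity_3pm"
LOW_HUMIDITY_CUTOFF = 25.0  # percent relative humidity


def label_low_humidity(rows, column=HUMIDITY_COLUMN, cutoff=LOW_HUMIDITY_CUTOFF):
    """Return (row number, label) pairs: 1 = low-humidity day, 0 = otherwise.

    Rows with a missing afternoon reading are skipped, since they cannot be labeled.
    """
    labeled = []
    for row in rows:
        value = row.get(column, "")
        if value == "":
            continue
        label = 1 if float(value) < cutoff else 0
        labeled.append((int(row["number"]), label))
    return labeled


# Made-up sample in the same shape as the CSV (one row per day).
sample = io.StringIO(
    "number,air_temp_9am,relative_humidity_3pm\n"
    "0,74.8,36.2\n"
    "1,71.4,19.4\n"
    "2,60.6,\n"
    "3,70.1,14.5\n"
)
labels = label_low_humidity(csv.DictReader(sample))
print(labels)
```

The resulting labels would serve as the target for a decision tree classifier, with the morning measurements as input features. Reading the real file would only require replacing the `io.StringIO` sample with `open("daily_weather.csv", newline="")`.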

Each row, or sample, consists of the following variables:

  • number: unique number for each row
  • air_pressure_9am: air pressure averaged over a period from 8:55am to 9:04am (Unit: hectopascals)
  • air_temp_9am: air temperature averaged over a period from 8:55am to 9:04am (Unit: degrees Fahrenheit)
  • air_wind_direction_9am: wind direction averaged over a period from 8:55am to 9:04am (Unit: degrees, with 0 meaning wind coming from the north, and increasing clockwise)
  • air_wind_speed_9am: wind speed averaged over a period from 8:55am to 9:04am (Unit: miles per hour)
  • max_wind_direction_9am: wind gust direction averaged over a period from 8:55am to 9:10am (Unit: degrees, with 0 being North and increasing clockwise)
  • max_wind_speed_9am: wind gust speed averaged over a period from 8:55am to 9:04am (Unit: miles per hour)
  • rain_accumulation_9am: amount of rain accumulated in the 24 hours prior to 9am (Unit: millimeters)
  • rain_duration_9am: amount of time rain was recorded in the 24 hours prior to 9am (Unit: seconds)