This project consists of four R scripts. Each can be run on its own to produce one of the plots required for this assignment as a PNG file. The scripts are:
- plot1.R
- plot2.R
- plot3.R
- plot4.R
Each script produces a corresponding PNG file: plot1.png, plot2.png, plot3.png, and plot4.png, respectively. The files are written to the current working directory.
Each R script downloads the required dataset from the following URL and unzips it. Both the downloaded archive and the unzipped dataset are placed in the current working directory.
- Dataset: Electric power consumption [20Mb]
Before downloading, each script checks whether the dataset is already present by looking for the extracted text file. The download and unzip steps run only if that file is not found.
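
The check-then-download step could look roughly like the sketch below. The function name ensure_dataset, its argument names, and the file names in the usage comment are illustrative assumptions rather than names taken from the scripts, and the dataset URL is left as a placeholder.

```r
# Minimal sketch: fetch and unzip the dataset only when the extracted
# text file is missing. All names here are hypothetical.
ensure_dataset <- function(url, zip_path, txt_path) {
  if (file.exists(txt_path)) {
    return(invisible(txt_path))  # already extracted, nothing to do
  }
  # mode = "wb" keeps the zip archive intact on Windows
  download.file(url, destfile = zip_path, mode = "wb")
  # Assumes the archive contains a file whose name matches basename(txt_path)
  unzip(zip_path, exdir = dirname(txt_path))
  invisible(txt_path)
}

# Example call (placeholder URL and assumed file names):
# ensure_dataset("<dataset URL>", "dataset.zip", "household_power_consumption.txt")
```

Because the function returns early when the text file exists, running a script a second time does not touch the network.
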
Each script loads the entire file into R with the read.table function. The call is given arguments to read the column names from the header row, to keep strings as character data rather than converting them to factors, and to treat '?' values as missing data (NA in R).
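
As a rough illustration of those read.table arguments, the sketch below parses a two-row inline sample instead of the real file. The column names and the semicolon separator are assumptions about the raw file's layout, not details stated above.

```r
# Two-row sample in the assumed layout of the raw file (";"-separated).
sample_text <- "Date;Time;Global_active_power
1/2/2007;00:00:00;0.326
1/2/2007;00:01:00;?"

power <- read.table(
  text = sample_text,
  header = TRUE,             # first line holds the column names
  sep = ";",                 # assumption: semicolon-separated fields
  stringsAsFactors = FALSE,  # keep Date and Time as character vectors
  na.strings = "?"           # "?" marks a missing value
)

str(power)
```

To read the real file, the text argument would be replaced by the path to the extracted text file. With "?" mapped to NA, the measurement column is parsed as numeric rather than character.
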
The data are then filtered with the subset function to keep only the two dates used in this assignment, 2007-02-01 and 2007-02-02. A new column, TimeStamp, is added to the subset: the date and time fields from the original dataset are combined with paste() and converted to an R date-time object (class POSIXlt) with strptime().
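
The filtering and TimeStamp steps might be sketched as follows, using a small made-up data frame. The column names, the day/month/year format of the raw date strings, the UTC time zone, and the conversion to POSIXct are all assumptions made for this example.

```r
# Hypothetical sample; raw dates are assumed to be day/month/year strings.
power <- data.frame(
  Date = c("31/1/2007", "1/2/2007", "2/2/2007", "3/2/2007"),
  Time = c("23:59:00", "00:00:00", "12:30:00", "00:00:00"),
  Global_active_power = c(0.5, 0.326, 1.2, 0.7),
  stringsAsFactors = FALSE
)

# Keep only 2007-02-01 and 2007-02-02.
feb <- subset(power, Date %in% c("1/2/2007", "2/2/2007"))

# Combine date and time, then parse into a date-time object.
# strptime() returns POSIXlt; as.POSIXct() is an extra step added here
# so the column is stored compactly in the data frame.
feb$TimeStamp <- as.POSIXct(
  strptime(paste(feb$Date, feb$Time), format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
)

print(feb)
```

A single date-time column lets plot() place observations on a continuous time axis instead of treating date and time as separate text fields.
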
Each plot is drawn with one of the base plotting functions, hist() or plot(), and written to a PNG file through the png() graphics device. All output files are saved to the current working directory.
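
A minimal sketch of the png() and hist() pattern is shown below. The helper name save_histogram, the sample values, the colour, the labels, and the 480 x 480 pixel size are illustrative choices, not the settings used by the scripts.

```r
# Hypothetical helper: draw a histogram of `values` into a PNG file.
save_histogram <- function(values, path) {
  png(filename = path, width = 480, height = 480)  # open the PNG device
  on.exit(dev.off())                               # close it even on error
  hist(values, col = "red", main = "Example histogram", xlab = "Value")
  invisible(path)
}

# Written to a temporary directory here; the scripts write to the
# current working directory instead.
out_file <- file.path(tempdir(), "example_plot.png")
save_histogram(c(0.3, 0.5, 1.2, 1.4, 2.1, 2.2, 3.0), out_file)
```

Nothing is written to disk until dev.off() closes the device, which is why the call is registered with on.exit() right after png().
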