Problem with dates for EPL 2014 season
aashanand opened this issue · 9 comments
I'm seeing a problem with the formatting of dates for the 2014-15 EPL season. The majority of the entries in the table engsoccerdata2 have dates in the format "YYYY-MM-DD", which is easy to convert to class Date in R. However, the 2036 records for which Season=="2014" are entered as 5-digit numbers (apparently day counts from some origin date), which tends to happen in Excel.
data("engsoccerdata2")
dim(engsoccerdata2)[1]
[1] 190096
sum(grepl("-",engsoccerdata2$Date))
[1] 188060
sum(engsoccerdata2$Season=="2014")
[1] 2036
sum(engsoccerdata2$Season=="2014"&grepl("-",engsoccerdata2$Date))
[1] 0
head(engsoccerdata2$Date)
[1] "1888-12-15" "1889-01-19" "1889-03-23" "1888-12-01" "1888-10-13" "1888-12-29"
tail(engsoccerdata2$Date)
[1] "16557" "16557" "16557" "16557" "16557" "16557"
Discovered this issue when I was calling as.Date(engsoccerdata2$Date) and it was generating NAs.
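For anyone who wants to locate the affected rows without loading the package, here is a minimal sketch. The vector `dates` is a hypothetical stand-in for engsoccerdata2$Date, not the real column, and the pattern assumes the good entries are strictly "YYYY-MM-DD".

```r
# Hypothetical stand-in for engsoccerdata2$Date (not the real data):
# two ISO-formatted strings followed by two day-count strings.
dates <- c("1888-12-15", "1889-01-19", "16557", "16557")

# Flag every entry that is not strictly YYYY-MM-DD.
bad <- !grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", dates)

which(bad)        # positions of the entries that will not parse as ISO dates
table(dates[bad]) # distinct offending values and how often each occurs
```

On the real table, the same `bad` mask could be cross-tabulated against Season to confirm that only the 2014 rows are affected.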
I don't think this is an Excel issue, as these data were never in Excel format, but I can see this too. I will fix it later today.
Actually, I don't think these are Excel dates. For instance, the standard way of converting Excel dates to R dates, as.Date(16557, origin="1899-12-30"), does not give the correct date for each number. I can fix these though.
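A quick way to sanity-check the Excel hypothesis is to see where the Excel origin puts 16557, and what serial Excel would actually assign to a date in spring 2015. This is only a sketch: the date "2015-05-02" is an assumed example from the end of the 2014-15 season, and the variable names are made up for illustration.

```r
serial <- 16557

# Interpreting the value with the usual Excel origin lands decades too early.
excel_guess <- as.Date(serial, origin = "1899-12-30")
format(excel_guess)  # "1945-04-30"

# The serial Excel would use for an assumed late-season date, for comparison.
expected_serial <- as.numeric(as.Date("2015-05-02") - as.Date("1899-12-30"))
expected_serial      # 42126
```

Since a genuine Excel serial for that period would be around 42000, a value near 16500 points to a different origin.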
Hi jalapic,
I looked at this with fresh eyes this morning. It seems to work if you do as.Date(16557,origin="1970-01-01") which is definitely not an Excel origin.
as.Date(16557,origin="1970-01-01")
[1] "2015-05-02"
The last 5 records are for matches played on May 02, 2015 according to Google.
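In case it helps anyone stuck on an older copy of the data, here is a rough sketch of repairing a mixed column in one pass, assuming the numeric entries really are days since 1970-01-01. The vector `dates` is a hypothetical stand-in for engsoccerdata2$Date, and `fixed` is just an illustrative name.

```r
# Hypothetical stand-in for engsoccerdata2$Date (not the real data).
dates <- c("1888-12-15", "1889-01-19", "16557", "16557")

# Classify each entry as an ISO date string or a bare day count.
is_iso <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", dates)
is_num <- grepl("^[0-9]+$", dates)

# Start from an all-NA Date vector so unrecognised entries stay NA.
fixed <- rep(as.Date(NA), length(dates))
fixed[is_iso] <- as.Date(dates[is_iso], format = "%Y-%m-%d")
# Assumption: day counts are relative to 1970-01-01.
fixed[is_num] <- as.Date(as.numeric(dates[is_num]), origin = "1970-01-01")

format(fixed)
# "1888-12-15" "1889-01-19" "2015-05-02" "2015-05-02"
```

Keeping the two masks separate means anything that matches neither pattern is left as NA rather than being silently misread.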
Thank you so much for fixing this - issue closed!
Does this work for you? I'm still having some issues with it.
I have fixed the "errors" (which actually don't appear in the original csv in my R project). When I open the raw data file on my machine, from either the csv in data-raw or the .rda file in data, it is completely fine. I have uploaded them to this repository, and when I re-install I still reproduce the error. It's very confusing.
I reproduced the error after reinstalling, but after I restarted RStudio it was fine.
Yes, me too. Not sure why, but thanks for bringing it to my attention. It seems to work now. Hopefully I'll soon get time to update the other datasets with 2014-15 data.
That would be great! Saw the issue about Champions League data. I'm sure that would be a huge hit.