edgi-govdata-archiving/web-monitoring-processing

Use `dateutil` for Date Parsing Everywhere

Closed issue · 1 comment

Some parts of the codebase use the dateutil package, others use pandas (!), and I think a few places might still call strptime(). (I noticed this while looking at #356.) We should audit all our files and make sure we use dateutil everywhere.

Additionally, in places where we know a date is an ISO 8601 date (like parsing responses in web_monitoring.db), we should use dateutil.parser.isoparse() to make sure there’s never any ambiguity.
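To illustrate the ambiguity concern, here is a minimal sketch contrasting the two parsers. The helper name `parse_iso_timestamp` and the sample strings are made up for illustration and are not taken from web_monitoring.db.

```python
from datetime import datetime, timezone

from dateutil import parser


def parse_iso_timestamp(value):
    """Hypothetical helper: accept only ISO 8601 input, reject anything else."""
    return parser.isoparse(value)


# An ISO 8601 string (sample value, not a real API response).
iso_result = parse_iso_timestamp("2019-03-04T05:06:07Z")

# The general-purpose parser has to guess at ambiguous input:
# month-first by default, day-first only if asked.
month_first = parser.parse("01/02/2019")
day_first = parser.parse("01/02/2019", dayfirst=True)

# isoparse() refuses the same string instead of guessing.
try:
    parse_iso_timestamp("01/02/2019")
    rejected = False
except ValueError:
    rejected = True
```

With `isoparse()`, a malformed or unexpected date fails loudly at the parsing site rather than silently becoming the wrong day.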

This requires a little spelunking through the codebase (note: we don’t need to worry about the .ipynb [Jupyter/IPython Notebook] files for this), but it should mostly be a straightforward search for places that parse dates, replacing each with a call to dateutil.parser.parse().
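As a rough sketch of what one such replacement might look like, the snippet below swaps a strptime() call for dateutil.parser.parse(). The function names and the format string are hypothetical; the actual call sites in the codebase may look different.

```python
from datetime import datetime

from dateutil import parser

# Assumed format string, for illustration only.
LEGACY_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_date_old(text):
    """Before: tied to one hard-coded format."""
    return datetime.strptime(text, LEGACY_FORMAT)


def parse_date_new(text):
    """After: let dateutil work out the format."""
    return parser.parse(text)


sample = "2019-03-04 05:06:07"
old_value = parse_date_old(sample)
new_value = parse_date_new(sample)

# The strptime() version breaks as soon as the input format changes.
try:
    parse_date_old("2019-03-04T05:06:07Z")
    old_handles_iso = True
except ValueError:
    old_handles_iso = False

# dateutil parses the same string and returns a timezone-aware datetime.
new_iso_value = parse_date_new("2019-03-04T05:06:07Z")
```

For the sample input both versions return the same naive datetime, so a swap like this should not change behavior for inputs the old code already handled.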

Update: No need to go searching for where this needs to be changed. Here are the places that need updating:

There are some spots where we use strptime() in internetarchive.py, but we are extracting that module into a separate package (see #477), so it should not be changed.