Python scripts to ingest Open Weather Map data into MongoDB
ingest_all.py
takes in all files and creates DB from scratch
ingest_incremental.py
takes in new files and adds them to the DB, excluding duplicate measurements
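One way to exclude duplicate measurements during an incremental ingest is to upsert each document on a key that identifies a measurement. The sketch below is not taken from ingest_incremental.py. It assumes a measurement is uniquely identified by a tower ID plus its sensorTime; the field name towerId and the helper name ingest_measurements are hypothetical. The collection argument is expected to behave like a pymongo Collection (only update_one with upsert=True is used).

```python
def ingest_measurements(collection, measurements,
                        key_fields=("towerId", "sensorTime")):
    """Insert each measurement unless one with the same key already exists.

    `collection` is assumed to be a pymongo Collection (or any object with a
    compatible `update_one`). `towerId` is a hypothetical field name.
    Returns the number of newly inserted documents.
    """
    inserted = 0
    for doc in measurements:
        # Match on the identifying fields only.
        key = {field: doc[field] for field in key_fields}
        # $setOnInsert writes the document only when the upsert creates it,
        # so a measurement that is already stored is left untouched.
        result = collection.update_one(key, {"$setOnInsert": doc}, upsert=True)
        if result.upserted_id is not None:
            inserted += 1
    return inserted
```

A unique index on the same key fields would additionally let the database itself reject duplicates; whether that fits depends on how the measurements are actually keyed.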
db.clean.find({sensorTime:{$lt:ISODate("2016-02-25T00:00:00.000Z")}})
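The shell query above can be expressed from Python as well. This is a minimal sketch: before_filter is a hypothetical helper, and the database handle db mentioned in the comment is assumed to be a pymongo database that is not created here.

```python
from datetime import datetime, timezone

def before_filter(cutoff):
    """Build a MongoDB filter matching documents with sensorTime before `cutoff`."""
    return {"sensorTime": {"$lt": cutoff}}

# Same cutoff as ISODate("2016-02-25T00:00:00.000Z") in the shell query.
cutoff = datetime(2016, 2, 25, tzinfo=timezone.utc)
query = before_filter(cutoff)

# With a pymongo database handle `db` (assumed), the query would be run as:
# for doc in db.clean.find(query):
#     print(doc)
```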
- Ingest incrementally, searching for duplicates with each insert
- Create separate collection with coords, names and IDs of towers on first ingest
- Add in time zones of towers (reverse geo-coding? or time-zone API)
- Move/rename files after processing
- When ingesting incrementally, update downsampled values in cleanDaily by understanding which time periods are updated
- Add a column to cleanDaily that shows how many measurements each average is based on? i.e. n = df.resample('12h').count()
- When ingesting incrementally and updating overlapping values, check:
  - Overlapping values overlap only with the last aggregated time period
  - Make sure duplicate measurements are not being added
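
The downsampling items above (knowing which time periods an incremental batch touches, storing a count next to each average, and checking that a batch overlaps only the last aggregated period) can be sketched in plain Python. This is an illustration under assumptions, not the project's code: measurements are taken to be (timestamp, value) pairs with timezone-aware UTC timestamps, periods are 12 hours long and aligned to 00:00 and 12:00 UTC, and all function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(hours=12)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def bucket_start(ts):
    """Return the start of the 12-hour period containing `ts` (timezone-aware)."""
    return ts - (ts - EPOCH) % BUCKET

def aggregate(measurements):
    """Map each period start to its mean and the number of measurements behind it."""
    groups = defaultdict(list)
    for ts, value in measurements:
        groups[bucket_start(ts)].append(value)
    return {start: {"mean": sum(vals) / len(vals), "n": len(vals)}
            for start, vals in groups.items()}

def affected_periods(new_measurements):
    """Period starts that must be recomputed after ingesting `new_measurements`."""
    return {bucket_start(ts) for ts, _ in new_measurements}

def overlaps_only_last_period(existing_periods, new_measurements):
    """True if the new batch touches no already-aggregated period except the latest."""
    existing = set(existing_periods)
    if not existing:
        return True
    overlap = affected_periods(new_measurements) & existing
    return overlap <= {max(existing)}
```

After an incremental ingest, only the periods returned by affected_periods need to be re-aggregated from the stored measurements, and the "n" value plays the role of the proposed count column.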