This sprint will provide a concentrated data feed of events related to MGMWERX. On delivery, the system will let users pull calendar events from our selected sources. A series of workers will continuously pull data from those sources.
For new ReadME users, here's a basic cheat sheet and here's a more detailed version.
Each team has identified tasks to reach completion.
Will operate a Python application that loads its options from a local config file.
- On boot, the config options will load from a static file
- On interval, the scraper will perform the scraping
- After all sites are scraped, the object of calendar items is passed to the Core Engine
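
The boot, interval, and hand-off steps above can be sketched roughly as follows. This is a minimal sketch and not the project's actual scraper: the JSON config format, the `sites` and `interval_seconds` keys, and the names `load_config`, `scrape_all`, `run`, and `send_to_core` are all assumptions made for illustration.

```python
import json
import time
from typing import Callable, Dict, List

# Hypothetical shapes: a calendar item is a plain dict, and a scraper is any
# callable that takes a site URL and returns a list of calendar items.
CalendarItem = Dict[str, str]
Scraper = Callable[[str], List[CalendarItem]]


def load_config(path: str) -> dict:
    """On boot: load config options from a static file (JSON is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def scrape_all(sites: List[str], scraper: Scraper) -> List[CalendarItem]:
    """Scrape every configured site and collect the items into one list."""
    items: List[CalendarItem] = []
    for site in sites:
        items.extend(scraper(site))
    return items


def run(
    config: dict,
    scraper: Scraper,
    send_to_core: Callable[[List[CalendarItem]], None],
    cycles: int = 1,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run `cycles` scrape passes, waiting `interval_seconds` between passes."""
    for cycle in range(cycles):
        # Hand the full batch to the Core Engine only after all sites are scraped.
        send_to_core(scrape_all(config["sites"], scraper))
        if cycle < cycles - 1:
            sleep(config["interval_seconds"])
```

Passing `sleep` and `send_to_core` in as parameters keeps the loop testable without real waiting or a running Core Engine.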
To receive the scraper's output at a common interface.
To provide Object-Relational Mapping
- Detect updates
- Detect duplicate sources
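
One possible way to approach update and duplicate detection is sketched below. It assumes events are plain dicts keyed by their unique `id`, that `timestamp` records when a row was captured and so should be ignored when comparing content, and that two source URLs are duplicates if they match after lowercasing and dropping a trailing slash. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Event = Dict[str, str]


def _content(event: Event) -> Event:
    """Event fields that matter for comparison (assumes `timestamp` is bookkeeping)."""
    return {key: value for key, value in event.items() if key != "timestamp"}


def split_new_and_updated(
    stored: Dict[str, Event], incoming: Iterable[Event]
) -> Tuple[List[Event], List[Event]]:
    """Compare incoming events against stored ones, keyed by the unique `id`."""
    new: List[Event] = []
    updated: List[Event] = []
    for event in incoming:
        existing = stored.get(event["id"])
        if existing is None:
            new.append(event)
        elif _content(existing) != _content(event):
            updated.append(event)
    return new, updated


def duplicate_sources(urls: Iterable[str]) -> List[str]:
    """Return normalized URLs that appear more than once."""
    seen = set()
    duplicates: List[str] = []
    for url in urls:
        key = url.strip().lower().rstrip("/")
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates
```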
Establish a MongoDB instance with three collections: Location, Events, and Source.
- (N) - Designates nullable
- (U) - Designates unique
Location
Column Name |
---|
location_id (U) |
name |
address |
second_line (N) |
city |
state |
zip |
Events
Column Name |
---|
id (U) |
title |
starts_at |
ends_at |
description (N) |
website_url |
timestamp |
Source
Column Name |
---|
website_url (U) |
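
As an illustration, the three collections above could be mirrored in Python as dataclasses. The field names come from the tables; the field types (all strings here) are assumptions, since the tables do not specify them. Nullable (N) fields are modeled as `Optional` with a `None` default. Unique (U) fields are only marked in comments, because enforcing uniqueness would be the database's job.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Location:
    location_id: str  # (U) unique
    name: str
    address: str
    city: str
    state: str
    zip: str
    second_line: Optional[str] = None  # (N) nullable


@dataclass
class Event:
    id: str  # (U) unique
    title: str
    starts_at: str  # type is an assumption; a datetime may fit better
    ends_at: str
    website_url: str
    timestamp: str
    description: Optional[str] = None  # (N) nullable


@dataclass
class Source:
    website_url: str  # (U) unique


# asdict() yields a plain dict, the shape a document store typically expects.
example = asdict(Source(website_url="https://example.org"))
```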
The web host is a server that pulls from the database with known query parameters.
Any cleaning of user input/URL input should happen here.
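
A minimal sketch of what cleaning URL input might look like, using only the standard library. The allow-list of parameter names in `ALLOWED_PARAMS` and the function name `clean_query` are hypothetical; the real set of "known query parameters" is not defined here.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical allow-list; the actual known query parameters may differ.
ALLOWED_PARAMS = {"title", "starts_at", "ends_at", "city"}


def clean_query(url: str) -> dict:
    """Keep only known parameters, as trimmed, non-empty strings."""
    query = parse_qs(urlsplit(url).query)
    cleaned = {}
    for name, values in query.items():
        if name not in ALLOWED_PARAMS:
            continue  # drop anything unexpected, e.g. operator-style keys
        value = values[0].strip()  # first occurrence wins
        if value:
            cleaned[name] = value
    return cleaned
```

Rejecting unknown keys before the query reaches the database means only plain string values for expected fields are ever passed along.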
A helloworld website experience for testing purposes.
This is being defined in a branch until ready for release: https://github.com/mgmwerx/2019-Coding-Sprint/tree/dev-icd