Timestamp RSS and Twitter
DerAlexmeister opened this issue · 3 comments
DerAlexmeister commented
Add a timestamp to the RSS feed and Twitter data that is written to the Kafka topics.
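A minimal sketch of what this could look like, assuming each scraped item is a plain dict and that the timestamp records when the item was scraped. The field name `scraped_at` and both helper functions are hypothetical, not existing project code; sending the bytes to Kafka is left to whichever producer the project uses.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def with_scrape_timestamp(event: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of the event with an ISO 8601 UTC timestamp attached.

    `scraped_at` is an assumed field name, not an agreed schema.
    """
    scraped_at = (now or datetime.now(timezone.utc)).isoformat()
    return {**event, "scraped_at": scraped_at}


def to_message_value(event: dict) -> bytes:
    """Serialize the event to UTF-8 JSON bytes, ready to be used as a message value."""
    return json.dumps(event, sort_keys=True).encode("utf-8")
```

The optional `now` argument only exists so the timestamp can be fixed in tests; in normal use it would be omitted.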
p2h5 commented
If I attach the time of scraping to each event as its timestamp, there is a good chance that we will get the same event twice.
For example, if we scrape the last thirty tweets of a Twitter account every day, chances are high that we get the same tweet on both days, but with different timestamps.
Maybe we can check by name whether such an event already exists and drop the duplicate when we scrape one?
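A rough sketch of that check, assuming each event is a dict with a `name` field that identifies it (a tweet ID or RSS GUID would probably be a more stable key, if available). The function name, the field names, and the in-memory `seen` dict are all hypothetical; a real setup would need a persistent store so the check survives between scraper runs.

```python
from typing import Callable, Dict, Iterable, List


def drop_known_events(
    events: Iterable[dict],
    seen: Dict[str, dict],
    key_of: Callable[[dict], str] = lambda e: e["name"],  # assumed key field
) -> List[dict]:
    """Return only events whose key is not in `seen`, and remember them.

    The first scraped copy wins, so its timestamp is the one that is kept.
    """
    fresh = []
    for event in events:
        key = key_of(event)
        if key in seen:
            continue  # already scraped on an earlier run: skip the duplicate
        seen[key] = event
        fresh.append(event)
    return fresh


# Two daily scrapes that overlap in one tweet.
seen: Dict[str, dict] = {}
day1 = [{"name": "tweet-1", "scraped_at": "2020-01-01"},
        {"name": "tweet-2", "scraped_at": "2020-01-01"}]
day2 = [{"name": "tweet-2", "scraped_at": "2020-01-02"},
        {"name": "tweet-3", "scraped_at": "2020-01-02"}]

new_day1 = drop_known_events(day1, seen)
new_day2 = drop_known_events(day2, seen)  # only tweet-3 is new
```

With this approach only `new_day1` and `new_day2` would be forwarded, so `tweet-2` keeps its first timestamp.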
DerAlexmeister commented
We should discuss this.
DerAlexmeister commented
Could event sourcing be an option here?