openva/rs-video-processor

New videos are being identified repeatedly


A new video is "found," and then "found" again a few hours later. Gotta be a caching problem. Figure out the source of the problem, and solve it.

The solution is likely to involve using the files table in the database.

  • create a new Committees class that includes functionality to get the likely committee ID for a given chamber and committee name
  • incorporate that class into rs-video-processor and rs-machine
  • get a video's committee ID at the time that it's identified in rs-machine, and not later
  • create an array of all videos found in the RSS feed, with a counter for the number of times that we've attempted to download each, a timestamp for when we encountered it, the chamber ID (if applicable), and the RSS GUID
  • if a video in the RSS feed is new, queue it
  • if a video in the RSS feed isn't new, but we have no record of it in the database, and it's been more than X hours since it was added to the queue, then add it to the queue
  • if a video in the RSS feed has a counter greater than n, ignore it
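The three queueing rules above could be sketched roughly as follows. This is only an illustration, not the project's code: `SeenVideo`, `should_queue`, `mark_queued`, and the values chosen for `MAX_ATTEMPTS` (the "n" above) and `RETRY_AFTER_HOURS` (the "X" above) are all hypothetical, and `in_database` stands in for whatever lookup against the files table ends up being used.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MAX_ATTEMPTS = 3        # "n" in the list above; the value is an assumption
RETRY_AFTER_HOURS = 6   # "X" in the list above; the value is an assumption


@dataclass
class SeenVideo:
    """One entry per video found in the RSS feed, keyed by RSS GUID."""
    guid: str
    first_seen: float                  # Unix timestamp of first encounter
    attempts: int = 0                  # number of download attempts so far
    queued_at: Optional[float] = None  # Unix timestamp of the last queueing
    chamber_id: Optional[int] = None   # chamber ID, if applicable


def should_queue(guid: str, seen: Dict[str, SeenVideo],
                 in_database: bool, now: float) -> bool:
    """Decide whether a video from the RSS feed should be (re)queued."""
    record = seen.get(guid)
    if record is None:
        return True                    # new video: queue it
    if record.attempts > MAX_ATTEMPTS:
        return False                   # tried too many times: ignore it
    if in_database or record.queued_at is None:
        return False                   # already processed, or never queued
    hours_waiting = (now - record.queued_at) / 3600
    return hours_waiting > RETRY_AFTER_HOURS


def mark_queued(guid: str, seen: Dict[str, SeenVideo], now: float,
                chamber_id: Optional[int] = None) -> None:
    """Record that a video was queued and count the attempt."""
    record = seen.setdefault(
        guid, SeenVideo(guid=guid, first_seen=now, chamber_id=chamber_id))
    record.attempts += 1
    record.queued_at = now
```

Calling `mark_queued` every time a video is queued keeps the counter and timestamp in one place, so `should_queue` never has to guess whether an earlier attempt happened.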

Another problem here—the cache file is potentially useless in a frequent-deploy scenario, because it doesn't necessarily survive a new deploy.
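One way around the deploy problem would be to keep the per-video records in a database table rather than a cache file. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the project's real database; the `seen_videos` table, its columns, and the helper names are hypothetical and are not the schema of the existing files table. The upsert syntax assumes SQLite 3.24 or later.

```python
import sqlite3
from typing import Optional


def open_store(path: str) -> sqlite3.Connection:
    """Open the store and create the (hypothetical) tracking table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS seen_videos (
               guid       TEXT PRIMARY KEY,
               first_seen REAL NOT NULL,
               queued_at  REAL,
               attempts   INTEGER NOT NULL DEFAULT 0,
               chamber_id INTEGER
           )"""
    )
    return conn


def record_attempt(conn: sqlite3.Connection, guid: str, now: float,
                   chamber_id: Optional[int] = None) -> None:
    """Insert a video on first sight, or bump its attempt counter."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """INSERT INTO seen_videos
                   (guid, first_seen, queued_at, attempts, chamber_id)
               VALUES (?, ?, ?, 1, ?)
               ON CONFLICT(guid) DO UPDATE SET
                   attempts  = attempts + 1,
                   queued_at = excluded.queued_at""",
            (guid, now, now, chamber_id),
        )


def get_attempts(conn: sqlite3.Connection, guid: str) -> int:
    """Return how many times a video has been queued (0 if unknown)."""
    row = conn.execute(
        "SELECT attempts FROM seen_videos WHERE guid = ?", (guid,)
    ).fetchone()
    return row[0] if row else 0
```

For the records to survive a deploy, the store has to live outside the deployed directory, for example in the shared database server; an in-memory or in-tree file would have the same problem as the cache file.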