propublica/django-collaborative

Collaborate won't import new responses in large dataset from Screendoor


We added a project with several thousand responses from Screendoor yesterday. I got timeout error messages whenever I reimported the data throughout the day, but when I went back into the project, the data would eventually update.

However, today, when I try to import some 3,000+ new responses, I get the timeout error and the data doesn't actually update.

The number of records is at 6,049; it should be over 9,100.

[Screenshot: Screen Shot 2020-02-12 at 11 08 21 AM]

The deployment webserver limits the maximum number of seconds that any given request can take. When a request exceeds that limit, Collaborate is killed and the webserver returns this page. The data probably shows up when you come back because of the background updater, which has no timeout.

An easy, temporary fix is to increase the request timeout on your deployment webserver.
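As an illustration only: the thread doesn't say which webserver the deployment uses, so this sketch assumes Gunicorn, whose config file is plain Python. The `timeout` setting is a real Gunicorn option; the value of 300 seconds is an arbitrary example. If a proxy or platform router sits in front of the app, it may enforce its own limit that this setting does not change.

```python
# gunicorn.conf.py
# Assumption: the app is served by Gunicorn. Other webservers have
# their own equivalent of this setting.

# Seconds a worker may stay silent before Gunicorn kills and restarts it.
# Gunicorn's default is 30; 300 is an arbitrary example value.
timeout = 300
```

Gunicorn reads this file with `gunicorn -c gunicorn.conf.py ...`.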

A proper fix will involve off-loading the import to some background task, or making it run faster in some way. That's what I'll be looking into now.