mitodl/open-discussions

Troubleshoot Elasticsearch timeout issues

mbertrand opened this issue · 1 comments

Steps to Reproduce

Elasticsearch requests frequently timed out during a reindex. Raising the timeout from 10 seconds to 30 worked around the problem, but it is still unclear why some requests need more than 10 seconds to complete. Two possible causes: the ES server is overwhelmed by the number of requests (contentfiles in particular generate many), or some requests are especially large and take extra time to process.
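One way to tell the two hypotheses apart might be to record the payload size and duration of every chunk sent during a reindex. The sketch below is only an illustration, not the project's indexing code: `profile_chunks`, `chunked`, and the `send_chunk` callable are hypothetical names, and `send_chunk` stands in for whatever actually issues the bulk request.

```python
import json
import time
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def profile_chunks(documents, chunk_size, send_chunk, slow_threshold=10.0):
    """Send documents in chunks, recording size and duration per request.

    `send_chunk` is a hypothetical callable standing in for the code
    that performs the real bulk index request.
    """
    stats = []
    for number, chunk in enumerate(chunked(documents, chunk_size)):
        # Approximate the request size by the JSON-serialized documents.
        payload_bytes = sum(len(json.dumps(doc).encode("utf-8")) for doc in chunk)
        started = time.monotonic()
        send_chunk(chunk)
        elapsed = time.monotonic() - started
        stats.append(
            {
                "chunk": number,
                "docs": len(chunk),
                "bytes": payload_bytes,
                "seconds": elapsed,
                "slow": elapsed > slow_threshold,
            }
        )
    return stats
```

If the slow chunks turn out to be the largest ones by `bytes`, that would point toward oversized requests. Slow chunks of ordinary size would be more consistent with an overloaded server.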

Expected Behavior

A reindex of courses, or of all data, completes successfully with the default timeout of 10 seconds.

Actual Behavior

Reindex fails with intermittent timeouts, unless the default timeout is raised to 30 seconds.

Increasing ELASTICSEARCH_DOCUMENT_INDEXING_CHUNK_SIZE from 20 to 75 on RC seemed to improve the odds of success with the default 10-second timeout, but 1 out of 4 reindex runs still hit a timeout error.
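For a rough sense of why the chunk size might matter, the number of indexing requests scales inversely with it. The arithmetic below assumes a made-up total of 30,000 documents; the real count on RC is not given here, and `request_count` is a hypothetical helper.

```python
import math


def request_count(total_documents, chunk_size):
    """Number of chunked requests needed to index all documents."""
    return math.ceil(total_documents / chunk_size)


total = 30_000  # hypothetical document count, for illustration only
for size in (20, 75):
    print(size, request_count(total, size))
# prints:
# 20 1500
# 75 400
```

Fewer requests with the larger chunk size would fit the theory that the server was being overwhelmed by request volume, though the remaining failures suggest that is not the whole explanation.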