chriseyre2000/contentful-to-neo4j

Fails to load entries if more items than batch size

Closed this issue · 7 comments

This failed testing on a large dataset.
The problem can be recreated by setting the batch size to a small value (say 50).

This appears to be a rate limiting issue.

I have already fixed a type issue: the Contentful batch size was a string when read from an environment variable, which caused problems (skipping "0500" items). I have added a rate limiter to slow down the processing, but we currently don't have a retry mechanism.
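A minimal sketch of that type fix, assuming the batch size arrives via an environment variable (`CONTENTFUL_BATCH_SIZE` is a hypothetical name used for illustration):

```javascript
// If the batch size stays a string, arithmetic like skip += batchSize
// concatenates ("0" + "0500" -> "00500") instead of adding, so items
// get skipped. Coerce it to a number up front.
// CONTENTFUL_BATCH_SIZE is a hypothetical variable name.
const DEFAULT_BATCH_SIZE = 500;

function getBatchSize(env) {
  const parsed = Number.parseInt(env.CONTENTFUL_BATCH_SIZE, 10);
  return Number.isNaN(parsed) || parsed <= 0 ? DEFAULT_BATCH_SIZE : parsed;
}

console.log(getBatchSize({ CONTENTFUL_BATCH_SIZE: '0500' })); // 500
console.log(getBatchSize({})); // falls back to 500
```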

I have also noticed that the Contentful client retries on network failures but not on rate limits (this needs exponential backoff, or it will cause more problems than it fixes). The client does not even look at the rate limit headers that are returned. It looks like I will need to move to an API approach instead.
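A retry wrapper with exponential backoff could look roughly like this. This is a sketch, not the client's implementation: `withBackoff` is a hypothetical helper, and the `err.status === 429` check assumes the error exposes the HTTP rate-limit status in that shape:

```javascript
// Retry a promise-returning function with exponential backoff.
// Only rate-limit errors (HTTP 429) are retried; anything else
// propagates immediately. The delay doubles on each attempt.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function withBackoff(fn, { retries = 5, baseMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (err) {
      if (err.status !== 429 || attempt >= retries) throw err;
      // Wait 1s, 2s, 4s, 8s ... before retrying.
      await delay(baseMs * 2 ** attempt);
    }
  }
}
```

Usage would be along the lines of `await withBackoff(() => client.getEntries({ skip, limit }))`. A fuller version would also honour the rate limit headers mentioned above instead of a fixed base delay.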

The problem comes when transitioning from assets to entries. They currently have distinct rate limits and get throttled.

I have traced the error to an oversize entry: if the response is too big, the request fails with an error. I may need a batch size for entries distinct from the one for assets (which may need a fixed default).

This may require a dynamic batch size: on failure, retry with a smaller batch size until the problem area has passed, then restore the original size.
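The dynamic batch size idea above could be sketched like this. `fetchPage(skip, limit)` is a hypothetical page fetcher standing in for a paged Contentful query; the halving strategy and the `minLimit` floor are assumptions, not the tool's actual behaviour:

```javascript
// Fetch all items page by page. When a page fails (e.g. an oversize
// response), halve the batch size and retry the same offset; once a
// page succeeds, restore the original batch size.
async function fetchAll(fetchPage, initialLimit = 500, minLimit = 25) {
  const items = [];
  let skip = 0;
  let limit = initialLimit;
  for (;;) {
    let page;
    try {
      page = await fetchPage(skip, limit);
    } catch (err) {
      if (limit <= minLimit) throw err; // give up below the floor
      limit = Math.max(minLimit, Math.floor(limit / 2));
      continue; // retry the same offset with a smaller batch
    }
    items.push(...page.items);
    skip += page.items.length;
    limit = initialLimit; // problem area passed; restore the batch size
    if (skip >= page.total) return items;
  }
}
```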

I am going to close this issue and raise another with the specific problem.

I did not open a distinct issue, as it has now been fixed.