openclimatefix/Satip

Add timeout on Data Tailor

Closed this issue · 6 comments

Detailed Description

We can get into an infinite loop if we don't watch out.

https://github.com/openclimatefix/Satip/blob/main/satip/eumetsat.py#L554

Context

  • ECS Fargate services get stuck, and we end up either cleaning them up manually or paying a lot for them

Possible Implementation

  • add a check that the elapsed time is < 10 minutes to this while loop as well

If we do have the timeout, we should also delete the customization immediately; otherwise we risk it being left behind and taking up some of the 20GB of space we have for customizations
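A minimal sketch of what the timeout plus immediate cleanup could look like. This is not the existing code in satip/eumetsat.py: `get_status` and `delete` are hypothetical callables standing in for whatever the Data Tailor client exposes, and the status strings "DONE" and "FAILED" are assumptions.

```python
import time


def wait_for_customization(
    get_status,
    delete,
    timeout_s=600,  # 10 minutes, as suggested above
    poll_s=5,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Poll until the customization is done, or give up after timeout_s.

    get_status() is assumed to return a status string and delete() is
    assumed to remove the customization; both are hypothetical callables.
    Returns True if the customization finished, False otherwise.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        status = get_status()
        if status == "DONE":  # assumed status name
            return True
        if status == "FAILED":  # assumed status name
            break
        sleep(poll_s)
    # Timed out or failed: delete straight away so the customization
    # does not keep using up the shared storage quota.
    delete()
    return False
```

On success the customization is left in place so the caller can download the result and delete it afterwards. The `clock` and `sleep` parameters are only there so the loop can be tested without actually waiting.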

Yes, that would be a good idea

At the start of each run, can we delete all the old customizations? Or just the customizations older than 15 mins? This might be more fail-safe
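A rough sketch of that start-of-run cleanup, assuming each customization object has a timezone-aware `creation_time` attribute and a `delete()` method. Both names are assumptions for illustration and would need checking against the actual Data Tailor client.

```python
from datetime import datetime, timedelta, timezone


def delete_stale_customizations(customizations, max_age=timedelta(minutes=15), now=None):
    """Delete every customization older than max_age.

    Each item is assumed to expose `creation_time` (timezone-aware
    datetime) and `delete()`; these names are hypothetical.
    Returns the list of items that were deleted successfully.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    deleted = []
    for customization in customizations:
        if now - customization.creation_time <= max_age:
            continue  # still fresh, may belong to a run in progress
        try:
            customization.delete()
        except Exception:
            # One failed deletion should not block the rest of the cleanup.
            continue
        deleted.append(customization)
    return deleted
```

Skipping anything younger than the cutoff avoids deleting a customization that another run is still waiting on.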

We can, but the actual deletion takes a bit of time. I think we currently run a cleanup job every 6 hours or so? It could be worth running that more often. It also looks like there might have been a change to how customizations are deleted, as in #162

@jacobbieker how about I add the timeout here, and let you tackle #162?

Yep, sounds good!