Kong/lua-resty-healthcheck

Are there any plans to use version 2.0 in Kong?

Opened this issue · 2 comments

As I understand it, version 2.0 of lua-resty-healthcheck uses a shared dict to create the health checker timer in only one worker process (a system-wide timer). The previous version, 1.6.2, creates a dedicated health checker timer per worker process. In some cases, this causes too much stress on upstream services when we run a large number of Kong pods.

Is there any plan for Kong to upgrade to version 2.0 to make use of Kong node-level timers?

Hi @surenraju-careem
Thank you for this remark! We do not plan to include the changes introduced in 2.0.0 in Kong, mainly because Kong has stopped using lua-resty-worker-events. Apart from that, other breaking changes were introduced to this library that made it painfully difficult to bump the version in Kong.

Therefore we decided to drop 2.0.0 from future development and promote 1.6.3 as the new master and the 3.0.0 release, since that is where the main development happened.

When it comes to the issue you're experiencing, it sounds like those changes might be introduced in the next 3.x version, but I'd need other folks to confirm. @bungle, @Tieske: do you think we could optimize the health checker to not cause too much stress on upstreams, in the same manner we optimized it for 2.0.0?

Health checks were always synchronized across the workers (only 1 worker would execute a specific check). This still required a timer for each healthcheck in each worker, though. Problems occurred when too many upstreams were created, causing systems to run out of timers.
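
To make the distinction between timers and actual probes concrete, here is a toy Python model. It is not lua-resty-healthcheck code and does not claim to mirror its internals: the `SharedDict` class, the `simulate` function, and the lock-key naming are all hypothetical, and the add-if-absent lock is only an assumption about how workers might agree on who runs a check. It shows that the timer count grows with workers × checks while the number of probes sent upstream per interval stays at one per check.

```python
# Toy model only: every name here is hypothetical, not the library's API.

class SharedDict:
    """Stand-in for a cross-worker shared dictionary with add-if-absent semantics."""

    def __init__(self):
        self._expiry = {}

    def add(self, key, now, ttl):
        # Succeeds only if the key is absent or its previous entry has expired.
        if self._expiry.get(key, float("-inf")) > now:
            return False
        self._expiry[key] = now + ttl
        return True


def simulate(workers, checks, interval, ticks):
    """Return (timers created, probes actually sent) for the toy setup."""
    shm = SharedDict()
    timers = workers * len(checks)  # one timer per check in every worker
    probes = 0
    for tick in range(ticks):
        now = tick * interval
        for _worker in range(workers):
            for check in checks:
                # Every worker's timer fires, but only the lock holder probes.
                if shm.add("lock:" + check, now, interval):
                    probes += 1
    return timers, probes


timers, probes = simulate(workers=4, checks=["upstream-a", "upstream-b"],
                          interval=1.0, ticks=3)
print(timers, probes)  # 8 timers, 6 probes (2 checks x 3 intervals)
```

In this model, adding workers or upstreams inflates the timer count but not the per-check probe rate, which matches the description above: the pressure is on the timer budget rather than on the upstreams.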

This has been refactored to use far fewer timers. I don't know the exact details, but the problem was solved. Maybe @locao can shed some light on that.