Thread deadlock in BalancedMetricResolver
billoley opened this issue · 0 comments
billoley commented
- The ServiceCacheListener cacheChanged handler grabs the balancerLock.writeLock and then calls writeAssignmentsToHdfs
- The RebalanceTimer periodically calls writeAssignmentsToHdfs
- In writeAssignmentsToHdfs, we first acquire the assignmentsIPRWLock.writeLock
- In writeAssignmentsToHdfs, we then acquire the balancerLock.readLock
If (1) the ServiceCacheListener cacheChanged handler acquires the balancerLock.writeLock at the same time that (2) the RebalanceTimer acquires the assignmentsIPRWLock.writeLock, then (1) will be stuck trying to acquire the assignmentsIPRWLock.writeLock while (2) will be stuck trying to acquire the balancerLock.readLock.
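To make the interleaving concrete, here is a minimal sketch that forces the two acquisition orders to overlap. It is not the project's code: both locks are modeled as plain `ReentrantReadWriteLock` instances (the real type of assignmentsIPRWLock may differ), and `LockOrderDemo`, `blockedOnSecond`, and `reproduceCycle` are hypothetical names. It uses `tryLock` with a timeout so the demo reports the cycle instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderDemo {

    // Take `first`, wait until the other thread also holds its first lock,
    // then try `second` with a timeout. Returns true if `second` could not be acquired.
    static boolean blockedOnSecond(Lock first, Lock second,
                                   CountDownLatch bothHoldFirst,
                                   CountDownLatch bothTried) throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean acquired = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                second.unlock();
            }
            // Keep holding `first` until the other thread has finished its attempt,
            // so neither attempt can succeed just because the other side gave up early.
            bothTried.countDown();
            bothTried.await();
            return !acquired;
        } finally {
            first.unlock();
        }
    }

    // Returns true if both threads failed to get their second lock, i.e. the cycle formed.
    static boolean reproduceCycle() throws Exception {
        // Stand-ins for the two locks named in the issue (types are assumed).
        ReentrantReadWriteLock balancerLock = new ReentrantReadWriteLock();
        ReentrantReadWriteLock assignmentsLock = new ReentrantReadWriteLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // (1) cacheChanged path: balancerLock.writeLock, then assignments writeLock
            Future<Boolean> listener = pool.submit(() -> blockedOnSecond(
                    balancerLock.writeLock(), assignmentsLock.writeLock(),
                    bothHoldFirst, bothTried));
            // (2) RebalanceTimer path: assignments writeLock, then balancerLock.readLock
            Future<Boolean> timer = pool.submit(() -> blockedOnSecond(
                    assignmentsLock.writeLock(), balancerLock.readLock(),
                    bothHoldFirst, bothTried));
            return listener.get() && timer.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("cycle formed: " + reproduceCycle());
    }
}
```

With blocking `lock()` calls in place of `tryLock`, as in the scenario described above, neither thread would ever return.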
The solution is to move the call to writeAssignmentsToHdfs in ServiceCacheListener cacheChanged outside the scope of the balancerLock.writeLock. Holding the write lock across that call is unnecessary and causes this deadlock.
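A rough sketch of the proposed shape, under the same assumptions as before: both locks are plain `ReentrantReadWriteLock` instances, and `ResolverSketch`, its fields, and the counter standing in for the HDFS write are hypothetical, not the actual BalancedMetricResolver code. The point is only that cacheChanged releases balancerLock.writeLock before calling writeAssignmentsToHdfs, so every caller takes the two locks in the same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ResolverSketch {
    private final ReentrantReadWriteLock balancerLock = new ReentrantReadWriteLock();
    private final ReentrantReadWriteLock assignmentsLock = new ReentrantReadWriteLock();

    private final List<String> assignments = new ArrayList<>(); // guarded by balancerLock
    private int writes = 0;                  // guarded by assignmentsLock
    private boolean orderViolated = false;   // guarded by assignmentsLock

    // Listener path: mutate state under the write lock, release it, then persist.
    void cacheChanged(String host) {
        balancerLock.writeLock().lock();
        try {
            assignments.add(host);
        } finally {
            balancerLock.writeLock().unlock();
        }
        // Called outside the balancerLock.writeLock scope.
        writeAssignmentsToHdfs();
    }

    // Shared by the listener and the timer: assignments lock first, balancer read lock second.
    void writeAssignmentsToHdfs() {
        assignmentsLock.writeLock().lock();
        try {
            // Record any caller that still holds the balancer write lock on entry.
            if (balancerLock.isWriteLockedByCurrentThread()) {
                orderViolated = true;
            }
            balancerLock.readLock().lock();
            try {
                List<String> snapshot = new ArrayList<>(assignments);
                writes++; // stand-in for writing `snapshot` to HDFS
            } finally {
                balancerLock.readLock().unlock();
            }
        } finally {
            assignmentsLock.writeLock().unlock();
        }
    }

    int writeCount() {
        assignmentsLock.readLock().lock();
        try {
            return writes;
        } finally {
            assignmentsLock.readLock().unlock();
        }
    }

    boolean orderViolated() {
        assignmentsLock.readLock().lock();
        try {
            return orderViolated;
        } finally {
            assignmentsLock.readLock().unlock();
        }
    }
}
```

One consequence of this ordering is that the state written to HDFS may already include a later change made between the unlock and the write. That should be acceptable if writeAssignmentsToHdfs always persists the current assignments, which is what the periodic RebalanceTimer call suggests.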