Potential deadlock caused by concurrent sync calls
wangvsa opened this issue · 0 comments
Describe the problem you're observing
I'm seeing TIMEOUT errors when staging in many files simultaneously.
It seems that concurrent unifyfs_sync() calls may cause a deadlock on the server side.
After some investigation, I found that the server blocks in the process_pending_sync call in this case:
client A on server 0 --> write/sync file 1 --> owner is server 1
client B on server 1 --> write/sync file 2 --> owner is server 0
UnifyFS/server/src/unifyfs_service_manager.c, lines 1479 to 1483 in 58ece44
@MichaelBrim Is this the cause? Any idea how to fix this?