Non-ARM x86-64 desktop CPUs compatibility
Closed this issue ยท 33 comments
Thanks for the neat code, I was looking for portable synchronization mechanisms for my concurrent networking library, and it seems that I finally found it, but I have only one question if you don't mind. ๐ธ
Is it safe to use the rwsync implementation on non-ARM x86-64 desktop CPUs with multiple threads (2-3) and wr/rd functions with a single synchronizer simultaneously? My basic tests show good results on AMD FX-4300, but I didn't test it on Intel yet. Any caveats that I should be aware of?
Yes rwsync should be safe to use on all x86-64 processors. I do most of my testing on ARM (AArch64) processors but since I depend mostly on the compiler, there should be less risk of target specific problems.
Please speak up if you are missing some scalable datatype. I love to invent new algorithms.
Thanks, that's great! Currently, I'm using the rwsync to access shared structures for read/write operations, but I'm also planning to integrate SPSC ring buffer to dispatch transport messages from network thread to main thread since this approach works quite well in managed environment for me, and it's very convenient.
I think this library provides everything that I currently need thanks for your work. Cheers. ๐บ
@WonderfulVoid Maybe you can help me to solve some design problem that I encountered recently...
I'm writing a networking library where a user should register a callback function which is triggered from a separate thread when network events occur. Before the callback is triggered, a synchronizer acquired for logic around shared data. In the library's public API, I provide functionality that also acquires the same synchronizer for read/write operations for the same shared data. Everything is okay while this functionality used outside of the callback, but if a user is going to use public functions within the callback for a logic chain, then a deadlock will occur because the synchronizer already acquired.
I'm trying to find a solution to solve this transparently for a user. A rough way is to add a boolean parameter to public API which should indicate when a function is called within the callback and the synchronizer shouldn't be acquired.
Any ideas?
I think a sort of recursive SRW which tracks threads that acquired a synchronizer would be great. ๐ค
Found the implementation on top of critical sections which already support recursive access but they still added it for some reason.
It's possible to add a similar owner check to the rwlock/rwsync implementation? I'm afraid that there's no portable way to do this... ๐ค
Let me check out these things.
How are SRW locks different from vanilla reader/writer locks?
It's the same things essentially that doesn't support recursive access unlike critical sections for example.
I think I've found the way to solve this: after acquiring a synchronizer in the callback thread and triggering the callback itself, I'll set a variable internally to check in public API if it was acquired there or not. ๐
But yea, I think it would be nice to have a portable recursive synchronization mechanism.
Have a look here: d6f4921
Currently a thread cannot acquire a synchroniser for read when the same synchroniser has already been acquired for write. Reason #1 is for implementation simplicity (acquire_rd() currently waits until a write in progress has completed but this behaviour can be changed). Reason #2 is that if a write is in progress (by this thread), how do we know the state of the protected data? Perhaps it is only half updated. We don't want to allow "torn" reads. And we cannot wait until the write has completed because we would be waiting for ourselves (deadlock). But a thread can call rwsync_r_acquire_rd() recursively and rwsync_r_acquire_wr() recursively.
If you like this, I can do the same for the rwlock.
Awesome, thanks for this!
Yes, if you can do the same for the rwlock it would be great. โญ๏ธ
Fantastic, thanks! ๐ป
I am considering moving the owner field to a per-thread variable (just like the count variable). Then p64_rwlock_r_t would be identical to p64_rwlock_t (is this good or bad). The implementations could then also be merged and recursive mode always enabled. This would however require that the thread ID is always specified in the acquire_rd/wr calls... PROGRESS64 has no internal concept of thread ID.
The implementation has not been stress tested, just some basic smoke testing using the rwlock_r example (including some negative cases that lead to program abort, I removed those test cases afterwards).
Yea, no worries, I'll test it very soon. ๐
@WonderfulVoid Thanks, everything works great!
Here's a portable (I believe) way to get thread ID:
#ifndef UTILS_GETTID_H
#define UTILS_GETTID_H
#if defined(_WIN32)
#ifndef __MINGW32__
typedef DWORD pid_t;
#endif
#else
#include <unistd.h>
#include <sys/syscall.h>
#endif
#ifdef __cplusplus
extern "C" {
#endif
#if defined _WIN32
inline static pid_t gettid(void) {
return GetCurrentThreadId();
}
#elif defined __APPLE__
inline static pid_t gettid(void) {
return syscall(SYS_thread_selfid);
}
#else
inline static pid_t gettid(void) {
return syscall(__NR_gettid);
}
#endif
#ifdef __cplusplus
}
#endif
#endif //UTILS_GETTID_H
Thanks, I will see if I can add this to a new porting layer.
OK I have something working. The API will change, the tid parameter will be removed from the acquire calls.
It works great, thanks! I only wish that we could call multiple reads simultaneously after the write progress completed, that would just perfect. โ๏ธ
Found this article, so he's using an additional critical section on top to achieve this.
I don't understand your comment above.
I think the current design has a problem when a thread tries to acquire different locks. The per-thread rwl_count needs to be per-lock as well. The recursive rwsync_r does not have this problem, the recursive count is only used for write access so is associated with the synchroniser, not with the thread.
I have an idea on how to solve the problem though. We will need a per-thread stack of acquired rwlocks.
Looking forward for this. I'm happy to test any implementation. ๐
It works great, thanks! I only wish that we could call multiple reads simultaneously after the write progress completed, that would just perfect.
Still don't understand this comment. Is it still relevant?
Yes, due to:
//Not allowed to call acquire-read when the lock has already been acquired
//for write, protected data is in unknown state and the call cannot block
//waiting for the write to complete
//Recursive acquire-read calls allowed for the same rwlock
void p64_rwlock_r_acquire_rd(p64_rwlock_r_t *lock);
I'm getting rwlock_r: acquire-read after acquire-write
when read trying to acquire a synchronizer if the write is in progress.
Now you can acquire_rd after acquire_wr on the same lock. It is up to the application to handle the possibility of reading the protected data in the middle of its own update. You have been warned!
Thanks, I'll do a couple of tests. ๐ฌ
It works as intended, great job. ๐
Thanks. It was fun.