Non-ARM x86-64 desktop CPUs compatibility

Question

Closed this issue 6 years ago · 33 comments

Thanks for the neat code. I was looking for portable synchronization mechanisms for my concurrent networking library, and it seems that I've finally found them. I have just one question, if you don't mind. 😸

Is it safe to use the rwsync implementation on non-ARM x86-64 desktop CPUs when multiple threads (2-3) call the wr/rd functions on a single synchronizer simultaneously? My basic tests show good results on an AMD FX-4300, but I haven't tested it on Intel yet. Are there any caveats that I should be aware of?

WonderfulVoid commented 6 years ago

7650d81

WonderfulVoid commented 6 years ago

4f09a39

WonderfulVoid commented 6 years ago

7dd6944

WonderfulVoid commented 6 years ago

f409a74

Answer 1 · 2019-03-09T21:39:58.000Z

Yes, rwsync should be safe to use on all x86-64 processors. I do most of my testing on ARM (AArch64) processors, but since I depend mostly on the compiler, there should be less risk of target-specific problems.

Please speak up if you are missing some scalable datatype. I love to invent new algorithms.

Answer 2 · 2019-03-10T04:21:53.000Z

Thanks, that's great! Currently, I'm using rwsync to access shared structures for read/write operations, but I'm also planning to integrate an SPSC ring buffer to dispatch transport messages from the network thread to the main thread, since this approach works quite well for me in a managed environment and it's very convenient.

I think this library provides everything that I currently need. Thanks for your work. Cheers. 😺
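
For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of a single-producer/single-consumer ring buffer using C11 atomics. It is not the PROGRESS64 ring buffer and does not use its API; `spsc_ring_t`, `spsc_enqueue`, `spsc_dequeue` and `RING_SIZE` are hypothetical names chosen for illustration, and the sketch assumes exactly one network thread enqueues while exactly one main thread dequeues.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8 //Hypothetical capacity, must be a power of two

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be a power of two");

typedef struct {
	void *slots[RING_SIZE];
	atomic_size_t head; //Next slot to read, written only by the consumer
	atomic_size_t tail; //Next slot to write, written only by the producer
} spsc_ring_t;

//Producer side (e.g. the network thread), returns false when the ring is full
bool spsc_enqueue(spsc_ring_t *r, void *msg) {
	size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	if (tail - head == RING_SIZE) {
		return false;
	}
	r->slots[tail & (RING_SIZE - 1)] = msg;
	//Release makes the slot contents visible before the new tail
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}

//Consumer side (e.g. the main thread), returns NULL when the ring is empty
void *spsc_dequeue(spsc_ring_t *r) {
	size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
	if (head == tail) {
		return NULL;
	}
	void *msg = r->slots[head & (RING_SIZE - 1)];
	//Release hands the slot back to the producer only after it has been read
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return msg;
}
```

A zero-initialized `spsc_ring_t` starts out empty. Because each index is written by only one thread, no lock is needed as long as the one-producer/one-consumer assumption holds.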

Answer 3 · 2019-03-12T01:31:53.000Z

@WonderfulVoid Maybe you can help me to solve some design problem that I encountered recently...

I'm writing a networking library where a user registers a callback function which is triggered from a separate thread when network events occur. Before the callback is triggered, a synchronizer is acquired for the logic around shared data. In the library's public API, I provide functionality that also acquires the same synchronizer for read/write operations on the same shared data. Everything is okay while this functionality is used outside of the callback, but if a user calls public functions within the callback as part of a logic chain, a deadlock will occur because the synchronizer is already acquired.

I'm trying to find a way to solve this transparently for the user. A rough way is to add a boolean parameter to the public API which indicates that a function is being called within the callback and that the synchronizer shouldn't be acquired.

Any ideas?

Answer 4 · 2019-03-12T02:24:49.000Z

I think a sort of recursive SRW which tracks threads that acquired a synchronizer would be great. 🤔

Answer 5 · 2019-03-12T03:45:34.000Z

Found the implementation on top of critical sections, which already support recursive access, but they still added it for some reason.

Is it possible to add a similar owner check to the rwlock/rwsync implementation? I'm afraid that there's no portable way to do this... 🤔

Answer 6 · 2019-03-12T12:21:37.000Z

Let me check out these things.
How are SRW locks different from vanilla reader/writer locks?

Answer 7 · 2019-03-12T12:41:42.000Z

They're essentially the same thing. Neither supports recursive access, unlike critical sections, for example.

Answer 8 · 2019-03-12T13:40:14.000Z

I think I've found a way to solve this: when the callback thread acquires the synchronizer and triggers the callback, I'll set an internal variable that the public API can check to see whether the synchronizer has already been acquired there. 👌

But yea, I think it would be nice to have a portable recursive synchronization mechanism.
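
The internal-variable idea could look roughly like the sketch below. All names (`sync_lock`, `in_callback`, `api_get_value`, `api_set_value`, `dispatch_event`, `example_callback`) are hypothetical, a plain pthread mutex stands in for the synchronizer, and the flag is assumed to be thread-local (C11 `_Thread_local`) so that only the callback thread skips the acquire.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

//Hypothetical stand-ins for the library's synchronizer and shared data
static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static int shared_value;

//True only on the thread that is currently running the user callback
static _Thread_local bool in_callback = false;

typedef void (*event_cb)(void);

//Public API: skips the acquire when called from inside the callback
int api_get_value(void) {
	if (in_callback) {
		return shared_value; //Lock is already held by this thread
	}
	pthread_mutex_lock(&sync_lock);
	int v = shared_value;
	pthread_mutex_unlock(&sync_lock);
	return v;
}

void api_set_value(int v) {
	if (in_callback) {
		shared_value = v; //Lock is already held by this thread
		return;
	}
	pthread_mutex_lock(&sync_lock);
	shared_value = v;
	pthread_mutex_unlock(&sync_lock);
}

//Network thread: acquire, mark, invoke the callback, unmark, release
void dispatch_event(event_cb cb) {
	assert(!in_callback); //Nested dispatch is not handled by this sketch
	pthread_mutex_lock(&sync_lock);
	in_callback = true;
	cb();
	in_callback = false;
	pthread_mutex_unlock(&sync_lock);
}

//Example user callback that chains public API calls
void example_callback(void) {
	api_set_value(api_get_value() + 1);
}
```

Calling `dispatch_event(example_callback)` runs the chained API calls without a second acquire, while the same functions called from any other thread still take the lock as usual.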

Answer 9 · 2019-03-12T23:43:36.000Z

Have a look here: d6f4921

Currently a thread cannot acquire a synchroniser for read when the same synchroniser has already been acquired for write. Reason #1 is for implementation simplicity (acquire_rd() currently waits until a write in progress has completed but this behaviour can be changed). Reason #2 is that if a write is in progress (by this thread), how do we know the state of the protected data? Perhaps it is only half updated. We don't want to allow "torn" reads. And we cannot wait until the write has completed because we would be waiting for ourselves (deadlock). But a thread can call rwsync_r_acquire_rd() recursively and rwsync_r_acquire_wr() recursively.

If you like this, I can do the same for the rwlock.
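
To illustrate the recursive write case in general terms, here is a small sketch of an owner-plus-count wrapper. It is not the code from the commit above and not the PROGRESS64 API: `rec_wrlock_t` and its functions are hypothetical, a pthread mutex stands in for the underlying non-recursive lock, and the address of a thread-local object is assumed to be an acceptable thread identity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

//Address of a thread-local object serves as a cheap per-thread identity
static _Thread_local char self_anchor;

typedef struct {
	pthread_mutex_t mtx;   //Underlying non-recursive lock
	_Atomic(void *) owner; //Identity of the current writer, NULL when unowned
	unsigned count;        //Recursion depth, only touched by the owner
} rec_wrlock_t;

void rec_wrlock_init(rec_wrlock_t *l) {
	pthread_mutex_init(&l->mtx, NULL);
	atomic_init(&l->owner, NULL);
	l->count = 0;
}

void rec_wrlock_acquire(rec_wrlock_t *l) {
	void *self = &self_anchor;
	//Only this thread ever stores its own identity, so a match means we own the lock
	if (atomic_load_explicit(&l->owner, memory_order_relaxed) == self) {
		l->count++; //Recursive acquire, no blocking
		return;
	}
	pthread_mutex_lock(&l->mtx);
	atomic_store_explicit(&l->owner, self, memory_order_relaxed);
	l->count = 1;
}

void rec_wrlock_release(rec_wrlock_t *l) {
	void *self = &self_anchor;
	assert(atomic_load_explicit(&l->owner, memory_order_relaxed) == self);
	if (--l->count == 0) {
		//Clear the owner before releasing so that no other thread sees a stale owner
		atomic_store_explicit(&l->owner, NULL, memory_order_relaxed);
		pthread_mutex_unlock(&l->mtx);
	}
}

//Helper for inspection, meaningful only when called by the owner or when unowned
unsigned rec_wrlock_depth(const rec_wrlock_t *l) {
	return l->count;
}
```

Only the outermost release actually unlocks, so nested acquire/release pairs on the same thread neither block nor release early.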

Answer 10 · 2019-03-13T00:02:18.000Z

Awesome, thanks for this!

Yes, if you can do the same for the rwlock it would be great. ⭐️

Answer 11 · 2019-03-13T22:29:52.000Z

Fantastic, thanks! 🍻

Answer 12 · 2019-03-13T22:30:10.000Z

I am considering moving the owner field to a per-thread variable (just like the count variable). Then p64_rwlock_r_t would be identical to p64_rwlock_t (is this good or bad?). The implementations could then also be merged and recursive mode always enabled. This would, however, require that the thread ID is always specified in the acquire_rd/wr calls... PROGRESS64 has no internal concept of thread ID.

Answer 13 · 2019-03-13T22:32:10.000Z

The implementation has not been stress tested, just some basic smoke testing using the rwlock_r example (including some negative cases that lead to program abort, I removed those test cases afterwards).

Answer 14 · 2019-03-13T22:33:20.000Z

Yea, no worries, I'll test it very soon. 👌

Answer 15 · 2019-03-17T14:03:06.000Z

@WonderfulVoid Thanks, everything works great!

Answer 16 · 2019-03-18T14:27:05.000Z

Here's a portable (I believe) way to get thread ID:

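#ifndef UTILS_GETTID_H
#define UTILS_GETTID_H

#if defined(_WIN32)
	#include <windows.h> //DWORD, GetCurrentThreadId()
	#ifdef __MINGW32__
		#include <sys/types.h> //MinGW provides pid_t
	#else
		typedef DWORD pid_t; //MSVC has no pid_t
	#endif
#else
	#include <sys/types.h> //pid_t
	#include <unistd.h> //syscall()
	#include <sys/syscall.h> //SYS_thread_selfid, __NR_gettid
#endif

#ifdef  __cplusplus
extern "C" {
#endif

#if defined _WIN32
	//Windows: thread ID of the calling thread
	inline static pid_t gettid(void) {
		return (pid_t)GetCurrentThreadId();
	}
#elif defined __APPLE__
	//macOS: kernel thread ID via raw system call
	inline static pid_t gettid(void) {
		return (pid_t)syscall(SYS_thread_selfid);
	}
#else
	//Linux: kernel thread ID via raw system call
	inline static pid_t gettid(void) {
		return (pid_t)syscall(__NR_gettid);
	}
#endif

#ifdef  __cplusplus
}
#endif

#endif //UTILS_GETTID_H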

Answer 17 · 2019-03-18T14:51:54.000Z

Thanks, I will see if I can add this to a new porting layer.

Answer 18 · 2019-03-19T20:59:47.000Z

OK I have something working. The API will change, the tid parameter will be removed from the acquire calls.

Answer 19 · 2019-03-20T03:45:47.000Z

It works great, thanks! I only wish that we could call multiple reads simultaneously after the write progress completed; that would be just perfect. ☁️

Answer 20 · 2019-03-20T06:54:17.000Z

Found this article, so he's using an additional critical section on top to achieve this.

Answer 21 · 2019-03-20T11:59:37.000Z

I don't understand your comment above.

I think the current design has a problem when a thread tries to acquire different locks. The per-thread rwl_count needs to be per-lock as well. The recursive rwsync_r does not have this problem, the recursive count is only used for write access so is associated with the synchroniser, not with the thread.

Answer 22 · 2019-03-20T15:49:08.000Z

I have an idea on how to solve the problem though. We will need a per-thread stack of acquired rwlocks.

Answer 23 · 2019-03-20T17:19:05.000Z

Looking forward to this. I'm happy to test any implementation. 👍

Answer 24 · 2019-03-21T00:43:22.000Z

> It works great, thanks! I only wish that we could call multiple reads simultaneously after the write progress completed; that would be just perfect.

Still don't understand this comment. Is it still relevant?

Answer 25 · 2019-03-21T05:39:40.000Z

Yes, due to:

//Not allowed to call acquire-read when the lock has already been acquired
//for write, protected data is in unknown state and the call cannot block
//waiting for the write to complete
//Recursive acquire-read calls allowed for the same rwlock
void p64_rwlock_r_acquire_rd(p64_rwlock_r_t *lock);

I'm getting "rwlock_r: acquire-read after acquire-write" when a read tries to acquire the synchronizer while the write is in progress.

Answer 26 · 2019-03-21T09:58:05.000Z

Now you can acquire_rd after acquire_wr on the same lock. It is up to the application to handle the possibility of reading the protected data in the middle of its own update. You have been warned!

Answer 27 · 2019-03-21T12:51:25.000Z

Thanks, I'll do a couple of tests. 🔬

Answer 28 · 2019-03-23T15:19:09.000Z

It works as intended, great job. 👍

Answer 29 · 2019-03-24T21:13:43.000Z

Thanks. It was fun.