ivmai/libatomic_ops

riscv support

oaken-source opened this issue · 36 comments

Are there any plans for adding support for riscv64 / riscv32 platforms?
Can I do anything to help? :)

ivmai commented

I haven't heard someone is working on it. I'm ready to accept the patches (to master branch).
In case of the GCC builtin atomic primitives are implemented correctly for the target, adding support of it is just writing several lines (like #include "generic.h").

+ @brucehoult

Generic support should work fine.

Any Linux-capable and/or multi-processor RISC-V implementation should include the "A" (Atomic) extension, which was designed to map directly to the requirements for the "release consistency" model. There are Load Reserved and Store Conditional instructions (with up to 16 instructions between them), and also single atomic RMW swap, add, and, or, xor, max, umax, min, umin that map directly to C++11 requirements.

Even the 32 bit microcontroller-class FE310 SoC in the HiFive1, LoFive etc implements RV32IMAC .e. has the atomic instructions.

I'll have a quad core 1.5 GHz 64 bit RISC-V Linux board in about six or eight weeks and would be happy to test BDWGC on it then. In the meantime the only option is qemu or HiFive1. The former supports Linux with multiple simulated cores, but unfortunately at present runs them all on a single host core. Fixing that is getting near the top of the qemu enhancement list.

@brucehoult 1. A versus !A is moot for this ticket because the gcc builtins will do the right thing (AMO or syscall) in either case. 2. multi-threaded QEMU is working now on qemu-upstream-v5

@sorear oh, that's great! I'm sure Michael Clark was saying only a few days ago that multiple core support doesn't work yet.

So if @ivmai is interested, he could follow the links from https://github.com/riscv/riscv-qemu/wiki to build the latest RISC-V QEMU and a simple linux image to run in it.

@sorear ha! In fact qemu-upstream-v5 won't build for me -- because of atomics! v4 and v3 are fine.

mkdir build
cd build
../configure --target-list=riscv64-linux-user,riscv64-softmmu
make

Gives...

bruce@HackPro:~/riscv/riscv-qemu/build$ time make
  CC      riscv64-linux-user/target/riscv/cpu.o
In file included from /home/bruce/riscv/riscv-qemu/include/qemu/osdep.h:36:0,
                 from /home/bruce/riscv/riscv-qemu/target/riscv/cpu.c:21:
/home/bruce/riscv/riscv-qemu/target/riscv/cpu.c: In function ‘riscv_cpu_has_work’:
/home/bruce/riscv/riscv-qemu/target/riscv/cpu.c:215:29: error: ‘CPURISCVState {aka struct CPURISCVState}’ has no member named ‘mip’
     return (atomic_read(&env->mip) & env->mie) != 0;
                             ^
/home/bruce/riscv/riscv-qemu/include/qemu/compiler.h:86:47: note: in definition of macro ‘QEMU_BUILD_BUG_ON’
 #define QEMU_BUILD_BUG_ON(x) _Static_assert(!(x), "not expecting: " #x)
                                               ^

And a number of others about both mip and mie.

@brucehoult check out the newest PR on riscv-qemu

@sorear doing only softmmu works (builds) in qemu-upstream-v5

yes, I committed a WFI improvement to qemu-upstream-v5 without testing linux-user, riscvarchive/riscv-qemu#104 fixes linux-user, sorry about that

@sorear after building the latest qemu and busybox linux image following instructions at https://github.com/riscv/riscv-qemu/wiki, it only uses one host core when I run:

for x in `seq 1 4`;do (while true; do echo thread_$x; done &); done

Looking at /proc/cpuinfo shows the emulated machine only has one core.

What exactly do I need to add to have multiple simulated cores using multiple host threads/cores?

@brucehoult are you on the qemu-upstream-v5 branch?

SMP was working in recent versions of riscv-qemu but it was multiplexed. @sorear added the correct locking around interrupt handling and enabled mttcg.

@michaeljclark yes I am, and I'm running the qemu command on the wiki page and it's not working. What else is needed? I've tried adding "-smp cores=4" or -smp cpus=4" and it doesn't change anything.

So /proc/cpuinfo is showing 4 cpus I assume?

@michaeljclark no, it's showing one

Oh. You need an SMP kernel then perhaps? The configs on the wiki do not have CONFIG_SMP

@michaeljclark detailed instructions, please :-)

@michaeljclark Linux ucbvax 4.14.0-00030-gc2d852c #1 Sun Feb 18 13:46:10 MSK 2018 riscv64 GNU/Linux

make ARCH=riscv menuconfig

  • Platform Type -> [*] Symmetric Multiprocessing
  • Platform Type -> (8) Maximum number of CPUs (2-32)
$ grep SMP .config
CONFIG_SMP=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
$ grep NR_CPU .config
CONFIG_NR_CPUS=8

@michaeljclark change riscv-linux/.config "# CONFIG_SMP is not set" to "=y" ? anything else?

@michaeljclark aha ok, thanks

That should be it. Note BBL is hardcoded to 8 thread stacks. So more than 8 CPUs won't work without patching bbl.

If you have a particularly big machine, you could do this to riscv-pk

mclark@minty:~/src/sifive/riscv-pk$ git diff machine/mtrap.h 
diff --git a/machine/mtrap.h b/machine/mtrap.h
index eafdb14..d417798 100644
--- a/machine/mtrap.h
+++ b/machine/mtrap.h
@@ -4,7 +4,7 @@
 #include "encoding.h"
 
 #ifdef __riscv_atomic
-# define MAX_HARTS 8 // arbitrary
+# define MAX_HARTS 32 // arbitrary
 #else
 # define MAX_HARTS 1
 #endif

@michaeljclark ok, got 4 cpus, and the above forking off four background shell scripts got qemu-system-riscv64 up to 280% CPu use, so that's something.

Next: networking is not working :(

But anyway, that might be enough to try Boehm GC tests. (cross-compiled of course)

ivmai commented

Thank you for the discussion and testing.

ivmai commented

Related topic in bdwgc: ivmai/bdwgc#208

ivmai commented

Are _GCC_HAVE_SYNC_COMPARE_AND_SWAP where n=1,2,4,8 (and 16 in case of riscv64) defined by the compiler?
I found another patch (https://github.com/AOSC-Dev/aosc-os-abbs/blob/staging/base-libs/libatomic-ops/autobuild/patches/libatomic_ops-riscv.patch) by @Icenowy that defines AO_GCC_FORCE_HAVE_CAS because _GCC_HAVE_SYNC_COMPARE_AND_SWAP are missing.

stage3:/# gcc -dM -E x.c | grep COMPARE
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 1

This is probably the same bug as riscvarchive/riscv-gcc#15

bruce@HackPro:~$ riscv64-unknown-linux-gnu-gcc -dM -E - < /dev/null | grep SYNC
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 1

ivmai commented

Could you please "make check" with CFLAGS_EXTRA="-DAO_GCC_FORCE_HAVE_CAS" ?

ivmai commented

bruce@HackPro:~$ riscv64-unknown-linux-gnu-gcc -dM -E - < /dev/null | grep SYNC

@bruce which compiler version?

Another question: is there double-word atomic operations?

@ivmai freshly built 7.2.0 from the latest master at https://github.com/riscv/riscv-gnu-toolchain which was commit 1b80cbe9 "Jan 31 17:51:30 Merge pull request #320 from riscv/expat-not-native"

The missing defines for the other sizes are I guess a rough edge that should be smoothed out sometime soon. Maybe @jim-wilson is the best person to know how likely or soon that will be.

and riscv does not have double-word atomic operations

You can get the missing atomic operations by adding -latomic. This is already done automatically when -pthread is used. There is a proposal to add this unconditionally, as some other packages need it too, such as jemalloc.

Eventually the missing atomic operations should be expanded inline. I don't know when that will happen.

ivmai commented

You can get the missing atomic operations by adding -latomic

@jim-wilson do you mean double-word ops?

@ivmai If double word ops are what you are missing, then yes. -latomic supports all of the atomic operations, and provides the ones that gcc can't open code. libatomic is built for all gcc targets, but some gcc targets can open code so many atomic operations that you never need to link with -latomic. RISC-V is not there yet, and hence -latomic is needed. Some other targets require -latomic also.

the double-pointer case is not lock-free though. the byte and halfword cases are lock-free, but gcc erroneously reports that they aren't.

ivmai commented

I've added a workaround for missing _GCC_HAVE_SYNC_COMPARE_AND_SWAP_1/2 - 393d7a7.
Please test.