Wrong result from "vmerge_vvm_i32m1"
Erucaaa opened this issue · 5 comments
I am trying to test the intrinsic "vmerge_vvm_i32m1", but I get the wrong result. My test code is as follows:
void col_trn4(int32_t *a)
{
    vint32m1_t v0,v1,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15,v16,v17,v18;
    vint32m1_t v0temp,v1temp,v2temp,v3temp,v4temp,v5temp,v6temp,v7temp,v8temp,v9temp,v10temp;
    size_t vl = vsetvl_e32m1(4);
    vuint32m1_t col_index = vle32_v_u32m1(index_array, vl);
    // load the four rows of the 4x4 matrix
    v0 = vle32_v_i32m1(a, vl);
    v1 = vle32_v_i32m1(a+4, vl);
    v2 = vle32_v_i32m1(a+8, vl);
    v3 = vle32_v_i32m1(a+12, vl);
    // compute the mask
    int32_t flags[4] = {2,1,2,1};
    vint32m1_t v = vle32_v_i32m1(flags, vl);
    vbool32_t mask = vmseq_vx_i32m1_b32(v, 1, vl); // 0,1,0,1
    v4 = vmerge_vvm_i32m1(mask, v0, v2, vl);
    v5 = vmerge_vvm_i32m1(mask, v1, v3, vl);
    v6 = vmerge_vvm_i32m1(mask, v2, v0, vl);
    vse32_v_i32m1(a, v4, vl);
    vse32_v_i32m1(a+4, v5, vl);
    // vse32_v_i32m1(a+8, v6, vl);
    v7 = vmerge_vvm_i32m1(mask, v3, v1, vl);
}
The test array:
a = {6, 30, 10, 26,
     18, 45, 29, 30,
     29, 48, 34, 33,
     36, 53, 40, 49}
The result is:
v4: 6 48 10 33
v5: 18 53 29 49
But when I add the instruction "vse32_v_i32m1(a+8, v6, vl)", the result of v4 changes!
v4: 6 30 10 26
v5: 18 53 29 49
v6: 29 48 34 33
WHY? I'm so confused about this.
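For comparison, here is a minimal scalar sketch of what the merges in the test should produce. It assumes the old-API semantics of vmerge_vvm_i32m1(mask, op1, op2, vl), where element i is taken from op2 when mask bit i is set and from op1 otherwise. The helper names merge_ref and col_trn4_ref are made up for illustration and are not part of any RVV header.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model (assumed semantics) of vmerge_vvm_i32m1(mask, op1, op2, vl):
   out[i] = mask[i] ? op2[i] : op1[i]. */
static void merge_ref(const uint8_t *mask, const int32_t *op1,
                      const int32_t *op2, int32_t *out, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        out[i] = mask[i] ? op2[i] : op1[i];
}

/* Hypothetical reference for col_trn4: computes the expected v4..v7
   from a 4x4 row-major input without modifying the input. */
static void col_trn4_ref(const int32_t a[16], int32_t v4[4], int32_t v5[4],
                         int32_t v6[4], int32_t v7[4])
{
    const int32_t flags[4] = {2, 1, 2, 1};
    uint8_t mask[4];

    /* Same comparison as vmseq_vx_i32m1_b32(v, 1, vl): mask = 0,1,0,1 */
    for (size_t i = 0; i < 4; i++)
        mask[i] = (uint8_t)(flags[i] == 1);

    merge_ref(mask, a,      a + 8,  v4, 4);  /* v4 = merge(mask, v0, v2) */
    merge_ref(mask, a + 4,  a + 12, v5, 4);  /* v5 = merge(mask, v1, v3) */
    merge_ref(mask, a + 8,  a,      v6, 4);  /* v6 = merge(mask, v2, v0) */
    merge_ref(mask, a + 12, a + 4,  v7, 4);  /* v7 = merge(mask, v3, v1) */
}
```

Under this model the first output reported above (v4: 6 48 10 33, v5: 18 53 29 49) is what the merges should produce, and v6 should be 29 30 34 26. The values printed after adding the third store (v4: 6 30 10 26, v6: 29 48 34 33) are simply the unmodified rows v0 and v2.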
I don't see anything obviously wrong. What compiler are you using?
/opt/gcc10.2/native/lib/gcc/riscv64-linux-gnu/10.2.0/specs
But when I use a new array b to store the results, the printed output is correct:
int32_t b[16] = {0};
vse32_v_i32m1(b, v4, vl);
vse32_v_i32m1(b+4, v5, vl);
vse32_v_i32m1(b+8, v6, vl);
I didn't know that gcc 10.2 supported RISC-V vector intrinsics, but I'm most familiar with clang.
It seems that you are using an old, obsolete RVV intrinsic API.
For example, you should use __riscv_vse32_v_i32m1 instead of vse32_v_i32m1.
To get the latest stable RVV features, you should use the latest GCC (GCC 14).
It's simple: replace the "gcc" directory in https://github.com/riscv-collab/riscv-gnu-toolchain with
https://github.com/gcc-mirror/gcc
Then build it. You will get the latest RVV support.
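As a rough illustration of the prefixed API, one of the merges from the test could be written as below. This is a sketch under assumptions: it relies on the __riscv_v_intrinsic version macro being at least 12000 to detect a toolchain with the prefixed intrinsics, and on the mask being passed after the two data operands in __riscv_vmerge_vvm_i32m1 (unlike the old vmerge_vvm_i32m1, which takes it first). The function name merge_where_flag_is_one is hypothetical, and the scalar branch exists only so the file also builds without RVV.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: prefixed intrinsics are available when this macro is >= 12000. */
#if defined(__riscv_v_intrinsic) && __riscv_v_intrinsic >= 12000
#include <riscv_vector.h>
#define HAVE_PREFIXED_RVV 1
#else
#define HAVE_PREFIXED_RVV 0
#endif

/* Hypothetical helper: out[i] = (flags[i] == 1) ? b[i] : a[i] for n elements. */
void merge_where_flag_is_one(const int32_t *flags, const int32_t *a,
                             const int32_t *b, int32_t *out, size_t n)
{
#if HAVE_PREFIXED_RVV
    for (size_t i = 0; i < n; ) {
        size_t vl = __riscv_vsetvl_e32m1(n - i);
        vint32m1_t vf = __riscv_vle32_v_i32m1(flags + i, vl);
        vint32m1_t va = __riscv_vle32_v_i32m1(a + i, vl);
        vint32m1_t vb = __riscv_vle32_v_i32m1(b + i, vl);
        vbool32_t mask = __riscv_vmseq_vx_i32m1_b32(vf, 1, vl);
        /* Mask comes after the data operands in the prefixed API. */
        vint32m1_t vr = __riscv_vmerge_vvm_i32m1(va, vb, mask, vl);
        __riscv_vse32_v_i32m1(out + i, vr, vl);
        i += vl;
    }
#else
    /* Scalar fallback with the same element-wise behavior. */
    for (size_t i = 0; i < n; i++)
        out[i] = (flags[i] == 1) ? b[i] : a[i];
#endif
}
```

Calling it with flags = {2,1,2,1}, a pointing at row 0 and b pointing at row 2 of the test array corresponds to the v4 merge in the original test; writing into a separate out buffer keeps the input rows intact.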
This seems to be a toolchain implementation bug, and the toolchain in question is not an upstream one, so closing.