JuliaHubOSS/llvm-cbe

GEPs for pointers to zero-sized arrays don't work

hikari-no-yume opened this issue · 1 comments

Similar issue to #88.

Relevant bit of LLVM IR (generated by rustc):

  %winner_text.sroa.0.0 = phi [0 x i8]* [ bitcast (<{ [12 x i8] }>* @alloc78 to [0 x i8]*), %bb23 ], [ bitcast (<{ [13 x i8] }>* @alloc79 to [0 x i8]*), %bb22 ], [ bitcast (<{ [11 x i8] }>* @alloc80 to [0 x i8]*), %_ZN26noughts_and_crosses_no_std11find_winner17hb6620231b8be5649E.exit ], [ bitcast (<{ [11 x i8] }>* @alloc80 to [0 x i8]*), %bb2.backedge.i ]
  %6 = getelementptr [0 x i8], [0 x i8]* %winner_text.sroa.0.0, i64 0, i64 0 

Some of the relevant C code:

  void* llvm_cbe_winner_text_2e_sroa_2e_0_2e_0;
  void* llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY;
// …
llvm_cbe_bb22:
  llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY = ((void*)(&alloc79));   /* for PHI node */
  goto llvm_cbe_bb24;
  
llvm_cbe_bb23:
  llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY = ((void*)(&alloc78));   /* for PHI node */
  goto llvm_cbe_bb24;

llvm_cbe_bb24:
  llvm_cbe_winner_text_2e_sroa_2e_0_2e_0 = llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY; 
  puts(((&(*llvm_cbe_winner_text_2e_sroa_2e_0_2e_0).array[((int64_t)0)])));
  return 0;

The problem is this bit, which dereferences a variable declared as void* and then accesses an .array member on the result:

puts(((&(*llvm_cbe_winner_text_2e_sroa_2e_0_2e_0).array[((int64_t)0)])));

I think the correct C for this GEP would be something like (*(uint8_t **)&var_name). What I'm not sure about is how best to get there. Maybe we should rethink our zero-sized-type handling and start lowering [0 x Ty]* as Ty*? I'll experiment a bit.
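A minimal sketch of that idea, not taken from the backend itself. The globals text_a and text_b and the helpers select_text and gep_zero_sized are hypothetical stand-ins for the rustc-emitted allocations and the PHI/GEP pair. The sketch uses a plain cast, (uint8_t *)var_name, which should yield the same address as (*(uint8_t **)&var_name) while avoiding a read of a void* object through a differently typed lvalue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rustc-emitted globals: each is a struct
   wrapping a fixed-size byte array, like <{ [12 x i8] }> in the IR. */
struct text12 { uint8_t array[12]; };
struct text13 { uint8_t array[13]; };

static struct text12 text_a = { "Noughts win" };   /* 11 chars + NUL */
static struct text13 text_b = { "Crosses win!" };  /* 12 chars + NUL */

/* Mirrors the PHI node: the chosen pointer ends up in a void* variable,
   which is how the generated code currently declares it. */
static void *select_text(int which) {
  return which ? (void *)&text_b : (void *)&text_a;
}

/* The GEP with indices (0, idx) on a [0 x i8]*: treat the pointer as a
   byte pointer and step idx elements from it. No .array member is
   involved, so this is valid even though the variable is a void*. */
static const uint8_t *gep_zero_sized(void *var_name, int64_t idx) {
  assert(var_name != NULL);
  return (uint8_t *)var_name + idx;
}
```

With these definitions, the failing call would become something like puts((const char *)gep_zero_sized(select_text(0), 0)).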

I've tried this now. I made printTypeName skip through empty array types when printing a pointer's element type, and I made printGEPExpression output plain pointer arithmetic for zero-element arrays ( + (x) rather than .array[x]), similar to what it already does for vectors.

It seems to do what I want for this example:

  uint8_t* llvm_cbe_winner_text_2e_sroa_2e_0_2e_0;
  uint8_t* llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY;
// …
llvm_cbe_bb22:
  llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY = ((uint8_t*)(&alloc79));   /* for PHI node */
  goto llvm_cbe_bb24;

llvm_cbe_bb23:
  llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY = ((uint8_t*)(&alloc78));   /* for PHI node */
  goto llvm_cbe_bb24;

llvm_cbe_bb24:
  llvm_cbe_winner_text_2e_sroa_2e_0_2e_0 = llvm_cbe_winner_text_2e_sroa_2e_0_2e_0__PHI_TEMPORARY;
  puts(((&(*llvm_cbe_winner_text_2e_sroa_2e_0_2e_0))));
  return 0;
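One further scenario worth checking is a non-byte element type with a non-zero index. The sketch below is a hand-written illustration under assumptions, not backend output: the struct array4_u32, the global table, and the helpers as_zero_sized and gep are all hypothetical names. It shows a [0 x i32]* lowered as uint32_t* and the GEP written as pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical global: a struct wrapping a fixed-size array, which is
   where an .array member would normally come from. */
struct array4_u32 { uint32_t array[4]; };
static struct array4_u32 table = { { 10, 20, 30, 40 } };

/* A [0 x i32]* value lowered as a plain uint32_t*: the bitcast from the
   global's real type becomes an ordinary pointer cast. */
static uint32_t *as_zero_sized(struct array4_u32 *p) {
  return (uint32_t *)p;
}

/* getelementptr [0 x i32], [0 x i32]* %p, i64 0, i64 %x written as
   pointer arithmetic: (p) + (x) instead of (*p).array[x]. */
static uint32_t *gep(uint32_t *p, int64_t x) {
  assert(p != NULL);
  return p + x;
}
```

Here gep(as_zero_sized(&table), 3) points at the same element as &table.array[3], so the arithmetic form addresses the same storage the .array form would have if the pointee type were known.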

Of course, I should probably test a few more scenarios. There may be other instructions I need to account for specially.

Edit: Made this into a pull request, though it's still not very well tested: #129