m-j-w/CpuId.jl

Segfault on Julia 0.7-DEV due to new inliner

Closed this issue · 9 comments

m-j-w commented

Calling cpuinfo() segfaults when the new Julia 0.7-DEV inliner logic is applied. Apparently, the information of modified output registers is neglected.

@code_llvm cpuinfo()

gives

; Location: /home/mjw/Code/Julia/CpuId/src/CpuId.jl:961
; Function address_size; {
; Location: /home/mjw/Code/Julia/CpuId/src/CpuId.jl:632
; Function hasleaf; {
; Location: /home/mjw/Code/Julia/CpuId/src/CpuId.jl:62
  %21 = call { i32, i32, i32, i32 } asm sideeffect "cpuid", "={eax},={ebx},={ecx},={edx},{eax},{ecx},~{dirflag},~{fpsr},~{flags}"(i32 -2147483648, i32 0) #5
  %22 = extractvalue { i32, i32, i32, i32 } %21, 0
;}
  %23 = icmp ult i32 %22, -2147483640
  br i1 %23, label %L94, label %L64

L64:                                              ; preds = %L35
; Location: /home/mjw/Code/Julia/CpuId/src/CpuId.jl:634
  %24 = call { i32, i32, i32, i32 } asm sideeffect "cpuid", "={eax},={ebx},={ecx},={edx},{eax},{ecx},~{dirflag},~{fpsr},~{flags}"(i32 -2147483640, i32 0) #5
  %25 = extractvalue { i32, i32, i32, i32 } %24, 0
; Location: /home/mjw/Code/Julia/CpuId/src/CpuId.jl:635
  %26 = lshr i32 %25, 8
  %27 = and i32 %26, 255
  %28 = zext i32 %27 to i64       # <=== This is the bextrl instruction seen below
  br label %L94

L94:                                              ; preds = %L64, %L35
  %"#temp#16.0" = phi i64 [ %28, %L64 ], [ 0, %L35 ]
;}
...

which seems okay.

The assembly however assumes ebx to not be changed.

@code_native cpuinfo()

gives

...
; Location: CpuId.jl:961
; Function address_size; {
; Location: CpuId.jl:632
; Function hasleaf; {
; Location: CpuId.jl:62
	movl	$2147483648, %eax       # imm = 0x80000000
	xorl	%ecx, %ecx
	cpuid
;}
	cmpl	$2147483656, %eax       # imm = 0x80000008
	jb	L250
; Location: CpuId.jl:634
	movl	$2147483656, %eax       # imm = 0x80000008
	xorl	%ecx, %ecx
	cpuid                   # <=== Modifies eax, ebx, ecx, edx
; Location: CpuId.jl:635
	movl	$2056, %ecx             # imm = 0x808
	bextrl	%ecx, %eax, %r12d
L250:
	movq	%rsi, 32(%rbx)  # <=== Writes to random address stored in rbx => Segfault
	xorl	%r13d, %r13d
;}
...

When tagging the underlying functions as @noinline, or disabling the optimizer with julia -O0 everything works fine.

Thanks for the package, really neat. I am considering dropping Hwloc.jl as a dependency in favor of CpuId.jl. The only information I need is the number of physical cores in the CPU. Do you think I would loose some portability? I saw that you you wrote on the README that AMD processors aren't supported yet.

Please let me know of your plans to add support to Julia v0.7 and the extent to which the function cpucores() is portable to AMD.

m-j-w commented

@juliohm I'll give the update to 0.7 a try in the next days. There are also some AMD features currently not working. AMD uses e.g different flags for cache sizes.

In general, this approach can work for all x86 compatible CPUs, Intel and AMD, that is all CPUs that have the cpuid instruction. So that limits portability obviously. But includes quite a large number of usecases.
It excludes, however, e.g. ARM CPUs.

A library such as hwloc has of course a somewhat large industry consortium behind it, that can ensure correctness and completeness for querying all sorts of CPUs. On the other hand, a pure Julia library written in Julia does not require other external packages. For me, the latter suffices my usecases to esily determine number of cores and cache sizes and not having to add static hostname lists for comparison to my code.

Since I only need the number of physical cores, I think CpuId.jl is a much nicer solution. Please let me know when you have a version tagged for Julia v0.7, and I will remove Hwloc.jl from my packages.

Hi @m-j-w, please let me know if you have a deadline in mind, and I will plan accordingly. I understand if you are too busy with work. :)

m-j-w commented

I'm waiting for a bugfix to become available in the Windows 64 Julia nightly build. Doesn't seem to get updated currently.

m-j-w commented

@juliohm You find my current work in progress in this branch

Thank you @m-j-w , I am looking forward to it :)

m-j-w commented

Okay, tests succeed for Linux, Mac and Win64. Win32 fails. Vendor string detection not yet correct due to compilation issues with llvm code.

m-j-w commented

Closed by #25