volatile inline assembly
jacob-carlborg opened this issue · 8 comments
I have some inline assembly that LDC optimizes away when using the -O3 flag. The documentation of LDC's inline assembly refers to GDC's documentation, which refers to GCC's. With GCC's extended inline assembly you can use asm volatile to keep the compiler from optimizing the inline assembly away. I tried writing asm volatile { in D, but that resulted in syntax errors. Is there a corresponding way to do that in D?
Here's a reduced test case:
import ldc.attributes;

@naked extern (C) void foo()
{
    bar();
    asm
    {
        q"ASM
cli
1: hlt
jmp 1b
ASM";
    }
}

noreturn bar()
{
    while (true) {}
}
Compiling the above code with ldc2 --output-s main.d -c -betterC -mtriple i386-freestanding produces the following assembly:
.text
.file "main.d"
.section .text.foo,"ax",@progbits
.globl foo
.p2align 4, 0x90
.type foo,@function
foo:
.cfi_startproc
calll _D6foobar3barFZNn@PLT
#APP
cli
.Ltmp0:
hlt
jmp .Ltmp0
#NO_APP
.Lfunc_end0:
.size foo, .Lfunc_end0-foo
.cfi_endproc
.section .text._D6foobar3barFZNn,"ax",@progbits
.globl _D6foobar3barFZNn
.p2align 4, 0x90
.type _D6foobar3barFZNn,@function
_D6foobar3barFZNn:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset %ebp, -8
movl %esp, %ebp
.cfi_def_cfa_register %ebp
jmp .LBB1_1
.LBB1_1:
movb $1, %al
testb $1, %al
jne .LBB1_2
jmp .LBB1_4
.LBB1_2:
jmp .LBB1_3
.LBB1_3:
jmp .LBB1_1
.LBB1_4:
popl %ebp
.cfi_def_cfa %esp, 4
retl
.Lfunc_end1:
.size _D6foobar3barFZNn, .Lfunc_end1-_D6foobar3barFZNn
.cfi_endproc
.ident "ldc version 1.39.0"
.section ".note.GNU-stack","",@progbits
Adding the -O3 flag produces this assembly:
.text
.file "main.d"
.section .text.foo,"ax",@progbits
.globl foo
.p2align 4, 0x90
.type foo,@function
foo:
.cfi_startproc
.p2align 4, 0x90
.LBB0_1:
jmp .LBB0_1
.Lfunc_end0:
.size foo, .Lfunc_end0-foo
.cfi_endproc
.section .text._D6foobar3barFZNn,"ax",@progbits
.globl _D6foobar3barFZNn
.p2align 4, 0x90
.type _D6foobar3barFZNn,@function
_D6foobar3barFZNn:
.p2align 4, 0x90
.LBB1_1:
jmp .LBB1_1
.Lfunc_end1:
.size _D6foobar3barFZNn, .Lfunc_end1-_D6foobar3barFZNn
.ident "ldc version 1.39.0"
.section ".note.GNU-stack","",@progbits
With optimizations enabled, the cli and hlt instructions from the inline assembly have been removed and the call to bar has been inlined.
It seems I can add @optStrategy("none") to foo as a workaround.
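For reference, a minimal sketch of that workaround applied to the reduced test case. The only change is the extra attribute on foo; everything else is copied from the test case, and the target is assumed to be the same x86 freestanding one.

```d
import ldc.attributes;

// Sketch: @optStrategy("none") asks LDC to skip optimizations for this
// one function, even when the module is compiled with -O3.
@optStrategy("none") @naked extern (C) void foo()
{
    bar();
    asm
    {
        // The heredoc terminator has to start at the beginning of a line.
        q"ASM
cli
1: hlt
jmp 1b
ASM";
    }
}

noreturn bar()
{
    while (true) {}
}
```

Compiling with the same command line plus -O3 and --output-s is a quick way to check whether the cli and hlt instructions are still emitted for foo.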
I doubt the asm itself is being optimized; it seems to be removed simply because it comes after an infinite loop.
Yes, that seems to be the case.
Indeed, I think the asm code is optimized away because of the infinite loop or the noreturn annotation of bar.
To answer your question about the volatile equivalent: I think specifying "memory" as a clobber for the asm code may do the trick (add : : : "memory" to your asm sequence). See https://stackoverflow.com/questions/14449141/the-difference-between-asm-asm-volatile-and-clobbering-memory
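A rough sketch of what that suggestion looks like in LDC's GCC-style asm syntax. The function name haltLoop is hypothetical, the instructions are taken from the test case, and an x86 target is assumed. The clobber only tells the compiler that the asm may read or write arbitrary memory; it does not make code after a noreturn call reachable.

```d
// Hypothetical helper, not part of the original test case.
extern (C) void haltLoop()
{
    asm
    {
        // Template string, then empty output and input operand lists,
        // then "memory" in the clobber list.
        "cli\n1: hlt\njmp 1b" : : : "memory";
    }
}
```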
Adding : : : "memory" did not help, unfortunately.
What do you expect with that infinite loop? Is there a clang equivalent where the asm is kept? I very much doubt so.
> Is there a clang equivalent where the asm is kept? I very much doubt so.
You are correct. Both Clang and GCC remove the inline assembly regardless of whether volatile and/or : : : "memory" is used. Clang even removed it without optimizations enabled when I used _Noreturn.
> What do you expect with that infinite loop?
It's part of an OS kernel. I'm not an expert in this subject but, as far as I understand, an interrupt can break/pause the infinite loop.
But I probably won't need the infinite loop anyway. I think I can close this issue. Thanks for the input.
Yeah, I guess the @optStrategy("none") workaround is the best option here (and sufficient).