volatile inline assembly
jacob-carlborg opened this issue · 8 comments
I have some inline assembly that LDC optimizes away when using the -O3 flag. The documentation of LDC's inline assembly refers to GDC's documentation, which refers to GCC's. With GCC's extended inline assembly you can use asm volatile to prevent the inline assembly from being optimized away. I tried asm volatile {, but that resulted in syntax errors. Is there a corresponding way to do that in D?
Here's a reduced test case:
import ldc.attributes;
@naked extern (C) void foo()
{
    bar();
    asm { "cli\n1: hlt\njmp 1b"; }
}
noreturn bar()
{
    while (true) {}
}
Compiling the above code with ldc2 --output-s main.d -c -betterC -mtriple i386-freestanding produces the following assembly:
.file "main.d"
.section .text.foo,"ax",@progbits
.globl foo
.p2align 4, 0x90
.type foo,@function
calll _D6foobar3barFZNn@PLT
jmp .Ltmp0
.size foo, .Lfunc_end0-foo
.section .text._D6foobar3barFZNn,"ax",@progbits
.globl _D6foobar3barFZNn
.p2align 4, 0x90
.type _D6foobar3barFZNn,@function
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset %ebp, -8
movl %esp, %ebp
.cfi_def_cfa_register %ebp
jmp .LBB1_1
movb $1, %al
testb $1, %al
jne .LBB1_2
jmp .LBB1_4
jmp .LBB1_3
jmp .LBB1_1
popl %ebp
.cfi_def_cfa %esp, 4
.size _D6foobar3barFZNn, .Lfunc_end1-_D6foobar3barFZNn
.ident "ldc version 1.39.0"
.section ".note.GNU-stack","",@progbits
Adding the -O3 flag produces this assembly:
.file "main.d"
.section .text.foo,"ax",@progbits
.globl foo
.p2align 4, 0x90
.type foo,@function
.p2align 4, 0x90
jmp .LBB0_1
.size foo, .Lfunc_end0-foo
.section .text._D6foobar3barFZNn,"ax",@progbits
.globl _D6foobar3barFZNn
.p2align 4, 0x90
.type _D6foobar3barFZNn,@function
.p2align 4, 0x90
jmp .LBB1_1
.size _D6foobar3barFZNn, .Lfunc_end1-_D6foobar3barFZNn
.ident "ldc version 1.39.0"
.section ".note.GNU-stack","",@progbits
With optimizations enabled, the cli and hlt instructions from the inline assembly have been removed and the call to bar has been inlined.
It seems I can add the @optStrategy("none") attribute to foo as a workaround.
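A minimal sketch of what that workaround might look like. The @optStrategy attribute comes from ldc.attributes; the exact form of the asm statement (GCC-style string asm) and the position of the call to bar before it are assumptions based on the description in this thread, not the verbatim test case:

```d
import ldc.attributes;

// Ask LDC not to optimize this one function, so the code that follows
// the call to the noreturn function is left as written.
@optStrategy("none")
@naked extern (C) void foo()
{
    bar();
    // Assumed asm body: disable interrupts, then halt in a loop.
    asm { "cli\n1: hlt\njmp 1b"; }
}

noreturn bar()
{
    while (true) {}
}
```

Because the attribute is attached to foo only, the rest of the module can still be compiled with -O3.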
I doubt the asm itself is optimized; it seems to be optimized away simply because it comes after an infinite loop.
Yes, that seems to be the case.
Indeed, I think the asm code is optimized away because of the infinite loop or the noreturn annotation of bar.
To answer your question about the volatile equivalent, I think specifying "memory" as clobber for the asm code may do the trick (add : : : "memory" to your asm sequence). https://stackoverflow.com/questions/14449141/the-difference-between-asm-asm-volatile-and-clobbering-memory
Adding : : : "memory" did not help, unfortunately.
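For reference, a sketch of the clobber variant that was tried. The asm body and the placement of the call to bar are assumptions based on the test case description; only the : : : "memory" suffix is taken from the suggestion above:

```d
import ldc.attributes;

@naked extern (C) void foo()
{
    bar();
    // Empty output and input operand lists, with "memory" as the only clobber.
    // As reported in this thread, the asm is still removed at -O3.
    asm { "cli\n1: hlt\njmp 1b" : : : "memory"; }
}

noreturn bar()
{
    while (true) {}
}
```

The clobber tells the compiler that the asm may read or write memory, but it does not make code after a noreturn call reachable, which would be consistent with the result reported here.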
What do you expect with that infinite loop? Is there a clang equivalent where the asm is kept? I very much doubt so.
> Is there a clang equivalent where the asm is kept? I very much doubt so.
You are correct. Both Clang and GCC remove the inline assembly regardless of whether volatile and/or : : : "memory" is used. Clang even removed it without optimizations enabled when I used _Noreturn.
> What do you expect with that infinite loop?
It's part of an OS kernel. I'm not an expert in this subject but, as far as I understand, an interrupt can break/pause the infinite loop.
But I probably won't need the infinite loop anyway. I think I can close this issue. Thanks for the input.
Yeah, I guess the @optStrategy("none") workaround is the best option here (and sufficient).