rust-lang/rust-by-example

22.1 Inline Assembly

1000oaks opened this issue · 2 comments

Hi, I have been reading Rust by Example recently.
I think there is an issue with the code example in the "Late output operands" section of 22.1 Inline Assembly:

use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
let c: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        "add {0}, {2}",
        inout(reg) a,
        in(reg) b,
        in(reg) c,
    );
}
assert_eq!(a, 12);

Here, inout(reg) a should be changed to inlateout(reg) a, so that the code example agrees with the text that follows it: "but if you want optimized performance (release mode or other optimized cases), it could not work.".

Thanks!

Hi, I had a similar thought, but if you read on in the example, you will see:

"However it must allocate a separate register for a since it uses inout and not inlateout."

...which suggests the use of inout was intended. Though, this contradicts the earlier statement:

"...but if you want optimized performance (release mode or other optimized cases), it could not work."

...which suggests the use of inlateout was intended.

So, I think the wording around the example has to be made consistent: either it should all correspond to inout(reg) a, or the example should be changed to use inlateout(reg) a and the wording adjusted to match.

I would have to defer to someone who understands the assembly here better to make the change. In particular, when I tried running the example with inlateout(reg) a, it resulted in a = 16 (in release mode); however, I'm not 100% sure why that is. (If someone could help explain why, step by step, I'd be happy to submit a PR to try and make the wording and example consistent.)

After further research, I believe I understand why release mode gives a = 16 when using inlateout(reg) a. With inlateout(reg) a and optimizations enabled, presumably only one register is allocated for all three variables (a, b, and c all hold the value 4). The first add instruction then computes 4 + 4 = 8, the second add instruction becomes 8 + 8 = 16, and this output is then bound to a.
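
To make the arithmetic concrete, here is a rough plain-Rust model of the two add instructions. It is only a sketch and not real inline assembly: the register file is an ordinary slice, and the function names (run_adds, separate_registers, shared_register) as well as the register indices are made up for illustration. What the compiler actually allocates is up to its register allocator and may differ.

```rust
/// Toy model of the two `add` instructions from the example.
/// `regs` stands in for a register file; `a`, `b`, and `c` are the
/// (hypothetical) register indices chosen for each operand.
fn run_adds(regs: &mut [u64], a: usize, b: usize, c: usize) -> u64 {
    // Models `add {0}, {1}`: read the source register, then update the destination.
    let rhs = regs[b];
    regs[a] = regs[a].wrapping_add(rhs);

    // Models `add {0}, {2}`: if `c` aliases `a`, this reads the already-updated value.
    let rhs = regs[c];
    regs[a] = regs[a].wrapping_add(rhs);

    regs[a]
}

/// Models `inout(reg) a`: a, b, and c each get their own register.
fn separate_registers() -> u64 {
    let mut regs = [4u64, 4, 4];
    run_adds(&mut regs, 0, 1, 2)
}

/// Models the presumed optimized `inlateout(reg) a` case:
/// a, b, and c all share a single register holding 4.
fn shared_register() -> u64 {
    let mut regs = [4u64];
    run_adds(&mut regs, 0, 0, 0)
}
```

In this model, separate_registers() returns 12 (4 + 4 + 4), while shared_register() returns 16 (4 + 4 = 8, then 8 + 8 = 16), which matches the value I saw in release mode.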

Hence, I have submitted PR #1766 to try to fix the wording.