capstone-rust/capstone-rs

Inconsistency with operand string

Closed this issue · 3 comments

I've been experimenting with Capstone in Rust and have come across a small inconsistency with the operand printing for immediate values.

This snippet disassembles a single mov instruction using AT&T syntax configuration.

let result = capstone.disasm_all(&[0x49, 0xc7, 0xc0, 0x01, 0x00, 0x00, 0x00], 0x1000).expect("Failed to disassemble.");

let instruction = result.first().expect("Should contain at least one instruction.");

println!("{}", instruction);

This gives the following result. Notice the immediate value supplied to mov is represented in hexadecimal.

0x1000: movq $0x1, %r8

If I change the immediate value to -1 instead of 1:

let result = capstone.disasm_all(&[0x49, 0xc7, 0xc0, 0xff, 0xff, 0xff, 0xff], 0x1000).expect("Failed to disassemble.");

let instruction = result.first().expect("Should be at least one instruction.");

println!("{}", instruction);

... then the result is ...

0x1000: movq $18446744073709551615, %r8

In the second example the value is printed in decimal rather than hexadecimal. I would have expected the result to be:

movq $0xFFFFFFFFFFFFFFFF, %r8
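For reference, both renderings describe the same 64-bit pattern: the 32-bit immediate ff ff ff ff in the encoding is sign-extended to -1, and -1 reinterpreted as an unsigned 64-bit integer is 18446744073709551615. Below is a small standard-library-only check of that arithmetic. The helper name `sign_extend_imm32` is my own and is not part of Capstone or capstone-rs.

```rust
/// Hypothetical helper (not a Capstone API): sign-extend the little-endian
/// imm32 that follows the opcode and ModRM bytes to 64 bits.
fn sign_extend_imm32(bytes: [u8; 4]) -> i64 {
    i32::from_le_bytes(bytes) as i64
}

fn main() {
    // The last four bytes of 49 c7 c0 ff ff ff ff.
    let imm = sign_extend_imm32([0xff, 0xff, 0xff, 0xff]);

    println!("{}", imm); // signed view: -1
    println!("{}", imm as u64); // unsigned decimal: 18446744073709551615
    println!("{:#x}", imm as u64); // unsigned hex: 0xffffffffffffffff
}
```

So the decimal output is numerically correct; only the choice of base differs from the first example.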

I'm using Keystone alongside Capstone which has helped me narrow down this particular issue. The majority of code I've disassembled and reassembled so far works well.

If I switch the syntax to Intel, the result is correct:

0x1000: mov r8, 0xffffffffffffffff

It looks like I have a workaround as the issue is only present for AT&T mode.

Thanks for filing the issue! The heuristic that decides whether an immediate is printed in hex or decimal lives in the Capstone C library, not in the Rust bindings, and is based on HEX_THRESHOLD. For example:

https://github.com/aquynh/capstone/blob/c72fc8185ed4088c3486f621d150fbcf5f980aa0/arch/X86/X86ATTInstPrinter.c#L695-L698
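To illustrate how a threshold heuristic of this kind can end up printing a negative immediate as a large decimal number, here is a rough sketch in Rust. It is not a translation of the linked C code: the function name `format_imm`, the branch structure, and the example threshold of 9 are all assumptions made for illustration, so check the upstream source for the real logic.

```rust
/// Hypothetical sketch of a threshold-based immediate printer (AT&T style).
/// `hex_threshold` is a parameter here; the real constant is defined upstream.
fn format_imm(imm: i64, hex_threshold: i64) -> String {
    if imm > hex_threshold {
        // Positive values above the threshold are printed in hex.
        format!("${:#x}", imm)
    } else {
        // Everything else is printed as unsigned decimal, so a negative
        // immediate shows up as its two's-complement value.
        format!("${}", imm as u64)
    }
}

fn main() {
    // 9 is an arbitrary example threshold, not a confirmed upstream value.
    println!("{}", format_imm(0x20, 9)); // $0x20
    println!("{}", format_imm(-1, 9)); // $18446744073709551615
}
```

In this sketch the comparison is signed but the decimal branch prints unsigned, which is one way output like the one reported above could arise.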

If you want to get a feature added, or tweak the logic, I recommend filing an issue with the upstream Capstone library.

Ah, I wasn't sure before filing the ticket whether this was in the C implementation or the Rust bindings. If I try the same thing in a C project, the result is a little different. Because of that, I assumed it was a type conversion issue in the bindings.

0x1000:	movq		$-0x10, %r8

Thanks for the info! 👍🏻