exercism/x86-64-assembly

X86-64 Assembly Pangram: Empty solution passes all tests

Closed this issue · 8 comments

The non-op default asm file that comes with exercism download --exercise=pangram --track=x86-64-assembly is:

section .text
global is_pangram
is_pangram:
    ; Provide your implementation here
    ret

When unignoring all tests in pangram_test.c and then callling make, I receive:

$ make
pangram_test.c:57:test_empty_sentence:PASS
pangram_test.c:58:test_perfect_lower_case:PASS
pangram_test.c:59:test_only_lower_case:PASS
pangram_test.c:60:test_missing_the_letter_x:PASS
pangram_test.c:61:test_missing_the_letter_h:PASS
pangram_test.c:62:test_with_underscores:PASS
pangram_test.c:63:test_with_numbers:PASS
pangram_test.c:64:test_missing_letters_replaced_by_numbers:PASS
pangram_test.c:65:test_mixed_case_and_punctuation:PASS
pangram_test.c:66:test_case_insensitive:PASS

-----------------------
10 Tests 0 Failures 0 Ignored
OK

I investigated this a bit more and tried to set a return value explicitely:

section .text
global is_pangram
is_pangram:
    mov rax, 42
    ret

I found that every value set to rax other than 0 and 1 caused all tests to pass. 0 and 1 caused some tests to fail and some to pass, as I would expect.

My next idea was to address this in pangram_test.asm by replacing e.g. TEST_ASSERT_FALSE(is_pangram("")) with TEST_ASSERT(0 == is_pangram("")), but to no avail: Values other than 0 or 1 in rax still make that test pass.

iHiD commented

(cc @exercism/x86-64-assembly)

@fawick Thanks, that's interesting!

I'm not sure what the C standard says about type bool, but the compiler seems to only use the number 1 as true, and 0 as false. And the size is one byte.

Looking at the disassembly, we see the following:

Test using TEST_ASSERT_FALSE:

0000000000001177 <test_empty_sentence>:
    1177:   55                      push   rbp
    1178:   48 89 e5                mov    rbp,rsp
    117b:   48 8d 3d 86 2e 00 00    lea    rdi,[rip+0x2e86]        # 4008 <_IO_stdin_used+0x8>                                                                
    1182:   e8 99 02 00 00          call   1420 <is_pangram>
    1187:   83 f0 01                xor    eax,0x1
    118a:   84 c0                   test   al,al
    118c:   75 11                   jne    119f <test_empty_sentence+0x28>     
    118e:   be 10 00 00 00          mov    esi,0x10
    1193:   48 8d 3d 6f 2e 00 00    lea    rdi,[rip+0x2e6f]        # 4009 <_IO_stdin_used+0x9>
    119a:   e8 9c 23 00 00          call   353b <UnityFail>
    119f:   90                      nop
    11a0:   5d                      pop    rbp
    11a1:   c3                      ret

The return value is xored with 0x1 and the low byte is checked if zero. This means that if one of the low 8-bits are set to one, the test will pass. Since we will have garbage in rax, there is a good chance that one of those bits are one.

42 is 101010 in binary, xored with 0x1 gives 101011. The low byte is not zero, the test will pass.

257 is 100000001 in binary, xored with 0x1 gives 100000000. The low byte is zero and the test will fail.

Test using TEST_ASSERT_TRUE:

00000000000011a2 <test_perfect_lower_case>:
    11a2:   55                      push   rbp
    11a3:   48 89 e5                mov    rbp,rsp
    11a6:   48 8d 3d 75 2e 00 00    lea    rdi,[rip+0x2e75]        # 4022 <_IO_stdin_used+0x22>
    11ad:   e8 6e 02 00 00          call   1420 <is_pangram>
    11b2:   84 c0                   test   al,al
    11b4:   75 11                   jne    11c7 <test_perfect_lower_case+0x25> 
    11b6:   be 15 00 00 00          mov    esi,0x15
    11bb:   48 8d 3d 7b 2e 00 00    lea    rdi,[rip+0x2e7b]        # 403d <_IO_stdin_used+0x3d>
    11c2:   e8 74 23 00 00          call   353b <UnityFail>
    11c7:   90                      nop    
    11c8:   5d                      pop    rbp
    11c9:   c3                      ret

The low byte of rax is checked for zero.

42 is 101010 in binary. The low byte is not zero, the test will pass.

256 is 100000000. The low byte is zero and the test will fail.

I don't know if this is a problem or not. Sure, it's not right that the tests are passing with an empty solution, but I don't think anyone would believe they have actually solved the exercise correctly. What do you think @iHiD?

Changing the return type to int instead of bool would make the tests fail for an empty solution.

Edit: I guess this answers the question about the type bool: https://godbolt.org/z/l8toBZ

It would also be possible to call a dummy function for clearing rax before running the students solution.

NobbZ commented

I'm not sure what the C standard says about type bool

At the university they said, that according to the standard there is no bool, but everything not 0 should be considered true.

My assumption is that this is actually related to the history of C, as it was created as a portable assembly kind of.

And most if not all conditional branching in Assembler is based on (in) equality to zero as well.

@NobbZ Yes, most of the time a non-zero value is considered to be true. However, in C99, the type _Bool was introduced, which is what is being used here.

According to https://en.cppreference.com/w/c/language/arithmetic_types#Boolean_type:

capable of holding one of the two values: 1 and 0

In this case, the compiler is counting on it being one byte in size and holding the value 1 or 0.

kotp commented

in C99, the type _Bool was introduced,
They had to go make things complicated, didn't they? 😜

I don't know if this is a problem or not. Sure, it's not right that the tests are passing with an empty solution, but I don't think anyone would believe they have actually solved the exercise correctly.

I agree with that. However, I wonder whether I can actually rely on having all tests pass as an indicator for successfully solving the exercise or whether it is by mere accident (e.g. my getting the return value in rax wrong).

Good point! I will fix this by switching to int as the return type. There are probably other exercises with the same problem.