Gekkio/mooneye-gb

Request for details on a test comment (about OAM DMA)

mehcode opened this issue · 2 comments

I'm having trouble understanding something I read in one of your timing tests that deals with OAM DMA.

https://github.com/Gekkio/mooneye-gb/blob/master/tests/acceptance/call_cc_timing.s#L83-L86

  ; the first two bytes of CALL nn will be at $FDFE, so
  ; the high byte of nn is at the first byte of OAM during testing
  ; [...]
  ; the memory read of nn is aligned to happen exactly one cycle
  ; before the OAM DMA end, so high byte of nn = $FF
  ; therefore the call becomes:
  ;   call c, $ffca

From what I understand, the CPU is only able to access HIRAM during OAM DMA, so wouldn't that turn into RST $38 ($FF) instead?


Another piece of that test that is confusing to me is the per-cycle timings of CALL.

; CALL cc, nn is expected to have the following timing:
; M = 0: instruction decoding
; M = 1: nn read: memory access for low byte
; M = 2: nn read: memory access for high byte
; M = 3: internal delay
; M = 4: PC push: memory access for high byte
; M = 5: PC push: memory access for low byte

You state above that the low byte is read first. That seems to go against what the test expects, if it's expecting the low byte to be read correctly and the high byte to be read as $FF.

> CPU is only able to access HIRAM during OAM DMA

This is a very common myth, and the truth is a lot more complicated. HIRAM is the only area which the CPU can access without any chance of conflicts. Other memory areas are still accessible but you might get a bus conflict and end up doing something unexpected. The basic idea is this: the Game Boy has several buses, and if the CPU performs a read from a bus where the OAM DMA is doing stuff, OAM DMA wins. For example, if the CPU wants to read from $2000, and OAM DMA wants to read from $3000, both will see the byte from $3000. I'm still researching this, and will publish full details once I've got a more complete understanding of this topic.
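The "OAM DMA wins" rule can be sketched in a few lines of Python. This is only an illustrative model, not the actual hardware behaviour: the `bus_of` partition, the bus names, and the `cpu_read` helper are all assumptions made up for this example, and the locked-down OAM area itself is not modelled here.

```python
def bus_of(addr):
    """Hypothetical bus partition, assumed for illustration only."""
    if 0x8000 <= addr <= 0x9FFF:
        return "vram"
    if addr <= 0x7FFF or 0xA000 <= addr <= 0xFDFF:
        return "external"
    return "internal"  # HIRAM and friends: treated as conflict-free here


def cpu_read(memory, cpu_addr, dma_addr=None):
    """Return the byte the CPU sees when reading cpu_addr.

    dma_addr is the address OAM DMA is reading in the same cycle,
    or None when no OAM DMA is running.
    """
    cpu_bus = bus_of(cpu_addr)
    if dma_addr is not None and cpu_bus != "internal" and cpu_bus == bus_of(dma_addr):
        # Both accesses share a bus, so OAM DMA wins and the CPU
        # sees the byte that the DMA is reading.
        return memory[dma_addr]
    return memory[cpu_addr]


memory = bytearray(0x10000)
memory[0x2000] = 0xAA
memory[0x3000] = 0xBB
memory[0xFF80] = 0xCC

conflicted = cpu_read(memory, 0x2000, dma_addr=0x3000)  # CPU sees the DMA's byte
hiram = cpu_read(memory, 0xFF80, dma_addr=0x3000)       # no conflict in this model
```

In this model the CPU read from $2000 returns the byte stored at $3000, matching the example above, while the HIRAM read is unaffected.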

> You state above that the low byte is read first.

Let's talk about the instruction CALL $1234, which consists of the following bytes: CD 34 12. Note that in my terminology, 34 is the "low byte" because it's the lower part of the target address.

Now, in the first test case the instruction is written to $FDFE before triggering OAM DMA, so the memory layout is this:

$FDFE: CD
$FDFF: 34
$FE00: 12

$FDFE and $FDFF are always accessible because I'm intentionally avoiding OAM DMA bus conflicts. $FE00 on the other hand is not, because the entire OAM area is locked down during OAM DMA. Based on the per-cycle timings of CALL, we know that the CPU will read the bytes in order: CD, 34, 12. And the last byte (= "high byte" of the target address) 12 will be replaced with FF if the OAM DMA is still running.
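As a rough sketch of that reasoning, the Python below fetches the three bytes in order and returns $FF for any OAM read made while OAM DMA is still running. The helper names and the `dma_cycles_left` parameter (M-cycles of DMA remaining, counted from the opcode fetch) are hypothetical, the exact cycle at which DMA releases OAM is an assumption, and bus conflicts outside OAM are ignored.

```python
OAM_START, OAM_END = 0xFE00, 0xFE9F


def read(memory, addr, dma_active):
    """CPU read; OAM is assumed to read as $FF while OAM DMA is active."""
    if dma_active and OAM_START <= addr <= OAM_END:
        return 0xFF
    return memory[addr]


def fetch_call_target(memory, pc, dma_cycles_left):
    """Fetch opcode, low byte and high byte on M-cycles 0, 1 and 2."""
    opcode = read(memory, pc, dma_cycles_left > 0)               # M = 0
    low = read(memory, (pc + 1) & 0xFFFF, dma_cycles_left > 1)   # M = 1
    high = read(memory, (pc + 2) & 0xFFFF, dma_cycles_left > 2)  # M = 2
    return opcode, (high << 8) | low


memory = bytearray(0x10000)
memory[0xFDFE] = 0xCD  # CALL nn opcode
memory[0xFDFF] = 0x34  # low byte of nn
memory[0xFE00] = 0x12  # high byte of nn, first byte of OAM

# DMA still running during the high byte read: target becomes $FF34.
during_dma = fetch_call_target(memory, 0xFDFE, dma_cycles_left=3)
# DMA finished one cycle earlier: the real high byte is read, target is $1234.
after_dma = fetch_call_target(memory, 0xFDFE, dma_cycles_left=2)
```

The low byte always comes through intact because $FDFF is outside OAM; only the high byte at $FE00 depends on whether the DMA has ended.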

Thanks for the awesome level of detail.

The bus conflict stuff is interesting. I wonder if there are any games that inadvertently require this.

As for the low/high byte: yeah, it makes perfect sense now. I was probably just staring at it for too long. I need to learn to take more breaks.

I didn't mean to ask you twice. Sorry about that.