emu-russia/dmgcpu

Test the cc check circuit

Closed this issue · 11 comments

There are reports that the cc check circuit (used by RET cc, JP cc, etc.) is not working correctly.
A test ROM is needed for testing and for visual inspection of the signals.
The results should then be processed and used to supplement the description on the wiki (the cc_check circuit is assumed to be located in the ALU random logic and associated with the ALU_Out1 signal, which goes to the sequencer).

@ogamespec I don't think there is an error in the ALU logic, at least not as far as I have checked. The condition is checked in az[11], and then passed to azo[11] and then to ALU_Out1. When ALU_Out1 is low, the branch is taken. CLK6 gates the connection between az[11] and azo[11].

The problem appears to be the slight phase delays between the CLK* signals in the dmg-sim simulator. CLK6 goes low 28ns before the CLK2 positive edge, which I believe triggers the state change in the Sequencer. So the Sequencer always sees ALU_Out1 as low and always takes the branch.

Running dmg-sim with the delays zeroed out makes it work.
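
To make the race concrete, here is a minimal timing sketch in Python. It is not taken from dmg-sim: the function names, the single falling edge, and the 100 ns edge time are hypothetical, and it assumes that ALU_Out1 reflects the condition only while CLK6 is high and reads low afterwards, as described above.

```python
# Hypothetical timing model of the race; all names and times are illustrative.

def clk6_high(t_ns, clk6_fall_ns):
    """CLK6 is high up to and including its falling edge time.

    Assumption: a sample taken exactly at the edge still sees the old level,
    as a flip-flop would in a zero-delay simulation.
    """
    return t_ns <= clk6_fall_ns


def alu_out1(t_ns, clk6_fall_ns, condition_false):
    """Level of ALU_Out1 at time t_ns.

    While CLK6 is high the output follows the condition check
    (high = condition false, low = branch taken).
    Once CLK6 is low the output reads low regardless of the condition.
    """
    if clk6_high(t_ns, clk6_fall_ns):
        return 1 if condition_false else 0
    return 0


def branch_taken(clk2_edge_ns, clk6_fall_ns, condition_false):
    """The sequencer samples ALU_Out1 at the CLK2 rising edge; low means taken."""
    return alu_out1(clk2_edge_ns, clk6_fall_ns, condition_false) == 0


# CLK6 falls 28 ns before the CLK2 edge: a false condition is still "taken".
print(branch_taken(100, 72, condition_false=True))
# Delays zeroed out: both edges coincide and the false condition is respected.
print(branch_taken(100, 100, condition_false=True))
```

In this model the sampled value depends only on whether CLK6 is still high at the CLK2 edge, which is why either removing the skew or holding the value past the CLK6 falling edge makes the check behave.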

I am currently tracing the clock paths to find the origin of the delay. Counting from their common ancestor, CLK2 appears to pass through 12 more entities than CLK6, which explains the delay (each entity adds 2ns or more).

I will try applying a HACK delay to CLK6 to see if everything works, and then I'll make a PR with the temporary fix.

image

In the image, you can see ALU_Out1 (due to CLK6) going low before the state change (which I think is due to CLK9, not CLK2 as I said before, so the delay is actually 22ns).

cc_check

I checked the schematics and topology; everything looks correct. I updated the corresponding section on the wiki (alu.md) a bit.
As for the CLK delay spacing, I think it's some kind of devil :) imho it's easier to stretch the levels wider so that the circuits have time to settle. After all, we are dealing with hybrid logic (half latches, half DFFs), so I have little idea whether the current design can be synthesized in hardware (FPGA).

In any case, arranging delays is beyond my expertise (I'm at the developmental level of "connect A and B as a netlist and be happy") 😃

Test ROM & waves:

// Check the part of the circuit that evaluates the condition code ("cc check").
// RET cc instructions are used for this, as they are the most convenient to verify.
// Every cc check is deliberately false so that code flow is not interrupted.

00    // nop
3e 00 // ld a, 0   <-- a = 0
3c    // inc a     <-- a = 1  (ZF=0)
c8    // ret z     <-- return if ZF == 1
3d    // dec a     <-- a = 0  (ZF=1)
c0    // ret nz    <-- return if ZF == 0
37    // scf       <-- CF = 1
d0    // ret nc    <-- return if CF == 0
3f    // ccf       <-- CF = 0
d8    // ret c     <-- return if CF == 1
76    // halt
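
As a sanity check on the expected behaviour, the ROM above can be stepped through with a tiny interpreter. This is only a sketch covering the opcodes used here; the `run` helper and the initial flag values (ZF=0, CF=0) are assumptions, and a taken RET is merely recorded because no stack is modelled.

```python
# Minimal sketch: interpret only the opcodes used by the test ROM above.
ROM = bytes([0x00, 0x3E, 0x00, 0x3C, 0xC8, 0x3D, 0xC0,
             0x37, 0xD0, 0x3F, 0xD8, 0x76])

# RET cc opcodes mapped to (mnemonic, predicate on the ZF and CF flags).
RET_CC = {
    0xC8: ("ret z", lambda zf, cf: zf == 1),
    0xC0: ("ret nz", lambda zf, cf: zf == 0),
    0xD0: ("ret nc", lambda zf, cf: cf == 0),
    0xD8: ("ret c", lambda zf, cf: cf == 1),
}


def run(rom):
    """Return a list of (mnemonic, taken) for every RET cc that is reached."""
    a, zf, cf, pc = 0, 0, 0, 0  # assumed initial state
    results = []
    while True:
        op = rom[pc]
        pc += 1
        if op == 0x00:        # nop
            pass
        elif op == 0x3E:      # ld a, d8 (flags unchanged)
            a = rom[pc]
            pc += 1
        elif op == 0x3C:      # inc a (updates ZF, leaves CF alone)
            a = (a + 1) & 0xFF
            zf = int(a == 0)
        elif op == 0x3D:      # dec a (updates ZF, leaves CF alone)
            a = (a - 1) & 0xFF
            zf = int(a == 0)
        elif op == 0x37:      # scf
            cf = 1
        elif op == 0x3F:      # ccf
            cf ^= 1
        elif op in RET_CC:    # ret cc: record the decision, do not return
            name, cond = RET_CC[op]
            results.append((name, cond(zf, cf)))
        elif op == 0x76:      # halt
            return results
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")


for name, taken in run(ROM):
    print(name, "taken" if taken else "not taken")
```

All four checks come out as not taken, so on a correct circuit the waves should show execution falling through every RET cc and reaching the halt.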

image

Also, I found this note in ALU.v:

	// Dynamic part
	// TBD: Check if it is necessary to add transparent DLatch for dynamic logic outputs (on inverter gates) or if this will do.

	assign azo[0] = CLK2 ? az[0] : 1'b1;
	assign azo[1] = CLK7 ? (CLK6 ? az[1] : 1'b1) : 1'b1;		// -> bc5
	assign azo[2] = CLK7 ? (CLK6 ? az[2] : 1'b1) : 1'b1;		// -> bc1
	assign azo[3] = CLK2 ? az[3] : 1'b1;

Technically, if you have dead time between CLKs in your simulation, a DLatch should save the day. I'll look at the topology some more and will most likely add it.
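
A small Python sketch of the difference, for illustration only. `dynamic_gate` mirrors the `CLK ? az : 1'b1` assigns above, while `DLatch` is a generic transparent latch. Enabling the latch with the same clock is an assumption, not necessarily how the chip or the eventual fix wires it.

```python
# Illustrative comparison: bare dynamic gate vs. a transparent latch.

def dynamic_gate(clk, az):
    """Mirror of `assign azo = CLK ? az : 1'b1;`: reads 1 whenever CLK is low."""
    return az if clk else 1


class DLatch:
    """Transparent latch: follows d while enabled, holds the last value otherwise."""

    def __init__(self, initial=1):
        self.q = initial

    def update(self, enable, d):
        if enable:
            self.q = d
        return self.q


def trace(samples, use_latch):
    """Evaluate a sequence of (clk, az) samples and return the output levels."""
    latch = DLatch()
    out = []
    for clk, az in samples:
        if use_latch:
            out.append(latch.update(clk, az))
        else:
            out.append(dynamic_gate(clk, az))
    return out


# CLK high with az=0, then two samples of dead time, then CLK high with az=1.
samples = [(1, 0), (0, 0), (0, 0), (1, 1), (0, 1)]
print(trace(samples, use_latch=False))  # [0, 1, 1, 1, 1]
print(trace(samples, use_latch=True))   # [0, 0, 0, 1, 1]
```

Without the latch the result is lost as soon as the clock drops; with it, the value is held through the dead time, so a consumer sampling slightly later still sees it.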

Actually, the fix is better applied upstream, delaying the signals CLK3, CLK4, CLK5, and CLK6 by 22 ns. But I will still make a PR here with the debug signals and some comments I added, at least.

The paths I traced for the CLK* signals are as follows (critical path only; some signals are NANDed with other intermediate clock signals):

- CLK9                 <- BOGA <- BALY
- CLK1         <- AWOB <- BOGA <- BALY
- CLK2 <- BEDO <- BYXO <- BUVU <- BALY <- BYJU <- BELE <- BUTO <- BAZE <- BELO <- BANE <- BEJA <- BOLO <- BUFA <- BERU <- BAPY

- CLK3 <- BEKO <- BUDE <- BIRY <- BELU <- ~ATYP & CLK_ENA
- CLK4 <- UVYT <- BUDE

- CLK5 <- BOLO <- BUFA
- CLK6         <- BUFA <- BERU <- BAPY

BAPY <- ~(ATYP | AROV | ~CLK_ENA) = ~ATYP & ~AROV & CLK_ENA
ATYP <- AFUR
AROV <- APUK
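
The depth difference between the CLK2 and CLK6 paths can be counted mechanically. The sketch below is illustrative only: it lists each path as traced above, cut off at BUFA (the first cell the two paths share), and uses the 2 ns per-cell figure mentioned earlier as a lower bound. The helper name and constants are hypothetical.

```python
# Cells on each path from the clock output back to the first shared cell (BUFA).
CLK2_PATH = ["BEDO", "BYXO", "BUVU", "BALY", "BYJU", "BELE", "BUTO",
             "BAZE", "BELO", "BANE", "BEJA", "BOLO", "BUFA"]
CLK6_PATH = ["BUFA"]

MIN_CELL_DELAY_NS = 2  # "each entity has 2ns or more"


def cells_before_common(path, other):
    """Number of cells on `path` before the first cell that also appears on `other`."""
    other_cells = set(other)
    for index, cell in enumerate(path):
        if cell in other_cells:
            return index
    raise ValueError("paths share no cell")


extra_cells = (cells_before_common(CLK2_PATH, CLK6_PATH)
               - cells_before_common(CLK6_PATH, CLK2_PATH))
min_skew_ns = extra_cells * MIN_CELL_DELAY_NS

print(extra_cells)   # 12
print(min_skew_ns)   # 24
```

With these inputs CLK2 sits 12 cells deeper than CLK6, so the simulated skew between them is at least 24 ns under the per-cell figure above.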

@msinger Do you have any idea how accurate the delays of these signals are? Do you think just delaying CLK3, CLK4, CLK5, and CLK6 by 22ns is a valid fix?

I'll add DLatches tomorrow in the places where they are in the actual chip. It won't make things worse :)
At the same time, we will check whether the delays are needed or whether it works without them.

@Rodrigodd Please, try #271 + #272 without CLK delays :)

image

It so happens that the asymmetric CLK6/CLK7 are used for flags and cc_check, so if you put a DLatch on the output of these random logic trees, it "extends" the result as needed (see picture).

@Rodrigodd, I just pulled those delays out of my butt. I guessed them based on the size of the cells. Once I have created the layouts for all cells, I will change them based on the number of transistors that are in series. The simulation currently doesn't recreate the glitches I see on the ~RD signal of the real device. And there is some volume envelope glitch in the APU that can happen on the real device, which doesn't happen in the simulation. I forgot how exactly this works though. This also indicates that the delays are wrong.
I don't know what happens if you just delay all clocks by some fixed value.

> I guessed them based on the size of the cells.

Yeah, but I think it is unlikely that 3 cells have a bigger delay than 15 cells, which is what my fix was relying on.

> Please, try #271 + #272 without CLK delays

This one is a much more sound solution. Tested it on dmg-sim without the delays and it is now working:

image

Thanks again @ogamespec! I believe this issue is now fixed.

Let's move on :)