chili-chips-ba/openCologne

Improper UART synthesis

Closed this issue ยท 15 comments

When implementing a simple UART test, sending 2 RTL hard-coded bytes every 1.6 ms on tx, strange results were observed: Some combinations of bytes transmit correctly, some don't. Important notes: rx functionality works fine, tx of one byte works fine.
Example of bytes which produce good transmission:
example_good_exec
Example of bytes which produce incorrect transmission:
example_wrong_exec

After delving deeper into the problem the following has been concluded: RTL simulation goes well, Synthesis netlist simulation fails.
RTL simulation has been verified with both Verilator and Icarus Verilog, while post-synthesis simulation has been done only using Icarus Verilog.
RTL_sim_verilator
Post-synthesis simulation fail also produces a uart_state which goes unknown after the reset (not supposed to happen).
post_syn_sim

Provided is a synthesis log file which has one ABC warning: synth.log

Here you can find a full test folder structure, including the python script for reading from the FPGA UART
uart_test_openCologne.zip.

... let's also add that this UART design is hardware-proven in multiple applications, using both Gowin and Xilinx silicon, implemented using their respective proprietary tools.

We wonder if this Yosys issue has to do with FSM recoding:

image

@tarik-ibrahimovic , can you rerun Yosys with auto-FSM recoding disabled? In any case, report this SIM-SYNTH discrepancy to Yosys

Disabling FSM recoding yields in a different but still non-functioning result
image
image

The fact that we are getting different result for a change in Yosys options is a good data point, and indicative in itself.

@pu-cc, could you please take a quick look at the test package that Tarik prepared, reproduce this problem, and provide your expert insights. Note the availability of both pre- and post- synthesis simulation.

Tarik is in the meantime trying to instrument this design with your recently announced ILA...

From what I see, this issue also affects the synth_xilinx pass, but synth_ecp5 is fine.

I took a closer look and found inconsistencies in the mapping of the two registers tick_1us_reg and uart_buffer_state. In both cases, mapping of the cell type $_DFFE_xxxx_ to CC_DFF led to an incorrect initial value. Commit pu-cc/yosys@2c25ece should fix it. Simulation was fine after synthesis. I will definitely have to run some more tests, but can you please check this, too? Thank you.

Synthesis is fine now, but PnR isn't. Take a look at the two simulations (post-PnR on top and post-synthesis on bottom). Post-synthesis simulation works well now, it's matching the RTL sim. Run impl_sim in 2.sim to see post PnR and compare. Also provided is the implementation and synthesis log file.
image
impl.log
synth2.log

@pu-cc, Tarik has repeated his testing and provided the following statement:

This is to confirm that my test was already done with the latest Yosys gatemate-regmap-fix and CC PnR tool, Version 4.2, from May 1 2024. Nevertheless, I have rebuilt them and tried everything again. The bad news is that I've got the same results. From what we can tell, gatemate-regmap-fix has indeed fixed the synthesis issue. RTL and post-synth simulation now match, which was not the case before. In the provided example, UART is simply sending 0xABCD repeatedly on the TX pin. I'm pointing this out as a response to "... but instead I see that my post-synth results already look different from yours...". Could you please check it on your side too?!

In the figures below, top to bottom, we have:

  1. RTL sim
  2. Post-Synth sim
  3. post-PNR sim

As we can see, RTL sim matches the post-Synth sim but it doesn't match the post-PNR sim. Just in case, I tested the generated bitstream on hardware (second figure) and the results were consistent with the post-PNR sim:

  RTL sim  = Post-Synth sim --> 0xABCD on UART Tx line
  Post-PNR sim = Hardware --> 0xABEF on UART Tx line

Everything seems fine at first glance, but I'll get back to you after more thorough testing. For now, there is an issue with the impl_sim and synth_sim not running, even though the hardware implements correctly.

Heres the error message from iverilog:

/home/tibrahimovic/0.git-repo/cc-toolchain-linux/cc-toolchain-linux/bin/p_r/cpelib.v:1793: warning: always process may not have any delay.
/home/tibrahimovic/0.git-repo/cc-toolchain-linux/cc-toolchain-linux/bin/p_r/cpelib.v:1793: A runtime infinite loop may be possible.

This is an iverilog warning that was activated by '-Winfloop' in the IVLFLAGS variable in your Makefile. You can deactivate it again by omitting it.

We should strive to keep this check in place, as it adds value.

  • Any room for enhancing the cpelib.v not to trigger this warning?

The warning can get sorted out with replacing the cpelib.v always block with an initial one:

/* 	
always begin
   #(clk_period/4.0) baseclk = ~baseclk;
end 
*/

initial begin
   baseclk = 0;
   forever begin
      #(clk_period/4.0) baseclk = ~baseclk;
   end
end

But the post-pnr simulation doesn't run (image below, sim stays at 0 ticks), nor show any apparent errors with the new cpelib.v, even when the design doesn't use the PLL. Post-synth simulation is fine.
image

This may be an Icarus Verilog problem, because Verilator is running ok. But since Icarus is the default simulator in the prebuilt binaries it would be best for it to also function properly.

Edit: On other projects even the post-synth sim doesn't work in Icarus, neither in Verilator, again the simulation doesn't advance. Again I stumbled upon a RTL-hardware inconsistencies which I wanted to take a closer look at but now is extremely challenging.

I have doubts about your statement that the post-implementation simulation in verilator is OK. It has the same issue as icarus:

The PLL model expects at least the OUT_CLK parameter. Please keep in mind that this value must be a real data type, not a string. This may be the reason why it does not work in some designs if you have specified it as a string.

The post-implementation netlist still contains the actual configuration vector, without info on the OUT_CLK value. I could see if P&R could also write this parameter into the netlist. This should result in proper functioning of the PLL model during post-implementation simulation.

...
.PLL_CFG (96'h01_CB_01_10_64_00_04_05_08_0A_20_82)
...

So, it will not work in any simulator. If you'd like to test it yourself, you can enter OUT_CLK manually as a test.

The original test design you provided did not have any PLL at all. Please verify with these sources and confirm that the actual error of #17 has been fixed. We'll discuss about the PLL in another thread.

Yes, the issues in Synthesis and Place and Route addressed here have been fixed, I am now closing this issue.