Undefined behavior in synthesis depending on the bit width of variables
Opened this issue ยท 17 comments
In the provided example if you change the bit width of counter
from 6 to any other number the design stops working, UART starts falling out of sync and the reads from the memory are incorrect.
This example is supposed to write to a random location in a PCB (Olimex) mounted PSRAM, a random set of bytes, then read from the same location.
##How to recreate this issue
Run build by typing
cd 3.build
make hw_all
You can see the results of writing to memory and reading from memory by running the python script, located in 4.testing, after building and uploading the bitstream. You should see something like this:
Then change the bit width of counter on line 73 where the comment is:
logic [5:0] counter; //**change this to 5 or 4 bits, stops working although the MSB isn't used for anything
and run the python script again, you should see something like this:
Since there is no PSRAM verilog model in this particluar case, all the things that could have been verified through simulation have been. Nevertheless, changing the bit width shouldn't in any way have an impact like this one
I don't have a board right now to test it, but you should make sure to reset all counter
bits, not only counter[0]
:
That line will reset all counter bits, irrespective of how many they are.
The resulting procedural block after conversion to top.v
is counter <= 1'sb0;
:
always @(posedge clk_out or negedge arst_n)
if (arst_n == 1'b0)
counter <= 1'sb0;
else if (counter == 9) begin
tick_1us_reg <= 1'b1;
counter <= 0;
end
else begin
tick_1us_reg <= 1'b0;
counter <= counter + 1;
end
Synthesis seems to interpret this correctly. I didn't check this at first.
Then please provide a suitable test bench so that we can find what is supposed to be wrong.
This issue can be attributed to the precision of the instantiated UART, latest commit solves it with increasing the precision of the UART. In the same repo folder, there is now a PSRAM simulation model which you can use to verify the operations of the design.
If you take a look at RTL simulation and post-PnR simulation there is no difference, but for some reason the timing of the uart_tx
has been changed when changing the bit width - this is deduced by observing the read data on the serial port, which is almost always halfway correct when changing bit widths. This may be a consequence of a deeper issue mentioned here.
If you take a look at RTL simulation and post-PnR simulation there is no difference, [...]
It seems that some things are still missing so that I can simulate it properly. Can you please check that the files are up to date? The makefile has no targets for any simulation. The testbench complains about conflicting drivers on the inout ports. I'd like to see the same results in the simulation first, as you do, then I can get a better picture of it. Thanks!
I just pulled the repo clean and run all the steps listed here to simulate. Here's what I got for the RTL, post-synth, and post-PnR sim, in the listed order, top to bottom. Essentially the uart_tx transmits 0x0708
if it runs successfully.
If you encounter no simulation targets in the makefile that is because the simulation is done in 2.sim
as listed in the steps, and 3.build
is just for synthesis and PnR.
Also the simulations are now all done in Verilator, and iverilog is not currently supported, which solves the "conflicting drivers on the inout ports".
Also the simulations are now all done in Verilator, and iverilog is not currently supported, which solves the "conflicting drivers on the inout ports".
Why change the simulator to avoid the error messages?
For a more detailed analysis/co-simulation on our side, at least the post-synthesis and post-implementation netlists should be simulatable by iverilog. I need its features such as timing annotation.
@pu-cc any experience or example you can offer about simulating inout
ports and multi-drop, tri-state buses with iverilog
?
Tarik's example above simulates fine with commercial tools on EDA Playground and, surprisingly, even with nominally 2-state Verilator.
@DadoCCAG please consider adding Verilator to GateMate simulation stack, and advise if you need anything else from Tarik in order to address this issue.
any experience or example you can offer about simulating inout ports and multi-drop, tri-state buses with iverilog?
I don't understand that. It works very well, and if you need an example, please take a look at e.g. this testbench with spiflash
, whose ports are inout
:
https://github.com/pu-cc/picorv32/tree/gatemate-demo/picosoc
@pu-cc we found the bug in the sim model, and now iverilog runs fine for the RTL and post-synthesis simulation, but it stays at 0 ticks in post-pnr sim, which is likely due to the way the sim models are written. Still I get no error so I don't know what is the exact reason, but from my side, RTL sim and post-synth sim work both with verilator and iverilog now.
Is this stuck-at-time-0
post-PNR sim using 0-delay
mode? Have you also tried the back-annotated SDF
mode?
Does Icarus have an option for unit delay
sim? If so, it's also worth trying...
@tarik-ibrahimovic this is great news that can help me with debugging. Regarding the stuck post-implementation simulation, that just crossed my mind, have you checked that the OUT_CLK
parameter in CC_PLL
is not zero? See cpelib.v.
Sorry, I forgot to update the tools to the latest version it works fine now. To continue with solving the problem: as with verilator, RTL sim, post-synth sim, and post-pnr sim are matching, and look like the inserted figure below.
Previous problem was: when I change the counter
bit width from 6 bits to anything else, I get a non-functioning design.
Now with the new tools the design doesn't work in any way I try, shown in the image below.
The post-PnR sim and real chip operation are mismatching.
@tarik-ibrahimovic have you checked tool-reported Fmax against the actual clock applied to the board, per this?
If there are timing-closure issue, a back-annotated PNR sim should reveal them...
Thank you @tarik-ibrahimovic! I can also reproduce the behavior here. I'm working on it.
@chili-chips-ba worst corner timing should be at least 60 MHz, according to my logs. So that shouldn't be any problem.