hls_intersect: Hardware stuck and returns error on mmio_read()
Closed this issue · 7 comments
R hw_snap_mmio_read32(0xc7e510, f000, 1) -1
R hw_snap_mmio_read32(0xc7e510, f000, 1) -1
R hw_snap_mmio_read32(0xc7e510, f000, 1) -1
R hw_snap_mmio_read32(0xc7e510, f000, 1) -1
R hw_snap_mmio_read32(0xc7e510, f000, 1) -1
R hw_snap_mmio_read32(0xc7e510, f000, 1) -1
I saw this in a failing intersect test case. We should stop and return an error, but we do not seem to do that; instead we keep polling for results on a broken card.
Seems I was creating a corner case again?
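To illustrate the behaviour I would expect, here is a minimal sketch in Python rather than the actual low-level C code. The names `poll_until_done`, `MmioError` and the `read32` callable are made up for this example; `read32(addr)` is assumed to return a `(rc, value)` pair where a non-zero `rc` means the register read failed.

```python
class MmioError(Exception):
    """Raised when a register read reports failure."""


def poll_until_done(read32, addr, done_mask, max_polls=1000):
    """Poll a status register, but give up on the first failed read.

    read32 is a hypothetical callable: read32(addr) -> (rc, value),
    with rc != 0 signalling a failed MMIO access.
    """
    for _ in range(max_polls):
        rc, value = read32(addr)
        if rc != 0:
            # A failed read means the card is not usable; do not keep polling.
            raise MmioError(f"read of 0x{addr:x} failed with rc={rc}")
        if value & done_mask:
            return value
    raise TimeoutError(f"0x{addr:x} not done after {max_polls} polls")
```

With a read function that always returns -1, this raises after exactly one read instead of repeating the same failing access.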
Hi Frank, would you check the log again? sim.log
I see this error with the following message:
top.ddr3_dimm.mem.rank[0].sodimm[8].ddr3.memory_write: at time 44414534.0 ps ERROR: Memory overflow. Write to Address 0080008 with Data xxxxxxxx0000000a will be lost.
You must increase the MEM_BITS parameter or define MAX_MEM.
/afs/bb/u/luyong/capi/mysnap/puresnap/hardware/ip/ddr3sdram_ex/imports/ddr3.v:756 if (STOP_ON_ERROR) $stop(0);
ncsim> exit
I have to delete the // in the first line of ip/ddr3sdram_ex/imports/ddr3.v
or modify sim/core/ddr3_dimm.sv
to add MAX_MEM and make model again (for the simulation to pass).
But this generates many files and uses a lot of disk space.
We know that 3 processes are invoked for a simulation:
1. Simulator
2. PSLSE
3. application
When the first one, i.e. NCSIM, hits an error condition and exits (as in this example), the script hardware/sim/run_sim
should have a way to terminate the application.
The run_sim script knows the SIM_PID; is there a way to watch it? If the simulator terminates in the middle of a run, run_sim should kill the application as well.
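The idea could be sketched roughly like this. It is written in Python only for illustration and is not how run_sim is implemented; `run_with_watchdog` and its parameters are hypothetical names, and PSLSE is left out to keep it short.

```python
import subprocess
import time


def run_with_watchdog(sim_cmd, app_cmd, poll_interval=0.1):
    """Start simulator and application; stop the application if the simulator dies.

    Returns (app_returncode, aborted), where aborted is True when the
    application was terminated because the simulator exited first.
    """
    sim = subprocess.Popen(sim_cmd)
    app = subprocess.Popen(app_cmd)
    try:
        while app.poll() is None:
            if sim.poll() is not None:
                # Simulator is gone: nothing will answer the application anymore.
                app.terminate()
                app.wait()
                return app.returncode, True
            time.sleep(poll_interval)
        return app.returncode, False
    finally:
        # Do not leave either process behind, whatever happened above.
        for proc in (sim, app):
            if proc.poll() is None:
                proc.terminate()
                proc.wait()
```

Note that this only terminates the process it started directly; anything the application itself spawned would need extra handling.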
To reproduce this, you can run the hls_intersect simulation on the current master branch;
an even easier way is to kill the NCSIM process in the middle of a run. Then you will see tons of messages like the ones Frank posted.
Hi @joergkayser , would you also have a look at this? Thanks a lot.
I tried to reproduce this case with Vivado 2016.4, card=ku3, action=hls_intersect, SDRAM_USED=TRUE, Simulator=irun. The error was:
The application hls_intersect caused a memory overflow in the simulator irun, and irun stopped running.
run_sim detected that the simulator was gone, so it killed the other processes on its list, which were PSLSE and testlist.sh.
As hls_intersect was a child of testlist.sh, it continued to run and was only stopped later, after the timeout expired.
I changed run_sim so that it kills not only the recorded PIDs, but also all child processes spawned by those PIDs. Please test branch=sim_kill_childPID
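For reference, one common way to make sure children die together with their parent is to use process groups. The following is a POSIX-only sketch in Python of that technique, not the code on the branch; `start_in_own_group` and `kill_group` are hypothetical helper names.

```python
import os
import signal
import subprocess


def start_in_own_group(cmd, **popen_kwargs):
    """Start cmd as leader of a new session, so its PID is also its process group ID."""
    return subprocess.Popen(cmd, start_new_session=True, **popen_kwargs)


def kill_group(proc, sig=signal.SIGTERM):
    """Signal the whole process group of proc, i.e. proc and its descendants.

    Descendants that move themselves into another group are not reached.
    """
    try:
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        pass  # the group has already disappeared
    proc.wait()
```

Started this way, a script like testlist.sh and the application it launches share one process group, so a single `kill_group()` call reaches both.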
Mhm. You hijacked my bug to discuss intersect problems. So leave this open, even though I fixed the problem in the low-level code that caused it not to abort on a bad return code from the register read function.
Hi Frank, I think this has been fixed, right?