[Feature] Rewind memory access loops
redbudgithubsec opened this issue · 1 comments
redbudgithubsec commented
Issue
Currently, the loops for accessing top-level variables are automatically generated with pipelining at II=1, which is great. However, in my testing this can still lead to 10x the theoretical runtime for 2D arrays.
Solution
Adding rewind to the end of the automatically generated pipeline pragma fully solves this performance issue, while sometimes also reducing hardware usage.
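To make the proposal concrete, here is a rough sketch of what a generated access loop might look like with rewind appended. It assumes the generated code uses the Vitis/Vivado HLS `#pragma HLS pipeline` syntax. The function name `load_buf` and the `ROWS`/`COLS` dimensions are made up for illustration and are not taken from the actual generator output.

```cpp
#include <cstddef>

// Hypothetical dimensions, chosen only for illustration.
constexpr std::size_t ROWS = 4;
constexpr std::size_t COLS = 4;

// Hypothetical stand-in for an automatically generated access loop:
// copies a top-level 2D array into a local buffer.
void load_buf(const int src[ROWS][COLS], int dst[ROWS][COLS]) {
    for (std::size_t i = 0; i < ROWS; ++i) {
        for (std::size_t j = 0; j < COLS; ++j) {
            // Current setup would be: #pragma HLS pipeline II=1
            // Proposed: append "rewind" to the same pragma.
#pragma HLS pipeline II=1 rewind
            dst[i][j] = src[i][j];
        }
    }
}
```

A standard C++ compiler ignores the pragma, so the function still behaves as a plain copy in software; the rewind keyword only matters to the HLS tool.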
Example - My matrix-vector multiply program.
Without rewind (current setup):
78 cycle interval for buf1
redbudgithubsec commented
I'm sorry, this is Zack. I'm just on the wrong account.