fastmachinelearning/hls4ml

Problem tracing binary CNN model after recent tracing optimization

Closed this issue · 8 comments

Prerequisites

Please make sure to check off these prerequisites before submitting a bug report.

  • Test that the bug appears on the current version of the master branch. Make sure to include the commit hash of the commit you checked out.
  • Check that the issue hasn't already been reported, by checking the currently open issues.
  • If there are steps to reproduce the problem, make sure to write them down below.
  • If relevant, please include the hls4ml project files, which were created directly before and/or after the bug.

Quick summary

Error when running tracing on a binary CNN (though I do not think the fact that the network is binary is important).

Details

When I tried to run tracing on a binary CNN (from #740):

https://gist.github.com/jmitrevs/0455fde237ccdc778c619e853e64dde6

it caused the following error:

Traceback (most recent call last):
  File "/Users/jmitrevs/work/hls4ml/test/pytest/profile_binary_cnn.py", line 103, in <module>
    test_model2("Vivado", "io_parallel")
  File "/Users/jmitrevs/work/hls4ml/test/pytest/profile_binary_cnn.py", line 98, in test_model2
    keras_trace = hls4ml.model.profiling.get_ymodel_keras(model2, X)
  File "/Users/jmitrevs/work/hls4ml/hls4ml/model/profiling.py", line 592, in get_ymodel_keras
    and layer.activation.__name__ != "linear"
AttributeError: 'NoneType' object has no attribute '__name__'. Did you mean: '__ne__'?

The code seems to be new, from #863.

Steps to Reproduce

Run the file in the linked "gist" using the main branch of hls4ml.

I am including @AdrianAlan and @jmduarte because they were involved in the latest tracing updates.

Thanks for catching that. I will take a look.

There's a fix for this in #961 (that PR needs a bit more work to clean up).

I am not sure it's a complete fix. When I attempted the fairly straightforward fix of checking that it's not None, I ran into issues when the activation was a string.

We should validate the fix on this, as well.

The fix LGTM.

@jmitrevs What do you mean when you say you ran into issues? When you add, let's say, activation='relu', in L33 in your example, the output looks all right, but the affected layer is out of order. Is that what you refer to?

Maybe it does fix it. I just checked out the PR, and I get no error. I thought that it complained about `__name__` being extracted from a string activation, but I don't see that problem any more. So I think once #961 goes in we can close this.

This has been fixed with the merge of #987 (which replaced #961).