cornell-zhang/hcl-dialect

Adding Line and Window Buffer to BNN

Closed this issue · 4 comments

smc447 commented

I am trying to add line and window buffers to a BNN convolution using reuse_at.
The HeteroCL code is below.

import heterocl as hcl
import hlib.bnn as bnn
import numpy as np
import checks as c
import inputs as i

hcl.init()
def bnn_conv(INPUT, w_conv1):
    conv1 = bnn.conv2d_nchw(INPUT,w_conv1, padding=[1, 1], name="conv1", out_dtype=hcl.Int(6))
    return conv1

INPUT = hcl.placeholder((1,1,16,16),"input", hcl.UInt(1))
w_conv1 = hcl.placeholder((16,1,3,3),"w_conv1", hcl.UInt(1))
s = hcl.create_schedule([INPUT, w_conv1], bnn_conv)
#s[conv1].unroll(bnn_conv.conv1.axis[1])
#s[bnn_conv.conv1].pipeline(bnn_conv.conv1.axis[1])
#s.buffer_at(bnn_conv.conv1, s[bnn_conv.conv1], bnn_conv.conv1.axis[0])

LB = s.reuse_at(INPUT, s[bnn_conv.conv1], bnn_conv.conv1.axis[2])
WB = s.reuse_at(LB, s[bnn_conv.conv1], bnn_conv.conv1.axis[3])

f_sim = hcl.build(s, target="vhls")
print(f_sim)

The error message is:

Using mlir as IR
Done HCL-MLIR initialization
python3: /home/smc447/llvm-project/llvm/include/llvm/ADT/SmallVector.h:273: T& llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::operator[](llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::size_type) [with T = mlir::AffineExpr; <template-parameter-1-2> = void; llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::reference = mlir::AffineExpr&; llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::size_type = long unsigned int]: Assertion `idx < size()' failed.
 #0 0x00007fe01c2178df PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x00007fe01c2155cc SignalHandler(int) Signals.cpp:0:0
 #2 0x00007fe178557630 __restore_rt sigaction.c:0:0
 #3 0x00007fe177aa7387 raise (/lib64/libc.so.6+0x36387)
 #4 0x00007fe177aa8a78 abort (/lib64/libc.so.6+0x37a78)
 #5 0x00007fe177aa01a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #6 0x00007fe177aa0252 (/lib64/libc.so.6+0x2f252)
 #7 0x00007fe01a47543f llvm::SmallVectorTemplateCommon<mlir::AffineExpr, void>::operator[](unsigned long) /home/smc447/llvm-project/llvm/include/llvm/ADT/SmallVector.h:274:0
 #8 0x00007fe01a45c42d mlir::hcl::runReuseAt(mlir::func::FuncOp&, mlir::hcl::ReuseAtOp&) /home/smc447/hcl-dialect-prototype/lib/Transforms/LoopTransformations.cpp:1332:0
 #9 0x00007fe01a46aa73 mlir::hcl::applyLoopTransformationOnSingleFunction(mlir::ModuleOp&, mlir::func::FuncOp&, std::map<std::string, mlir::hcl::CustomizationOp, std::less<std::string>, std::allocator<std::pair<std::string const, mlir::hcl::CustomizationOp>>>&) /home/smc447/hcl-dialect-prototype/lib/Transforms/LoopTransformations.cpp:3472:0
#10 0x00007fe01a46b0b7 mlir::hcl::applyLoopTransformation(mlir::ModuleOp&) /home/smc447/hcl-dialect-prototype/lib/Transforms/LoopTransformations.cpp:3535:0
#11 0x00007fe01a417279 loopTransformation(MlirModule&) //home/smc447/hcl-dialect-prototype/lib/Bindings/Python/HCLModule.cpp:75:0
#12 0x00007fe01a43c47f bool pybind11::detail::argument_loader<MlirModule&>::call_impl<bool, bool (*&)(MlirModule&), 0ul, pybind11::detail::void_type>(bool (*&)(MlirModule&), std::integer_sequence<unsigned long, 0ul>, pybind11::detail::void_type&&) && /home/smc447/anaconda3/envs/hlc-env/lib/python3.9/site-packages/pybind11/include/pybind11/cast.h:1442:0
#13 0x00007fe01a4392d7 _ZNO8pybind116detail15argument_loaderIJR10MlirModuleEE4callIbNS0_9void_typeERPFbS3_EEENSt9enable_ifIXntsrSt7is_voidIT_E5valueESC_E4typeEOT1_ /home/smc447/anaconda3/envs/hlc-env/lib/python3.9/site-packages/pybind11/include/pybind11/cast.h:1410:0
#14 0x00007fe01a43401c void pybind11::cpp_function::initialize<bool (*&)(MlirModule&), bool, MlirModule&, pybind11::name, pybind11::scope, pybind11::sibling>(bool (*&)(MlirModule&), bool (*)(MlirModule&), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::'lambda1'(pybind11::detail::function_call&)::operator()(pybind11::detail::function_call&) const /home/smc447/anaconda3/envs/hlc-env/lib/python3.9/site-packages/pybind11/include/pybind11/pybind11.h:249:0
#15 0x00007fe01a434093 void pybind11::cpp_function::initialize<bool (*&)(MlirModule&), bool, MlirModule&, pybind11::name, pybind11::scope, pybind11::sibling>(bool (*&)(MlirModule&), bool (*)(MlirModule&), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::'lambda1'(pybind11::detail::function_call&)::_FUN(pybind11::detail::function_call&) /home/smc447/anaconda3/envs/hlc-env/lib/python3.9/site-packages/pybind11/include/pybind11/pybind11.h:224:0
#16 0x00007fe01a424261 pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /home/smc447/anaconda3/envs/hlc-env/lib/python3.9/site-packages/pybind11/include/pybind11/pybind11.h:929:0
#17 0x0000000000508127 _PyErr_Occurred /usr/local/src/conda/python-3.9.13/Include/internal/pycore_pyerrors.h:14:18
#18 0x0000000000508127 _Py_CheckFunctionResult /usr/local/src/conda/python-3.9.13/Objects/call.c:39:14
#19 0x0000000000508127 cfunction_call /usr/local/src/conda/python-3.9.13/Objects/methodobject.c:554:12
#20 0x00000000004f0edc _PyObject_MakeTpCall /usr/local/src/conda/python-3.9.13/Objects/call.c:191:18
#21 0x00000000004ed255 _PyObject_VectorcallTstate /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:116:16
#22 0x00000000004ed255 _PyObject_VectorcallTstate /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:103:1
#23 0x00000000004ed255 PyObject_Vectorcall /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:127:12
#24 0x00000000004ed255 call_function /usr/local/src/conda/python-3.9.13/Python/ceval.c:5077:13
#25 0x00000000004ed255 _PyEval_EvalFrameDefault /usr/local/src/conda/python-3.9.13/Python/ceval.c:3489:23
#26 0x00000000004e70ca _PyEval_EvalCode /usr/local/src/conda/python-3.9.13/Python/ceval.c:4338:9
#27 0x00000000004f8515 _PyFunction_Vectorcall /usr/local/src/conda/python-3.9.13/Objects/call.c:404:1
#28 0x00000000004e83a1 _Py_CheckFunctionResult /usr/local/src/conda/python-3.9.13/Objects/call.c:38:8
#29 0x00000000004e83a1 _PyObject_VectorcallTstate /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:119:12
#30 0x00000000004e83a1 PyObject_Vectorcall /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:127:12
#31 0x00000000004e83a1 call_function /usr/local/src/conda/python-3.9.13/Python/ceval.c:5077:13
#32 0x00000000004e83a1 _PyEval_EvalFrameDefault /usr/local/src/conda/python-3.9.13/Python/ceval.c:3520:19
#33 0x00000000004e70ca _PyEval_EvalCode /usr/local/src/conda/python-3.9.13/Python/ceval.c:4338:9
#34 0x00000000004f8515 _PyFunction_Vectorcall /usr/local/src/conda/python-3.9.13/Objects/call.c:404:1
#35 0x00000000004e920a _Py_CheckFunctionResult /usr/local/src/conda/python-3.9.13/Objects/call.c:38:8
#36 0x00000000004e920a _PyObject_VectorcallTstate /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:119:12
#37 0x00000000004e920a PyObject_Vectorcall /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:127:12
#38 0x00000000004e920a call_function /usr/local/src/conda/python-3.9.13/Python/ceval.c:5077:13
#39 0x00000000004e920a _PyEval_EvalFrameDefault /usr/local/src/conda/python-3.9.13/Python/ceval.c:3537:19
#40 0x00000000004e70ca _PyEval_EvalCode /usr/local/src/conda/python-3.9.13/Python/ceval.c:4338:9
#41 0x00000000004e6d57 _PyEval_EvalCodeWithName /usr/local/src/conda/python-3.9.13/Python/ceval.c:4361:12
#42 0x00000000004e6d09 PyEval_EvalCodeEx /usr/local/src/conda/python-3.9.13/Python/ceval.c:4384:1
#43 0x0000000000594e7b PyEval_EvalCode /usr/local/src/conda/python-3.9.13/Python/ceval.c:834:1
#44 0x00000000005c2307 run_eval_code_obj /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1222:8
#45 0x00000000005be270 _Py_DECREF /usr/local/src/conda/python-3.9.13/Include/object.h:422:8
#46 0x00000000005be270 run_mod /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1243:5
#47 0x00000000004563ed pyrun_file.cold /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1140:15
#48 0x00000000005b8062 pyrun_simple_file /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:450:13
#49 0x00000000005b8062 PyRun_SimpleFileExFlags /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:483:15
#50 0x00000000005b55ce _Py_DECREF /usr/local/src/conda/python-3.9.13/Include/object.h:422:8
#51 0x00000000005b55ce _Py_XDECREF /usr/local/src/conda/python-3.9.13/Include/object.h:497:9
#52 0x00000000005b55ce pymain_run_file /usr/local/src/conda/python-3.9.13/Modules/main.c:380:5
#53 0x00000000005b55ce pymain_run_python /usr/local/src/conda/python-3.9.13/Modules/main.c:604:21
#54 0x00000000005b55ce Py_RunMain /usr/local/src/conda/python-3.9.13/Modules/main.c:683:5
#55 0x0000000000588ff9 Py_BytesMain /usr/local/src/conda/python-3.9.13/Modules/main.c:1130:1
#56 0x00007fe177a93555 __libc_start_main (/lib64/libc.so.6+0x22555)
#57 0x0000000000588eae _start (/home/smc447/anaconda3/envs/hlc-env/bin/python3.9+0x588eae)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Aborted

It is a workaround. I don't think the issue is easy to fix, since the reuse_at logic is already very complicated. I would suggest using a batch size larger than one for testing.

Side note: the issue mentioned in the code comment is that loops with a trip count of 1 get removed before hcl.reuse_at is applied, which causes errors in the LoopTransformation pass.
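
As a rough illustration of that side note, here is a toy model in plain Python. It is not the dialect's actual pass: the loop names, their order, and the `drop_unit_loops` helper are all assumptions made for the example. It only shows how removing single-iteration loops changes which loop sits at a given position in the nest.

```python
# Toy model only: loop names and ordering are assumed, not taken from hcl-dialect.
def conv_loops(batch, out_ch, in_ch, height, width, k):
    """Return a (name, trip_count) list for a hypothetical NCHW conv loop nest."""
    return [("n", batch), ("c", out_ch), ("h", height), ("w", width),
            ("rc", in_ch), ("rh", k), ("rw", k)]

def drop_unit_loops(loops):
    """Hypothetical cleanup step: discard loops that run exactly once."""
    return [(name, trips) for name, trips in loops if trips > 1]

# Shapes from the issue: input (1,1,16,16), weights (16,1,3,3), padding 1.
nest_b1 = drop_unit_loops(conv_loops(1, 16, 1, 16, 16, 3))
# Same layer with a batch size of 2.
nest_b2 = drop_unit_loops(conv_loops(2, 16, 1, 16, 16, 3))

print([name for name, _ in nest_b1])
print([name for name, _ in nest_b2])
print("position 2 with batch 1:", nest_b1[2][0])
print("position 2 with batch 2:", nest_b2[2][0])
```

In this model the batch loop survives only when the batch size is larger than one, so a transformation that looks loops up by position would find a different loop at the same index in the two cases.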

It turns out this isn't a bug, but the error message is definitely not informative. The issue is that bnn.conv2d_nchw inserts a padding operation before the convolution, so reuse_at should be applied to the output of the padding operation rather than to INPUT. Changing the first reuse_at to the following should solve the problem:

LB = s.reuse_at(bnn_conv.conv1_pad, s[bnn_conv.conv1], bnn_conv.conv1.axis[2])
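
To make the reasoning concrete, below is a minimal pure-Python sketch of what a line buffer and a window buffer do for a padded 3x3 convolution. It does not use HeteroCL, and it is not the code that reuse_at generates. The helper names are hypothetical, and it uses an ordinary multiply-accumulate instead of the XNOR-popcount a BNN layer would use. The point it illustrates is that the convolution windows are read from the padded tensor, so that is the tensor the reuse buffers have to sit on.

```python
def pad2d(img, p):
    """Zero-pad a 2D list of lists by p pixels on every side."""
    h, w = len(img), len(img[0])
    out = [[0] * (w + 2 * p) for _ in range(h + 2 * p)]
    for y in range(h):
        for x in range(w):
            out[y + p][x + p] = img[y][x]
    return out

def conv_direct(padded, kernel):
    """Reference: read every k x k window straight from the padded image."""
    k = len(kernel)
    oh, ow = len(padded) - k + 1, len(padded[0]) - k + 1
    return [[sum(padded[y + i][x + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for x in range(ow)] for y in range(oh)]

def conv_buffered(padded, kernel):
    """Same result, but each padded pixel is read from 'memory' only once."""
    k = len(kernel)
    ph, pw = len(padded), len(padded[0])
    line_buf = [[0] * pw for _ in range(k)]  # holds the last k rows
    out, reads = [], 0
    for y in range(ph):
        window = [[0] * k for _ in range(k)]  # holds the last k columns
        row_out = []
        for x in range(pw):
            # Line buffer: shift column x up by one row, then store the new pixel.
            for i in range(k - 1):
                line_buf[i][x] = line_buf[i + 1][x]
            line_buf[k - 1][x] = padded[y][x]
            reads += 1
            # Window buffer: shift left, then load the new column from the line buffer.
            for i in range(k):
                for j in range(k - 1):
                    window[i][j] = window[i][j + 1]
                window[i][k - 1] = line_buf[i][x]
            # A full window is available once k rows and k columns have streamed in.
            if y >= k - 1 and x >= k - 1:
                row_out.append(sum(window[i][j] * kernel[i][j]
                                   for i in range(k) for j in range(k)))
        if y >= k - 1:
            out.append(row_out)
    return out, reads

# 16x16 input with padding 1 and a 3x3 kernel, matching the shapes in the issue.
img = [[(y * 7 + x * 3) % 5 for x in range(16)] for y in range(16)]
kernel = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
padded = pad2d(img, 1)
buffered, reads = conv_buffered(padded, kernel)
assert buffered == conv_direct(padded, kernel)
print("output size:", len(buffered), "x", len(buffered[0]))
print("padded pixels read:", reads)
```

Both buffers are indexed in the coordinates of the 18x18 padded image, not the 16x16 original, which is consistent with attaching the first reuse_at to bnn_conv.conv1_pad.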