cornell-zhang/heterocl

Stream_inference failed when stacking two convolutional layers

When I stack two conv layers as shown below,

import heterocl as hcl
import hlib
import numpy as np

def double_conv():
    A = hcl.placeholder((1,1,16,16), "A")
    w1 = hcl.placeholder((1,3,3,3), "w1")
    w2 = hcl.placeholder((3,6,3,3), "w2")

    def kernel(A, w1, w2):
        conv1 = hlib.op.nn.conv2d_nchw(A, w1, padding=[1,1], name="conv1")
        conv2 = hlib.op.nn.conv2d_nchw(conv1, w2, padding=[1,1], name="conv2")
        return conv2
    
    target = hcl.platform.zc706
    s = hcl.create_schedule([A, w1, w2], kernel)
    s.to([A, w1, w2], target.xcel)
    s.to(kernel.conv2, target.host)
    target.config(compile="vivado_hls", mode="csim")
    f = hcl.build(s, target)

    np_A = np.random.randint(0, 256, size=(1,1,16,16))
    np_w1 = np.random.randint(0, 10, size=(1,3,3,3))
    np_w2 = np.random.randint(0, 10, size=(3,6,3,3))
    np_B = np.zeros((1,6,16,16))

    hcl_A = hcl.asarray(np_A)
    hcl_w1 = hcl.asarray(np_w1)
    hcl_w2 = hcl.asarray(np_w2)
    hcl_B = hcl.asarray(np_B)
    f(hcl_A, hcl_w1, hcl_w2, hcl_B)

I got the following error.

Traceback (most recent call last):
  File "simple.py", line 110, in <module>
    double_conv()
  File "simple.py", line 94, in double_conv
    f = hcl.build(s, target)
  File "/home/chz/heterocl/python/heterocl/api.py", line 318, in build
    return _build(schedule.sch, new_inputs, target=target, name=name, stmt=stmt)
  File "/home/chz/heterocl/python/heterocl/tvm/build_module.py", line 543, in build
    return build_fpga_kernel(sch, args, target, name=name)
  File "/home/chz/heterocl/python/heterocl/tvm/build_module.py", line 422, in build_fpga_kernel
    flist = lower(sch, args, kernel_only=True, name=name)
  File "/home/chz/heterocl/python/heterocl/tvm/build_module.py", line 376, in lower
    stmt = ir_pass.InferStream(stmt, arg_list)
  File "/home/chz/heterocl/python/heterocl/tvm/_ffi/function.py", line 280, in my_api_func
    return flocal(*args)
  File "/home/chz/heterocl/python/heterocl/tvm/_ffi/_ctypes/function.py", line 183, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/home/chz/heterocl/python/heterocl/tvm/_ffi/base.py", line 66, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
heterocl.tvm._ffi.base.TVMError: [17:27:37] src/pass/stream_inference.cc:1514: Check failed: bind_buffer_map_[name].get() == op->buffer_var.get() 

Stack trace returned 10 entries:
[bt] (0) /home/chz/heterocl/tvm/lib/libhcl.so(dmlc::StackTrace[abi:cxx11]()+0x53) [0x7ff9d7c5b353]
[bt] (1) /home/chz/heterocl/tvm/lib/libhcl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7ff9d7c5bb99]
[bt] (2) /home/chz/heterocl/tvm/lib/libhcl.so(TVM::ir::BufferReplacer::Mutate_(Halide::Internal::Allocate const*, Halide::Internal::Stmt const&)+0x125) [0x7ff9d7cd2905]
[bt] (3) /home/chz/heterocl/tvm/lib/libhcl.so(+0x2f1034) [0x7ff9d7d20034]
[bt] (4) /home/chz/heterocl/tvm/lib/libhcl.so(std::_Function_handler<Halide::Internal::Stmt (TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*), TVM::IRFunctor<Halide::Internal::Stmt (TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*)>::set_dispatch<Halide::Internal::Allocate>(std::function<Halide::Internal::Stmt (Halide::Internal::Allocate const*, Halide::Internal::Stmt const&, TVM::ir::IRMutator*)>)::{lambda(TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*)#1}>::_M_invoke(std::_Any_data const&, TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*&&)+0x43) [0x7ff9d7d2c613]
[bt] (5) /home/chz/heterocl/tvm/lib/libhcl.so(TVM::IRFunctor<Halide::Internal::Stmt (TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*)>::operator()(TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*) const+0x13a) [0x7ff9d7c5c14a]
[bt] (6) /home/chz/heterocl/tvm/lib/libhcl.so(TVM::ir::IRMutator::Mutate(Halide::Internal::Stmt)+0x5d) [0x7ff9d7c5c26d]
[bt] (7) /home/chz/heterocl/tvm/lib/libhcl.so(TVM::ir::IRMutator::Mutate_(Halide::Internal::AttrStmt const*, Halide::Internal::Stmt const&)+0x95) [0x7ff9d7d25505]
[bt] (8) /home/chz/heterocl/tvm/lib/libhcl.so(+0x2f0e74) [0x7ff9d7d1fe74]
[bt] (9) /home/chz/heterocl/tvm/lib/libhcl.so(std::_Function_handler<Halide::Internal::Stmt (TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*), TVM::IRFunctor<Halide::Internal::Stmt (TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*)>::set_dispatch<Halide::Internal::AttrStmt>(std::function<Halide::Internal::Stmt (Halide::Internal::AttrStmt const*, Halide::Internal::Stmt const&, TVM::ir::IRMutator*)>)::{lambda(TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*)#1}>::_M_invoke(std::_Any_data const&, TVM::NodeRef const&, Halide::Internal::Stmt const&, TVM::ir::IRMutator*&&)+0x43) [0x7ff9d7d2c453]

This again looks like a naming issue similar to #197, but the error remains even after I manually rename the compute functions that share the same name.

A simple workaround here is to rename all the compute functions by adding a prefix to each of them. For example, rename the padding function in the conv layer:

temp = pad(Input, pad_before, pad_after, name=name+"_pad") # name is passed in by users

The hcl.sum in conv should be renamed as well.
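
To illustrate why prefixing helps, here is a toy model in plain Python. It is not HeteroCL code: the `Buffer` class, `bind_buffers`, and the internal stage names `pad_temp` and `sum` are all hypothetical, and it only assumes that the failing check is triggered when two distinct buffers end up bound under the same name.

```python
# Toy model (NOT HeteroCL code): a pass that tracks buffers by name only.
class Buffer:
    def __init__(self, name):
        self.name = name


def bind_buffers(buffers):
    """Map each name to its buffer; fail if two distinct buffers share a name."""
    bound = {}
    for buf in buffers:
        existing = bound.setdefault(buf.name, buf)
        if existing is not buf:
            raise ValueError("duplicate buffer name: " + buf.name)
    return bound


def conv_stage_names(layer_name, prefixed):
    """Names of the stages one conv layer creates (hypothetical internal names)."""
    internal = ["pad_temp", "sum"]
    if prefixed:
        # Workaround: derive every internal name from the user-supplied name.
        internal = [layer_name + "_" + n for n in internal]
    return internal + [layer_name]


def build(prefixed):
    """Collect the buffers of two stacked conv layers and bind them by name."""
    buffers = [Buffer(n)
               for layer in ("conv1", "conv2")
               for n in conv_stage_names(layer, prefixed)]
    return bind_buffers(buffers)
```

In this sketch `build(prefixed=False)` raises on the second layer's `pad_temp`, while `build(prefixed=True)` binds all six buffers under distinct names.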

Issue #197 has not been tackled. We need to find a good way to fix this problem.