This project provides:
- A Python tool to convert your trained Caffe model into JSON
- C++ code that reads in your JSON file and runs it with OpenCL, built with:
  - `CMakeLists.txt` for CPU on any OS
  - `Makefile` for Mac CPU
  - `Makefile` for Nvidia GPU
  - Tcl build script for Xilinx FPGA
  - (Sorry, no Altera support right now, but I assume it could be added easily)
Open an issue if you have any problems, or email me if you want to collaborate.
Here is a brief intro & example.
Some notes explain the structure of this code base, which should be helpful to you 😄 Two "how to add a new feature" guides are also provided.
This project is experimental at the moment and has the following (big) 💔 limitations:
- Currently only a limited set of layer types is supported:
  - Conv
  - Pooling
  - Relu
- `mergeLayer` and `concatLayer`, which accept multiple inputs, are not supported, although they can be added with some work.
- The on-chip buffer size has to be the maximum feature map size among all layers, which is not efficient at all. But it is difficult to do further tricks on memory transfer under the FPGA OpenCL framework, because you can't implement a kernel that does half a convolution 😞 and data transmission is per-kernel and controlled by the API.

I am willing to listen to your ideas.
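
To make the on-chip buffer limitation concrete, here is a minimal Python sketch that walks a layer list and finds the largest feature map, which is the size the buffer would need. The `Layer` class, its field names, and the example network are hypothetical and are not this project's JSON format or API. Output sizes use floor rounding for both Conv and Pooling, which is a simplification (Caffe rounds pooling output sizes up).

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Layer:
    """Hypothetical layer description (not this project's JSON schema)."""
    kind: str               # "conv", "pool" or "relu"
    kernel: int = 1
    stride: int = 1
    pad: int = 0
    out_channels: int = 0   # used by "conv" only


def output_shape(shape, layer):
    """Return the (channels, height, width) produced by one layer."""
    channels, height, width = shape
    if layer.kind == "relu":
        return shape  # element-wise, shape unchanged
    # Floor rounding is an assumption made to keep the sketch short.
    out_h = (height + 2 * layer.pad - layer.kernel) // layer.stride + 1
    out_w = (width + 2 * layer.pad - layer.kernel) // layer.stride + 1
    out_c = layer.out_channels if layer.kind == "conv" else channels
    return (out_c, out_h, out_w)


def max_feature_map_elems(input_shape, layers):
    """Largest element count over the input and every layer output."""
    shape = input_shape
    largest = prod(shape)
    for layer in layers:
        shape = output_shape(shape, layer)
        largest = max(largest, prod(shape))
    return largest


# Hypothetical small network on a 3 x 32 x 32 input.
layers = [
    Layer("conv", kernel=5, pad=2, out_channels=32),
    Layer("relu"),
    Layer("pool", kernel=2, stride=2),
    Layer("conv", kernel=5, pad=2, out_channels=64),
]
print(max_feature_map_elems((3, 32, 32), layers))  # 32768 (first conv output, 32 x 32 x 32)
```

In this example the buffer has to hold 32768 elements even though the last layer's output needs only 16384, which is the inefficiency described above.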
Here are some of my thoughts at the top of the todo list.
- Contact Xilinx Support or work with others to solve the on-chip cache buffer problem.
- Apply some optimizing attributes for benchmarking, which could be good examples for optimization.
- Implement `mergeLayer`/`concatLayer`; then it will basically be able to handle arbitrary network structures.
- This project may be useful for research on OpenCL FPGA design parameter tuning for neural network applications, which is worth digging into 😃
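
As a rough illustration of what a multi-input layer such as `concatLayer` would have to compute, the following Python sketch joins several feature maps along the channel axis. It assumes each feature map is a nested list indexed as `[channel][row][col]`. The function name `concat_channels` is hypothetical and the sketch is not this project's C++/OpenCL implementation.

```python
def concat_channels(*feature_maps):
    """Concatenate [channel][row][col] feature maps along the channel axis.

    All inputs must have the same height and width; only the channel
    counts may differ.
    """
    if not feature_maps:
        raise ValueError("need at least one feature map")
    height = len(feature_maps[0][0])
    width = len(feature_maps[0][0][0])
    merged = []
    for fmap in feature_maps:
        for channel in fmap:
            if len(channel) != height or any(len(row) != width for row in channel):
                raise ValueError("all inputs must share the same height and width")
            merged.append([row[:] for row in channel])  # copy rows, keep inputs untouched
    return merged


a = [[[1, 2], [3, 4]]]                            # 1 channel, 2 x 2
b = [[[5, 6], [7, 8]], [[9, 10], [11, 12]]]       # 2 channels, 2 x 2
out = concat_channels(a, b)
print(len(out))  # 3 channels
```

The output holds the sum of the input channel counts, so a layer like this would also enlarge the maximum feature map size that the on-chip buffer must hold.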