An ncnn example of segment-anything
- Computing the image_embeddings may take a long time, because some MultiHeadAttention ops are not fused.
- Using pnnx to convert and optimize the model may help with this.
- ViT-B SAM model
Models are available on Baidu Pan and Google Drive.
mkdir -p build
cd build
cmake ..
| op type | avg time (ms) | % |
| --- | ---: | ---: |
| MatMul | 2268.23 | 21.14% |
| Reshape | 2199.36 | 20.51% |
| InnerProduct | 1899.40 | 17.71% |
| GELU | 1809.85 | 16.87% |
| BinaryOp | 1351.21 | 12.61% |
| Softmax | 513.15 | 4.78% |
| Permute | 442.24 | 4.12% |
| Crop | 106.28 | 0.99% |
| LayerNorm | 65.11 | 0.61% |
| Padding | 35.70 | 0.33% |
| Convolution | 34.43 | 0.32% |
| MemoryData | 2.76 | 0.03% |
| Split | 0.00 | 0.00% |

total time: 10727.72 ms
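
The per-op numbers above can be re-aggregated to see how much of the runtime might be tied to the unfused attention path. The following is a minimal sketch, not part of this project: the dictionary simply copies the profile values, and the `ATTENTION_LIKE` grouping is an assumption (MatMul, Reshape, Permute, and Softmax are the ops an unfused attention block typically decomposes into, but some of them may also come from other parts of the network, such as window partitioning).

```python
# Per-op average times in ms, copied from the profile table above.
profile_ms = {
    "MatMul": 2268.23,
    "Reshape": 2199.36,
    "InnerProduct": 1899.40,
    "GELU": 1809.85,
    "BinaryOp": 1351.21,
    "Softmax": 513.15,
    "Permute": 442.24,
    "Crop": 106.28,
    "LayerNorm": 65.11,
    "Padding": 35.70,
    "Convolution": 34.43,
    "MemoryData": 2.76,
    "Split": 0.00,
}


def shares(profile):
    """Return (total_ms, {op: percent of total}) for a per-op time table."""
    total = sum(profile.values())
    return total, {op: 100.0 * t / total for op, t in profile.items()}


total_ms, pct = shares(profile_ms)

# Assumption: op types that an unfused attention block usually breaks into.
# This grouping is illustrative and is not derived from the model graph.
ATTENTION_LIKE = ("MatMul", "Reshape", "Permute", "Softmax")
attention_pct = sum(pct[op] for op in ATTENTION_LIKE)

if __name__ == "__main__":
    # Print ops from slowest to fastest, then the grouped share.
    for op, t in sorted(profile_ms.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{op:<14}{t:>10.2f} ms{pct[op]:>8.2f}%")
    print(f"total: {total_ms:.2f} ms")
    print(f"attention-like ops: {attention_pct:.2f}% of total")
```

Under that grouping, roughly half of the measured time falls on attention-like ops, which is consistent with the note that fusing MultiHeadAttention is the most promising optimization.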