tuan3w/visual_search

TypeError: 'Convolution2D' object has no attribute '__getitem__'

zjsjack opened this issue · 22 comments

I converted the VGG16 model to pkl as instructed in https://github.com/mitmul/chainer-faster-rcnn/issues/15:

import pickle
from chainer.links.caffe import CaffeFunction

vgg16 = CaffeFunction('VGG_ILSVRC_16_layers.caffemodel')
pickle.dump(vgg16, open('VGG16.model', 'wb'))

I got a TypeError when running python index_es.py:

2017-06-23 09:23:26.732486: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
2017-06-23 09:23:26.732530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-06-23 09:23:26.732539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-06-23 09:23:26.732555: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
Namespace(es_host='localhost', es_index='im_data', es_port=9200, es_type='obj', input='tf-faster-rcnn/data/demo/', model_path='./coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000.ckpt.data-00000-of-00001', weight='./VGG16.model')
Loading caffe weights...
Done!
Traceback (most recent call last):
File "index_es.py", line 104, in
sess=sess)
File "/home/ubuntu/workspace/visual_search/visual_search/extractor.py", line 53, in init
tag='default', anchor_scales=anchors)
File "/home/ubuntu/workspace/visual_search/visual_search/lib/nets/vgg16.py", line 496, in create_architecture
rois, cls_prob, bbox_pred = self._vgg16_from_imagenet(sess, training)
File "/home/ubuntu/workspace/visual_search/visual_search/lib/nets/vgg16.py", line 303, in _vgg16_from_imagenet
net = self._conv_layer(sess, self._image, "conv1_1", False)
File "/home/ubuntu/workspace/visual_search/visual_search/lib/nets/vgg16.py", line 111, in _conv_layer
filt=self._get_conv_filter(sess, name, trainable=trainable)
File "/home/ubuntu/workspace/visual_search/visual_search/lib/nets/vgg16.py", line 64, in _get_conv_filter
w=self._caffe2tf_filter(name)
File "/home/ubuntu/workspace/visual_search/visual_search/lib/nets/vgg16.py", line 59, in _caffe2tf_filter
f=self._caffe_weights(name)
File "/home/ubuntu/workspace/visual_search/visual_search/lib/nets/vgg16.py", line 52, in _caffe_weights
return layer['weights']
TypeError: 'Convolution2D' object has no attribute '__getitem__'

Hi @zjsjack,
You used the wrong pretrained model :D. (The loader indexes each layer as layer['weights'], so it expects dict-like entries; the Chainer CaffeFunction layers don't support that, hence the __getitem__ error.) I use a pretrained model from this repo: https://github.com/endernewton/tf-faster-rcnn. You can follow the guide in that repo to see how to get it.
Thanks

I cannot figure out how to get the right pretrained model and weights. Could you explain a bit more, or show me where to get them?
In one of my tests, I used http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz for the weights and coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000.ckpt.data-00000-of-00001 for the model.
With this I got the following error.

Loading caffe weights...
Traceback (most recent call last):
File "index_es.py", line 104, in
sess=sess)
File "/home/ubuntu/workspace/visual_search/visual_search/extractor.py", line 53, in init
tag='default', anchor_scales=anchors)
File "/home/ubuntu/workspace/visual_search/visual_search/lib/nets/vgg16.py", line 493, in create_architecture
self._caffe_layers = pickle.load(f)
EOFError

So I tried to convert the weights from VGG_ILSVRC_16_layers.caffemodel myself and then got the TypeError above.

Hi @zjsjack ,
My PC is broken, so I can't check the weights path you downloaded from. I think the problem is that the pretrained model was trained with a different version of TensorFlow; you should use tensorflow==0.12.1.
By the way, you can download the weights file from https://drive.google.com/drive/folders/0B1_fAEgxdnvJSmF3YUlZcHFqWTQ (file imagenet_weights.tgz).
The folder that contains the model path should contain all the metadata and index files. You can download coco_900k-1190k.tgz (https://drive.google.com/drive/folders/0B1_fAEgxdnvJNEdDUTRSOU11cW8) and change the model path accordingly.
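Putting that together, a minimal setup sketch (the extraction paths and the index_es.py flags are assumed from elsewhere in this thread, so adjust as needed):

$ pip install tensorflow==0.12.1
$ tar xzf imagenet_weights.tgz   # assumed to extract to imagenet_weights/vgg16.weights
$ tar xzf coco_900k-1190k.tgz    # assumed to extract to coco_2014_train+coco_2014_valminusminival/
$ export WEIGHT_PATH=./imagenet_weights/vgg16.weights
$ export MODEL_PATH=./coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_1190000.ckpt
$ export INPUT=tf-faster-rcnn/data/demo/
$ python index_es.py --weight $WEIGHT_PATH --model_path $MODEL_PATH --input $INPUT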

Now my configuration is:
tensorflow==0.12.1
WEIGHT_PATH=./imagenet_weights/vgg16.weights
MODEL_PATH=./coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_1190000.ckpt

Namespace(es_host='localhost', es_index='im_data', es_port=9200, es_type='obj', input='tf-faster-rcnn/data/demo/', model_path='./coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_1190000.ckpt', weight='./imagenet_weights/vgg16.weights')
Loading caffe weights...
Done!
Loading model check point from ./coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_1190000.ckpt
W tensorflow/core/framework/op_kernel.cc:975] Not found: Key vgg16_default/conv4_3/bias not found in checkpoint
W tensorflow/core/framework/op_kernel.cc:975] Not found: Key vgg16_default/conv2_2/bias not found in checkpoint
...
NotFoundError (see above for traceback): Key vgg16_default/conv4_3/bias not found in checkpoint

Any suggestions?

As I checked, some variable scopes were changed in the new pretrained model, so it throws the errors you see. You should try the old pretrained model (coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000); otherwise you have to adapt the code to the new model.

I also tried the 490000 checkpoint and hit the same kind of error:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
Namespace(es_host='localhost', es_index='im_data', es_port=9200, es_type='obj', input='tf-faster-rcnn/data/demo/', model_path='./coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000.ckpt', weight='./imagenet_weights/vgg16.weights')
Loading caffe weights...
Done!
Loading model check point from ./coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000.ckpt
W tensorflow/core/framework/op_kernel.cc:975] Not found: Key vgg16_default/cls_score/bias not found in checkpoint

What does the ./coco_2014_train+coco_2014_valminusminival/ folder look like? Does it contain all the metadata files?

ubuntu@ip-172-31-18-140:~/workspace/visual_search/visual_search$ tree coco_2014_train+coco_2014_valminusminival/
coco_2014_train+coco_2014_valminusminival/
├── vgg16_faster_rcnn_iter_490000.ckpt.data-00000-of-00001
├── vgg16_faster_rcnn_iter_490000.ckpt.index
├── vgg16_faster_rcnn_iter_490000.ckpt.meta
└── vgg16_faster_rcnn_iter_490000.pkl

This error means that the bias variable named vgg16_default/cls_score/bias is not found in the checkpoint. The model you downloaded is different from the model I used when I worked on this, and I can't access my PC right now, so I can't share the model file I used.
Can you give me the output of the following command? Maybe I can help fix this error.

cat coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000.ckpt.index | grep -a conv4

The output is not readable; I'm just copy-pasting it below:

ubuntu@ip-172-31-18-140:~/workspace/visual_search/visual_search$ cat coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000.ckpt.index | grep -a conv4
(???5>?? %/Mome? ???
(???5????
4/conv4_1/bias? ???
(?5oI?? /Moment? ???
(?5[5
?%vgg_16/conv4/conv4_1/wei? ???
(???5?Ӆ? %/Mome? ???(???5??N2/bias? ???(?5Y?9 /Moment? ???(?5eS?%wei? ???(???5 խw%/Mome? ?˺(???5??3/bias? ???(?5??? /Moment? ???(?5.ʎ%wei? ???(???5?8?? %/Mome? ??(???5??0?
5/conv5_1/bias? ???"(?5?O? /Moment? ???"(?5?;$#%wei? ???"(???5j mY %/Mome? ???'(???5?9a2/bias? ???+(?5???? /Moment? ???+(?56?|?%vgg_16/conv5/conv5_2/wei? ???+(???52?&d %/Mome? ???0(???5???3/bias? ???4(?5?^? /Moment? ???4(?5̓?z%wei? ???4(???5?tb? %/Mome? ?˻9(???5Pv??

Sorry, I pointed to the wrong file. Try this instead:

cat coco_2014_train+coco_2014_valminusminival/vgg16_faster_rcnn_iter_490000.ckpt.meta | grep -a conv4

value0B$vgg_16/conv4/conv4_3/biases/Momentum*
save_1/Assign_38Assign$vgg_16/conv4/conv4_3/biases/Momentumsave_1/RestoreV2_38*
" loc:@vgg_16/conv4/conv4_3/biases*
value(Bvgg_16/conv4/conv4_3/weights*

save_1/Assign_39Assignvgg_16/conv4/conv4_3/weightssave_1/RestoreV2_39*
#!loc:@vgg_16/conv4/conv4_3/weights*
value1B%vgg_16/conv4/conv4_3/weights/Momentum*
save_1/Assign_40Assign%vgg_16/conv4/conv4_3/weights/Momentumsave_1/RestoreV2_40*
#!loc:@vgg_16/conv4/conv4_3/weights*
8vgg_16/conv4/conv4_1/kernel/Regularizer/l2_regularizer:0
8vgg_16/conv4/conv4_2/kernel/Regularizer/l2_regularizer:0
8vgg_16/conv4/conv4_3/kernel/Regularizer/l2_regularizer:0
vgg_16/conv4/conv4_1/weights:0#vgg_16/conv4/conv4_1/weights/Assign#vgg_16/conv4/conv4_1/weights/read:0
vgg_16/conv4/conv4_1/biases:0"vgg_16/conv4/conv4_1/biases/Assign"vgg_16/conv4/conv4_1/biases/read:0
vgg_16/conv4/conv4_2/weights:0#vgg_16/conv4/conv4_2/weights/Assign#vgg_16/conv4/conv4_2/weights/read:0
vgg_16/conv4/conv4_2/biases:0"vgg_16/conv4/conv4_2/biases/Assign"vgg_16/conv4/conv4_2/biases/read:0
vgg_16/conv4/conv4_3/weights:0#vgg_16/conv4/conv4_3/weights/Assign#vgg_16/conv4/conv4_3/weights/read:0
vgg_16/conv4/conv4_3/biases:0"vgg_16/conv4/conv4_3/biases/Assign"vgg_16/conv4/conv4_3/biases/read:0
'vgg_16/conv4/conv4_1/weights/Momentum:0,vgg_16/conv4/conv4_1/weights/Momentum/Assign,vgg_16/conv4/conv4_1/weights/Momentum/read:0
&vgg_16/conv4/conv4_1/biases/Momentum:0+vgg_16/conv4/conv4_1/biases/Momentum/Assign+vgg_16/conv4/conv4_1/biases/Momentum/read:0
'vgg_16/conv4/conv4_2/weights/Momentum:0,vgg_16/conv4/conv4_2/weights/Momentum/Assign,vgg_16/conv4/conv4_2/weights/Momentum/read:0
&vgg_16/conv4/conv4_2/biases/Momentum:0+vgg_16/conv4/conv4_2/biases/Momentum/Assign+vgg_16/conv4/conv4_2/biases/Momentum/read:0
'vgg_16/conv4/conv4_3/weights/Momentum:0,vgg_16/conv4/conv4_3/weights/Momentum/Assign,vgg_16/conv4/conv4_3/weights/Momentum/read:0
&vgg_16/conv4/conv4_3/biases/Momentum:0+vgg_16/conv4/conv4_3/biases/Momentum/Assign+vgg_16/conv4/conv4_3/biases/Momentum/read:0
vgg_16/conv4/conv4_1/weights:0#vgg_16/conv4/conv4_1/weights/Assign#vgg_16/conv4/conv4_1/weights/read:0
vgg_16/conv4/conv4_1/biases:0"vgg_16/conv4/conv4_1/biases/Assign"vgg_16/conv4/conv4_1/biases/read:0
vgg_16/conv4/conv4_2/weights:0#vgg_16/conv4/conv4_2/weights/Assign#vgg_16/conv4/conv4_2/weights/read:0
vgg_16/conv4/conv4_2/biases:0"vgg_16/conv4/conv4_2/biases/Assign"vgg_16/conv4/conv4_2/biases/read:0
vgg_16/conv4/conv4_3/weights:0#vgg_16/conv4/conv4_3/weights/Assign#vgg_16/conv4/conv4_3/weights/read:0
vgg_16/conv4/conv4_3/biases:0"vgg_16/conv4/conv4_3/biases/Assign"vgg_16/conv4/conv4_3/biases/read:0
$TRAIN/vgg_16/conv4/conv4_1/weights:0
#TRAIN/vgg_16/conv4/conv4_1/biases:0
$TRAIN/vgg_16/conv4/conv4_2/weights:0
#TRAIN/vgg_16/conv4/conv4_2/biases:0
$TRAIN/vgg_16/conv4/conv4_3/weights:0
#TRAIN/vgg_16/conv4/conv4_3/biases:0
vgg_16/conv4/conv4_1/weights:0
vgg_16/conv4/conv4_1/biases:0
vgg_16/conv4/conv4_2/weights:0
vgg_16/conv4/conv4_2/biases:0
vgg_16/conv4/conv4_3/weights:0
vgg_16/conv4/conv4_3/biases:0

Unluckily, you have to manually fix all the variable scope names in https://github.com/tuan3w/visual_search/blob/master/visual_search/lib/nets/vgg16.py to match the names above.
For example:

 if cfg.TRAIN.BIAS_DECAY:
      bias = tf.get_variable("bias", initializer=phb, dtype=tf.float32, trainable=trainable)
    else:
      bias = tf.get_variable("bias", initializer=phb, regularizer=tf.no_regularizer, dtype=tf.float32, trainable=trainable)

should change to:

 if cfg.TRAIN.BIAS_DECAY:
      bias = tf.get_variable("biases", initializer=phb, dtype=tf.float32, trainable=trainable)
    else:
      bias = tf.get_variable("biases", initializer=phb, regularizer=tf.no_regularizer, dtype=tf.float32, trainable=trainable)

Can anyone here share the old pretrained file?

After manually changing the source code, I still get an error, as below:
NotFoundError (see above for traceback): Key vgg16_default/conv3_2/weights not found in checkpoint

Do you happen to have the old pretrained file to share?

Hi @zjsjack,
I used the pre-trained version provided by @endernewton at the time I wrote this, and I don't have a copy of it on my machine now. If you can, try to adapt the code to the new version of tf-faster-rcnn. I used fc7 as the image feature.
I don't have much time right now, so I can't fix the problem for you.
If you want to build your own image search, you should consider this repo.
Thanks

Hi @zjsjack,
You can get the pre-trained model here.
Thanks

Hi, thanks for sharing the model. But one of the .tgz files cannot be decompressed:
ubuntu@ip-172-31-18-140:/mnt/train2014$ tar xvf faster_rcnn_models.tgz
faster_rcnn_models/
faster_rcnn_models/coco_2014_train+coco_2014_valminusminival/
faster_rcnn_models/coco_2014_train+coco_2014_valminusminival/default/
faster_rcnn_models/coco_2014_train+coco_2014_valminusminival/default/vgg16_faster_rcnn_iter_490000.ckpt.index
faster_rcnn_models/coco_2014_train+coco_2014_valminusminival/default/vgg16_faster_rcnn_iter_490000.pkl
faster_rcnn_models/coco_2014_train+coco_2014_valminusminival/default/vgg16_faster_rcnn_iter_490000.ckpt.meta
faster_rcnn_models/coco_2014_train+coco_2014_valminusminival/default/vgg16_faster_rcnn_iter_490000.ckpt.data-00000-of-00001

gzip: stdin: invalid compressed data--format violated
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

And with this partial data, I got the new error message below:
Loading model check point from /mnt/train2014/faster_rcnn_models/coco_2014_train+coco_2014_valminusminival/default/vgg16_faster_rcnn_iter_490000.ckpt
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
W tensorflow/core/framework/op_kernel.cc:975] Data loss: Checksum does not match: stored 1475381895 vs. calculated on the restored bytes 4182060358
W tensorflow/core/framework/op_kernel.cc:975] Data loss: Checksum does not match: stored 3119027912 vs. calculated on the restored bytes 4182060358
W tensorflow/core/framework/op_kernel.cc:975] Data loss: Checksum does not match: stored 4197277905 vs. calculated on the restored bytes 2076928609
W tensorflow/core/framework/op_kernel.cc:975] Data loss: Checksum does not match: stored 2777681086 vs. calculated on the restored bytes 3183351207
W tensorflow/core/framework/op_kernel.cc:975] Data loss: Checksum does not match: stored 3304795247 vs. calculated on the restored bytes 2076928609
W tensorflow/core/framework/op_kernel.cc:975] Out of range: Read less bytes than requested
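As a quick sanity check before retrying the restore, gzip can confirm whether the archive itself is truncated (a sketch; the filename is the one from the tar command above):

$ gzip -t faster_rcnn_models.tgz   # reports an error if the download is incomplete or corrupt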

You can get it now.

Hi, thanks for sharing the model. But I cannot find cython_nms:
[root@134 visual_search]# python index_es.py --weight $WEIGHT_PATH --model_path $MODEL_PATH --input $INPUT
Traceback (most recent call last):
File "index_es.py", line 14, in
from extractor import Extractor
File "/usr/local/visual_search_master/visual_search/extractor.py", line 2, in
from model.test import extract_regions_and_feats
File "/usr/local/visual_search_master/visual_search/lib/model/test.py", line 14, in
from utils.cython_nms import nms, nms_new
ImportError: No module named cython_nms

Hi @xiesibo,
You have to build the cython module first:

$ cd visual_search/visual_search/lib/
$ make

Hi @tuan3w,
Can I run it on a platform without a GPU? I get an error when I run make:
[root@134 lib]# sudo make
python setup.py build_ext --inplace
running build_ext
skipping 'utils/bbox.c' Cython extension (up-to-date)
skipping 'utils/nms.c' Cython extension (up-to-date)
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
skipping 'nms/gpu_nms.cpp' Cython extension (up-to-date)
building 'nms.gpu_nms' extension
{'gcc': ['-Wno-unused-function'], 'nvcc': ['-arch=sm_52', '--ptxas-options=-v', '-c', '--compiler-options', "'-fPIC'"]}
/usr/local/cuda/bin/nvcc -I/usr/lib64/python2.7/site-packages/numpy/core/include -I/usr/local/cuda/include -I/usr/include/python2.7 -c nms/nms_kernel.cu -o build/temp.linux-x86_64-2.7/nms/nms_kernel.o -arch=sm_52 --ptxas-options=-v -c --compiler-options '-fPIC'
nvcc fatal : Value 'sm_52' is not defined for option 'gpu-architecture'
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

This error only means that the GPU extension cannot be built, so it should still work fine on CPU; the author claims as much as well.
Can you check whether this works:

$ cd visual_search/visual_search/lib/
$ python -c 'from utils.cython_nms import nms, nms_new'

If no exception is thrown, it's OK.
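If you ever do need the GPU build, the nvcc failure above usually means the -arch flag in lib/setup.py doesn't match your CUDA toolkit or GPU. A sketch of the workaround (sm_37 matches the Tesla K80 mentioned earlier in this thread; pick the value for your own card):

$ cd visual_search/visual_search/lib/
$ sed -i 's/sm_52/sm_37/' setup.py   # point nvcc at your GPU's compute capability
$ make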