airsplay/py-bottom-up-attention

ValueError cannot reshape array into shape when loading in generated COCO features

rubencart opened this issue · 0 comments

Instructions To Reproduce the Issue

We extracted features of the COCO train2017 split with the detectron2_mscoco_proposal_maxnms.py script. This completed without errors.

Afterwards we tried to read the features back from disk with the following function:

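import base64
import csv
import sys

import numpy as np
from tqdm import tqdm

# The base64-encoded feature fields exceed csv's default field size limit.
csv.field_size_limit(sys.maxsize)

AIRSPLAY_FIELDNAMES = ['img_id', 'img_w', 'img_h', 'objects_id', 'objects_conf', 'attrs_id', 'attrs_conf',
                       'num_boxes', 'boxes', 'features']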

def read_airsplay_tsv(infile, year='2017'):
    data = {}
    with open(infile, "r") as tsv_in_file:
        reader = csv.DictReader(tsv_in_file, delimiter='\t', fieldnames=AIRSPLAY_FIELDNAMES)
        for item in tqdm(reader):
            data_item = {}
            data_item['image_id'] = int(item['img_id']) if year == '2017' else int(item['img_id'].split('_')[-1])
            data_item['image_h'] = int(item['img_h'])
            data_item['image_w'] = int(item['img_w'])
            data_item['num_boxes'] = int(item['num_boxes'])
            for field, dtype in [('boxes', np.float32),
                                 ('features', np.float32),
                                 ('objects_id', np.int64),
                                 ('objects_conf', np.float32)]:
                feature = np.frombuffer(base64.b64decode(item[field]), dtype=dtype)
                feature = feature.reshape((data_item['num_boxes'], -1))
                data_item[field] = feature
            data[data_item['image_id']] = data_item
    return data

This gives the following error in iteration 12663:

Traceback (most recent call last):
  File "/cw/liir/NoCsBack/testliir/rubenc/miniconda3/envs/vpcfg_env/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-067698b32ccd>", line 1, in <module>
    data2 = read_airsplay_tsv('/cw/liir/NoCsBack/testliir/rubenc/py-bottom-up-attention/data/mscoco_imgfeat/train2017_d2obj36_batch_2.tsv', year='2017')
  File "/cw/liir/NoCsBack/testliir/rubenc/vpcfg-dev/latent-structure-tools/common_tools/tools/read_tsv.py", line 58, in read_airsplay_tsv
    feature = feature.reshape((data_item['num_boxes'], -1))
ValueError: cannot reshape array of size 73728 into shape (35,newaxis)

The problem seems to be that only 35 boxes, rather than 36, were extracted for this image, while the saved features still contain 36 rows. The 'num_boxes' field equals 35 and the 'boxes' field can be reshaped to (35, 4), but the 'features' field cannot be reshaped to (35, -1). It can, however, be reshaped to (36, -1).
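The arithmetic behind this observation can be spelled out as follows. This is only a sketch: the feature dimension of 2048 is an assumption inferred from 73728 / 36, not a value read from the extraction script, and `rows_for` is a hypothetical helper name.

```python
FEATURE_DIM = 2048  # assumption: inferred from 73728 / 36, not confirmed by the script


def rows_for(n_values, dim=FEATURE_DIM):
    """Return how many rows a flat array of n_values holds, or None if it does not divide evenly."""
    rows, remainder = divmod(n_values, dim)
    return rows if remainder == 0 else None


print(rows_for(73728))  # 36: the flat array holds 36 rows of 2048 floats
print(73728 % 35)       # 18: non-zero, so no shape (35, -1) exists
```

So the reshape fails because 73728 is not divisible by 35, whatever the per-box feature dimension is.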

Reading in the generated features for the val2017, val2014 and train2014 splits in the same way does work without errors.

How can we solve this? Specifically, how can we make the detectron2_mscoco_proposal_maxnms.py script correctly save 36 boxes per image?
If we reshape the features to (36, -1) instead of (35, -1) and use only the first 35 rows, will those rows correctly correspond to the 35 saved boxes?

EDIT: The same problem occurs after running the script with MIN_BOXES = 10 and MAX_BOXES = 100.
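To see how many records are affected, the file can be scanned without reshaping, comparing 'num_boxes' against the row counts implied by the 'boxes' and 'features' buffers. The sketch below is a diagnostic only, not a fix, and it does not tell us whether the first 35 feature rows belong to the 35 saved boxes. `find_mismatched_records` is a hypothetical helper, and the default `feat_dim=2048` is an assumption inferred from the error message.

```python
import base64
import csv
import sys

import numpy as np

# The base64-encoded feature fields exceed csv's default field size limit.
csv.field_size_limit(sys.maxsize)

FIELDNAMES = ['img_id', 'img_w', 'img_h', 'objects_id', 'objects_conf',
              'attrs_id', 'attrs_conf', 'num_boxes', 'boxes', 'features']


def find_mismatched_records(infile, feat_dim=2048):
    """Return (img_id, num_boxes, box_rows, feature_rows) for every inconsistent record.

    feat_dim is an assumption (2048 is inferred from 73728 / 36).
    Row counts are floats so that non-integral counts show up as well.
    """
    mismatches = []
    with open(infile, "r") as tsv_in_file:
        reader = csv.DictReader(tsv_in_file, delimiter='\t', fieldnames=FIELDNAMES)
        for item in reader:
            num_boxes = int(item['num_boxes'])
            boxes = np.frombuffer(base64.b64decode(item['boxes']), dtype=np.float32)
            feats = np.frombuffer(base64.b64decode(item['features']), dtype=np.float32)
            box_rows = boxes.size / 4            # 4 coordinates per box
            feat_rows = feats.size / feat_dim    # feat_dim floats per box
            if box_rows != num_boxes or feat_rows != num_boxes:
                mismatches.append((item['img_id'], num_boxes, box_rows, feat_rows))
    return mismatches
```

Calling `find_mismatched_records` on the train2017 TSV file should list the record from the traceback, along with any others that have the same inconsistency.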

Environment

sys.platform linux
Python 3.7.0 (default, Oct 9 2018, 10:31:47) [GCC 7.3.0]
Numpy 1.21.2
Detectron2 Compiler GCC 7.5
Detectron2 CUDA Compiler 11.4
DETECTRON2_ENV_MODULE
PyTorch 1.4.0
PyTorch Debug Build False
torchvision 0.5.0
CUDA available True
GPU 0 NVIDIA GeForce RTX 2080 Ti
GPU 1,2 NVIDIA TITAN Xp
GPU 3 NVIDIA GeForce GTX 1080 Ti
CUDA_HOME /usr/local/cuda
NVCC Build cuda_11.4.r11.4/compiler.30300941_0
Pillow 8.3.2
cv2 4.5.3


PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,