Nanne/pytorch-NetVlad

Running "python main.py --mode=train --arch=vgg16 --pooling=netvlad --num_clusters=64" leads to this error. Help?

taiyipan opened this issue · 3 comments

python main.py --mode=train --arch=vgg16 --pooling=netvlad --num_clusters=64
Namespace(arch='vgg16', batchSize=4, cacheBatchSize=24, cachePath='/home/taiyi/repository2/event_vpr_methods/test_netvlad/cache/', cacheRefreshRate=1000, ckpt='latest', dataPath='/home/taiyi/repository2/event_vpr_methods/test_netvlad/data/', dataset='pittsburgh', evalEvery=1, fromscratch=False, lr=0.0001, lrGamma=0.5, lrStep=5, margin=0.1, mode='train', momentum=0.9, nEpochs=30, nGPU=1, nocuda=False, num_clusters=64, optim='SGD', patience=10, pooling='netvlad', resume='', runsPath='/home/taiyi/repository2/event_vpr_methods/test_netvlad/runs/', savePath='checkpoints', seed=123, split='val', start_epoch=0, threads=16, vladv2=False, weightDecay=0.001)
===> Loading dataset(s)
====> Training query set: 7320
===> Evaluating on val set, query count: 7608
===> Building model
===> Training model
===> Saving state to: /home/taiyi/repository2/event_vpr_methods/test_netvlad/runs/May12_23-37-02_vgg16_netvlad
/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:139: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:152: UserWarning: The epoch parameter in `scheduler.step()` was not necessary and is being deprecated where possible. Please use `scheduler.step()` to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https://github.com/pytorch/pytorch/issues/new/choose.
  warnings.warn(EPOCH_DEPRECATION_WARNING, UserWarning)
====> Building Cache
Allocated: 60039168
/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/cuda/memory.py:416: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved
  warnings.warn(
Cached: 5819596800
/home/taiyi/event_vpr/lib/python3.8/site-packages/sklearn/neighbors/_base.py:815: UserWarning: Loky-backed parallel loops cannot be called in a multiprocessing, setting n_jobs=1
  n_jobs = effective_n_jobs(self.n_jobs)
Traceback (most recent call last):
  File "main.py", line 511, in <module>
    train(epoch)
  File "main.py", line 113, in train
    for iteration, (query, positives, negatives, negCounts, indices) in enumerate(training_data_loader, startIter):
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 298, in __getitem__
    return self.dataset[self.indices[idx]]
  File "/mnt/e6c9de53-1f7f-424a-a71d-7a8cf8e2e0ee/event_vpr_methods/test_netvlad/pittsburgh.py", line 230, in __getitem__
    negFeat = h5feat[negSample.tolist()]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 841, in __getitem__
    selection = sel.select(self.shape, args, dataset=self)
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/h5py/_hl/selections.py", line 82, in select
    return selector.make_selection(args)
  File "h5py/_selector.pyx", line 282, in h5py._selector.Selector.make_selection
  File "h5py/_selector.pyx", line 197, in h5py._selector.Selector.apply_args
TypeError: Indexing arrays must have integer dtypes

Hi there, I think the issue may be in the __getitem__ method of the QueryDatasetFromStruct class.

In that method, h5feat is indexed by negSample.tolist():

negFeat = h5feat[negSample.tolist()]

However, negSample.tolist() is a list of floats, and h5py only accepts integer indices (hence "Indexing arrays must have integer dtypes").

Rewriting this line to:
negFeat = h5feat[list(map(int, negSample))]

appears to solve the issue for me.
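If you want something a bit more defensive than the one-liner, here is a minimal sketch of the same conversion that also refuses to silently truncate a value like 3.7. The helper name to_int_indices is my own, not something from the repo, and I am assuming the indices are whole numbers that merely carry a float dtype.

```python
def to_int_indices(values):
    """Convert whole-number floats (e.g. 3.0) to plain Python ints.

    Hypothetical helper, not part of pytorch-NetVlad. It accepts any
    iterable of numbers, including a 1-D NumPy array.
    """
    indices = []
    for value in values:
        as_int = int(value)
        # int() truncates, so 3.7 would become 3; reject that case
        # instead of quietly indexing the wrong row.
        if as_int != value:
            raise ValueError(f"index {value!r} is not a whole number")
        indices.append(as_int)
    return indices


# Floats like the ones negSample.tolist() returns in this issue.
float_indices = [3.0, 10.0, 42.0]
int_indices = to_int_indices(float_indices)
print(int_indices)
```

In pittsburgh.py the call would then look like negFeat = h5feat[to_int_indices(negSample)], which should behave the same as list(map(int, negSample)) for valid input.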

@oeg1n18 Thanks! I changed that one line of code and the training loop finally runs.