blendmvs high_res

Question

Opened this issue 2 years ago · 9 comments

Hello, the download link for the blendmvs high_res dataset doesn't seem to work. Is there another way to download it?

Answer 1 · 2023-03-28T12:03:04.000Z

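Other users have reported this problem before; it seems the download only succeeds after several attempts... We don't know exactly what is going on either.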
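Since repeated attempts reportedly get through eventually, the retrying can be automated. The following is only a minimal sketch using the Python standard library; `retry`, `download`, and `DATASET_URL` are hypothetical names for illustration, and the real link has to be taken from the dataset page.

```python
import time
import urllib.request


def retry(action, attempts=5, delay_s=2.0, sleep=time.sleep):
    """Call action() until it succeeds, waiting longer after each failure.

    Assumes attempts >= 1.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as err:  # urllib's URLError is a subclass of OSError
            last_error = err
            if attempt < attempts:
                sleep(delay_s * attempt)  # linear backoff between tries
    raise last_error


def download(url, dest):
    """Fetch url into the local file dest and return dest."""
    urllib.request.urlretrieve(url, dest)
    return dest


# Usage (DATASET_URL is a placeholder, not the real link):
# retry(lambda: download(DATASET_URL, "high_res.zip"))
```

The `sleep` parameter exists only so the waiting can be swapped out when trying the helper without real delays.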

Answer 2 · 2023-03-29T08:33:19.000Z

Thanks for your reply.
When I train MVSFormer (Twins-based), training starts normally and the loss iterations run, but when I train MVSFormer (frozen DINO-based), I get this error:
File "/home/aszitao/anaconda3/envs/mvsformer/lib/python3.7/site-packages/torch/cuda/amp/grad_scaler.py", line 137, in _check_scale_growth_tracker
assert self._scale is not None, "Attempted {} but _scale is None. ".format(funcname) + fix
AssertionError: Attempted step but _scale is None. This may indicate your script did not use scaler.scale(loss or outputs) earlier in the iteration.
How should I fix this?
My environment is python 3.7, pytorch1.9.0+cu111, 3090*1. To train, I set n_gpu and batchsize to 1, changed CUDA_VISIBLE_DEVICES=0,1 to 0, and replaced from torch._six import container_abcs with import collections.abc as container_abcs (to fix the error "cannot import name 'container_abcs' from 'torch._six'").
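The import change described above can also be written as a fallback, so the same file works whether or not torch._six still provides container_abcs. This is a small sketch of that idea, not the repository's own code:

```python
# Prefer the legacy location; fall back to the standard library when
# torch._six is missing or no longer exports container_abcs.
try:
    from torch._six import container_abcs
except ImportError:
    import collections.abc as container_abcs

# Either way, the name exposes the usual abstract base classes.
print(isinstance([1, 2, 3], container_abcs.Sequence))
print(isinstance({"a": 1}, container_abcs.Mapping))
```

Catching ImportError covers both the "cannot import name" case and the case where the module itself is absent.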

Answer 3 · 2023-03-29T08:37:25.000Z

Is DDP enabled? If so, try turning it off. In my experience, this happens when the loss is None or when some training steps are skipped.
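To illustrate the point about a None loss or skipped steps: the assertion fires when scaler.step() runs in an iteration where scaler.scale(loss) was never called. The sketch below shows the usual call order and skips the whole update when there is no loss. `amp_train_step` is a hypothetical helper, not MVSFormer's trainer; `scaler` is assumed to behave like torch.cuda.amp.GradScaler and `optimizer` like a torch.optim optimizer.

```python
def amp_train_step(loss, optimizer, scaler):
    """Run one mixed-precision update; return False if the batch was skipped.

    loss      -- tensor-like object with .backward(), or None for a bad batch
    optimizer -- object with zero_grad(), e.g. a torch.optim optimizer
    scaler    -- object with scale(), step(), update(), e.g. GradScaler
    """
    if loss is None:
        # Nothing was scaled in this iteration, so step()/update() must not run.
        return False
    optimizer.zero_grad()
    scaler.scale(loss).backward()  # scale first, then backpropagate
    scaler.step(optimizer)         # unscales gradients and steps the optimizer
    scaler.update()                # adjusts the scale for the next iteration
    return True
```

Under DDP, skipping the backward pass on only some ranks can leave the other ranks waiting, which may be one reason turning DDP off changes the behaviour.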

Answer 4 · 2023-05-19T14:26:34.000Z

Is there an evaluation script for blendedmvs?

Answer 5 · 2023-06-21T02:22:00.000Z

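You can try changing the backend from nccl to gloo, or, if your machine has only one GPU, just turn off DDP. My machine has multiple GPUs and I use the gloo backend; it raises an EOFError at the end, but that doesn't affect the generated model.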
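Both suggestions can be combined in one small setup function. This is a sketch under assumptions: `maybe_init_ddp` is a hypothetical helper, the address and port are placeholder defaults, and the real train.py may organise this differently.

```python
import os


def maybe_init_ddp(n_gpu, backend="gloo", rank=0):
    """Initialise torch.distributed only when more than one GPU is requested.

    Returns True if a process group was created, False otherwise.
    """
    if n_gpu <= 1:
        return False  # single GPU: plain training, no process group needed

    import torch.distributed as dist  # imported lazily; unused on one GPU

    # The default env:// rendezvous reads these two variables.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend=backend, rank=rank, world_size=n_gpu)
    return True
```

With n_gpu set to 1 the function returns before torch.distributed is touched, which corresponds to "turn off DDP"; with several GPUs the backend argument switches between "nccl" and "gloo".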

Answer 6 · 2023-06-30T02:20:40.000Z

Hello, my environment is the same as yours, and I am training on 3090*4. When training MVSFormer (Twins-based), however, I get this warning:
/data/hkk/anaconda3/envs/mvs/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
Then, after iterating normally up to 1200/6775, the program suddenly aborts with an error, showing:

Answer 7 · 2023-07-05T02:00:43.000Z

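Hello, I am training on 3090*4 with DDP enabled. Following your suggestion, I changed the nccl backend on line 30 of train.py to gloo, but I still get the same error. What else should I change?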

Answer 8 · 2023-07-05T02:08:43.000Z

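Judging from your error, it looks like the dataloader failed while reading the dtu dataset; your dataset probably wasn't downloaded completely. Also, your pytorch1.1 is a bit old.
If nothing else works, I'd suggest trying a different environment. I use python3.8, pytorch1.11+cu113, numpy1.23.1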
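One way to check for an incomplete download before extracting is to let the standard library verify the archive. The sketch below assumes the dataset was downloaded as a .zip file; `check_archive` is a hypothetical helper name.

```python
import zipfile


def check_archive(archive_path):
    """Return None if the zip is intact, else a short description of the problem."""
    try:
        with zipfile.ZipFile(archive_path) as archive:
            # testzip() re-reads every member and verifies its CRC;
            # it returns the first bad file name, or None if all pass.
            bad_member = archive.testzip()
    except zipfile.BadZipFile as err:
        # Typical for truncated downloads: the central directory is missing.
        return f"unreadable archive: {err}"
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return None
```

A None result only means the archive matches its own checksums. Images damaged during extraction would still need to be checked on disk.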

Answer 9 · 2023-07-05T06:53:50.000Z

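Hello, my environment is python3.7+pytorch1.9.0+cu111+numpy1.20.1. The dataset problem is solved (three images were corrupted during extraction), but I then ran into some warnings when training MVSFormer (Twins-based):

Do you know what causes this? Will it affect the training results?
Thanks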
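If the warning is the lr_scheduler.step() one quoted earlier in this thread, one common cause with mixed precision is that the gradient scaler skips optimizer.step() when it finds inf/NaN gradients, usually in the first iterations, while the scheduler still steps. Whether that is what happens in MVSFormer is an assumption. A sketch of a common workaround follows; `step_with_scheduler` is a hypothetical helper and `scaler` is assumed to behave like torch.cuda.amp.GradScaler.

```python
def step_with_scheduler(optimizer, scaler, scheduler):
    """Step the optimizer through the scaler, then the scheduler if it ran.

    Assumes scaler.scale(loss).backward() was already called this iteration.
    Returns True when the optimizer step was applied.
    """
    scale_before = scaler.get_scale()
    scaler.step(optimizer)  # may skip optimizer.step() on inf/NaN gradients
    scaler.update()         # lowers the scale after a skipped step
    stepped = scaler.get_scale() >= scale_before
    if stepped:
        scheduler.step()    # only advance the schedule after a real update
    return stepped
```

In that scenario the warning itself only means the first learning-rate value is skipped, as its text says, so it is usually harmless for the final result.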