NVlabs/FasterViT

Calling deepcopy on the model raises an error.

z1069614715 opened this issue · 12 comments

Hello, I am trying to run train.py with --model-ema set, and I get the following error:
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment
How can I solve this problem? Looking forward to your reply!

Hi @z1069614715, would you please let us know your timm and torchvision versions?

If possible, it would be great to try with timm 0.6.12 to confirm whether the issue still exists.

We did not face this issue before.

Yes, my timm version is 0.6.12.

It doesn't seem to have anything to do with timm.

Thanks for confirming. Would you please provide the log as well?

Traceback (most recent call last):
File "/home/hjj/Desktop/github_code/yolov5-master/models/FasterViT.py", line 1116, in <module>
utils.ModelEmaV2(model)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/site-packages/timm/utils/model_ema.py", line 108, in __init__
self.module = deepcopy(model)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/copy.py", line 153, in deepcopy
y = copier(memo)
File "/home/hjj/anaconda3/envs/torch_newest_py38/lib/python3.8/site-packages/torch/_tensor.py", line 102, in __deepcopy__
raise RuntimeError(
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment
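
The traceback ends inside a chain of `_deepcopy_dict` calls, which suggests the offending tensor is stored as an attribute somewhere deep in the module tree. One way to find out where is to walk the object graph and attempt `deepcopy` on each leaf value. The following is a minimal, standard-library-only sketch: `find_deepcopy_failures`, `FakeNonLeaf`, `Block`, and `Net` are hypothetical names introduced here for illustration, and `FakeNonLeaf` merely imitates a non-leaf tensor by raising the same message. It has not been run against FasterViT.

```python
import copy
import types


def find_deepcopy_failures(obj, path="model", seen=None):
    """Return (path, error) pairs for values under `obj` that cannot be deep-copied."""
    seen = set() if seen is None else seen
    if id(obj) in seen:  # skip objects already visited (shared references, cycles)
        return []
    seen.add(id(obj))

    if isinstance(obj, dict):
        children = [(f"{path}[{k!r}]", v) for k, v in obj.items()]
    elif isinstance(obj, (list, tuple)):
        children = [(f"{path}[{i}]", v) for i, v in enumerate(obj)]
    elif (
        hasattr(obj, "__dict__")
        and not isinstance(obj, (type, types.ModuleType))
        and not hasattr(type(obj), "__deepcopy__")
    ):
        # Plain objects: descend into their attributes.
        children = [(f"{path}.{k}", v) for k, v in vars(obj).items()]
    else:
        # Anything with its own __deepcopy__ (or no attributes) is treated as a leaf.
        children = []

    if not children:
        try:
            copy.deepcopy(obj)
        except Exception as exc:
            return [(path, f"{type(exc).__name__}: {exc}")]
        return []

    failures = []
    for child_path, child in children:
        failures.extend(find_deepcopy_failures(child, child_path, seen))
    return failures


class FakeNonLeaf:
    """Hypothetical stand-in for a tensor that is not a graph leaf."""

    def __deepcopy__(self, memo):
        raise RuntimeError(
            "Only Tensors created explicitly by the user (graph leaves) "
            "support the deepcopy protocol at the moment"
        )


class Block:
    """Hypothetical stand-in for a submodule."""

    def __init__(self):
        self.scale = 1.0


class Net:
    """Hypothetical stand-in for the model."""

    def __init__(self):
        self.blocks = {"0": Block(), "1": Block()}


net = Net()
# Simulate state that some earlier call left behind on one submodule.
net.blocks["1"].cached = FakeNonLeaf()

failures = find_deepcopy_failures(net)
for failed_path, error in failures:
    print(failed_path, "->", error)
```

On a real model, passing the `nn.Module` instead of `net` should in principle work the same way, since the walker relies only on `vars()` and `copy.deepcopy`. Checking `is_leaf` on the tensor at the reported path would then confirm the diagnosis.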

I noticed that train.sh does not use --model-ema. You can turn this option on and see the error.

Thanks @z1069614715. I am looking at this issue right now and will get back to you.

thanks!

Hi @z1069614715, I have not yet been able to reproduce this issue on my end. However, it may make sense to update the training script to support the latest timm version to avoid complications such as this one.

I will push an MR for this very soon.

You can try:
# Run in the file where faster_vit_0 is defined (FasterViT.py in the traceback above).
from timm import utils
model = faster_vit_0()
utils.ModelEmaV2(model)

Maha59 commented

Hi, I have the same issue. Has anyone found a solution?

Hi @z1069614715 and @Maha59

Thanks for raising this issue. I was finally able to reproduce it. With certain PyTorch releases (mostly older ones), it is triggered by the following call:

macs, params = get_model_complexity_info(model, tuple([3, args.input_size[1], args.input_size[2]]),
                                                as_strings=False, print_per_layer_stat=False, verbose=False)

I am not sure what the root cause is (maybe a bug in some older PyTorch releases). However, removing the get_model_complexity_info call resolves the issue. Since this function is not a core utility and only reports model parameter/FLOPs stats, it can safely be removed from the script. These model stats are also available in the paper.
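
If the stats are still wanted, one possible alternative to deleting the call (an assumption, not something confirmed in this thread) is to run the profiling on a throwaway copy taken beforehand, so the model that EMA later copies is never touched. The sketch below uses stand-ins only: `Uncopyable`, `TinyModel`, and `profile_in_place` are hypothetical and simply mimic a profiler that leaves uncopyable state on the object it inspects. None of it is ptflops or timm code.

```python
import copy


class Uncopyable:
    """Hypothetical stand-in for state that refuses deepcopy (like a non-leaf tensor)."""

    def __deepcopy__(self, memo):
        raise RuntimeError("stand-in: this object does not support deepcopy")


class TinyModel:
    """Hypothetical stand-in for the real model."""

    def __init__(self):
        self.weights = [0.1, 0.2, 0.3]


def profile_in_place(model):
    """Hypothetical profiler that leaves bookkeeping state on the model it inspects."""
    model._profile_state = Uncopyable()
    return len(model.weights)  # pretend this is the parameter count


model = TinyModel()

# Profile a throwaway copy so the training model is never mutated.
probe = copy.deepcopy(model)
n_params = profile_in_place(probe)
del probe

# The copy that an EMA wrapper needs still succeeds on the untouched model.
ema_copy = copy.deepcopy(model)
print(n_params, ema_copy.weights)
```

Whether this helps in the real script depends on the profiling call actually being what leaves the uncopyable tensor behind, which the thread does not establish. Creating the EMA wrapper before the profiling call would be another ordering to try.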

You can also test it with the sample train script, to which --model-ema has been added.

I hope this addresses the problem!