VAE-GAN code raises an error when run as-is

Question

Opened this issue 10 months ago · 3 comments

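My environment is torch==1.8.1. The VAE example and the other examples run fine, and I have not hit any problems with them so far; only this example raises an error.

Running your code unmodified produces the following error: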

RuntimeError Traceback (most recent call last)
Cell In[9], line 204
202 output = D(recon_data)
203 errVAE = criterion(output, real_label)
--> 204 errVAE.backward()
205 D_G_z2 = output.mean().item()
206 optimizerVAE.step()

File d:\ProgramData\Anaconda3\envs\pt\lib\site-packages\torch\tensor.py:245, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
236 if has_torch_function_unary(self):
237 return handle_torch_function(
238 Tensor.backward,
239 (self,),
(...)
243 create_graph=create_graph,
244 inputs=inputs)
--> 245 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)

File d:\ProgramData\Anaconda3\envs\pt\lib\site-packages\torch\autograd\__init__.py:145, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
142 if retain_graph is None:
143 retain_graph = create_graph
--> 145 Variable._execution_engine.run_backward(
146 tensors, grad_tensors, retain_graph, create_graph, inputs,
147 allow_unreachable=True, accumulate_grad=True)

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 1, 4, 4]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
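The hint at the end of the message can be followed roughly as below. This is a minimal sketch on a toy tensor, not the VAE-GAN script from this issue: the names w and y are made up for illustration. With anomaly detection on, PyTorch additionally reports the forward-pass traceback of the operation whose backward failed, which is usually enough to locate the in-place modification.

```python
import torch

# Turn on anomaly detection globally (slows training; use it only while debugging).
torch.autograd.set_detect_anomaly(True)

w = torch.ones(3, requires_grad=True)  # hypothetical toy parameter
y = torch.exp(w)   # exp keeps its output around for the backward pass
y.add_(1)          # in-place edit bumps the version counter of that saved output

try:
    y.sum().backward()
    error_message = ""
except RuntimeError as err:
    # Same family of error as in the traceback above.
    error_message = str(err)

# Switch it off again once the offending operation has been found.
torch.autograd.set_detect_anomaly(False)
print(error_message)
```

In the real script, the same call would go once near the top of the training cell, before the loop starts.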

Answer 1 · 2023-12-29T11:03:24.000Z

I'm having the same problem

Answer 2 · 2023-12-29T14:30:43.000Z

I solved it by changing line 202 in the training loop.

output = D(recon_data.detach())
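Here, recon_data is the output of the VAE's decoder. Detaching it from the computation graph stops backpropagation at recon_data, so the backward pass no longer traverses the part of the graph that produced it, which avoids the error. As a side effect, errVAE then contributes no gradient to the VAE's parameters.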
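One plausible cause, and it is only an assumption because the training script is not quoted in this thread, is that recon_data was computed before an optimizer step updated the weights its graph had saved. The sketch below reproduces that pattern with small stand-in modules (decoder, D, opt_vae, z and has_grad are hypothetical names, not the repository's code) and compares the detach fix with recomputing the forward pass.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the VAE decoder and the discriminator.
decoder = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 4))
D = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
criterion = nn.BCELoss()
opt_vae = torch.optim.SGD(decoder.parameters(), lr=0.1)

def has_grad(module):
    """True if any parameter of `module` holds a non-zero gradient."""
    return any(p.grad is not None and p.grad.abs().sum().item() > 0
               for p in module.parameters())

z = torch.randn(8, 4)
real_label = torch.ones(8, 1)

recon_data = decoder(z)  # graph saves the decoder's current weights

# A first update (a stand-in for the reconstruction loss); step() edits weights in place.
opt_vae.zero_grad()
recon_data.pow(2).mean().backward(retain_graph=True)
opt_vae.step()

# Reusing the stale graph fails: the saved weights now have a newer version.
try:
    criterion(D(recon_data), real_label).backward()
    stale_graph_failed = False
except RuntimeError:
    stale_graph_failed = True

# Option A (the fix from this answer): backward stops at recon_data.
opt_vae.zero_grad()
criterion(D(recon_data.detach()), real_label).backward()
detached_has_vae_grad = has_grad(decoder)

# Option B: redo the forward pass so the graph matches the updated weights.
opt_vae.zero_grad()
criterion(D(decoder(z)), real_label).backward()
fresh_has_vae_grad = has_grad(decoder)

print(stale_graph_failed, detached_has_vae_grad, fresh_has_vae_grad)
```

If the adversarial loss is meant to train the VAE, Option B (or running every backward pass that shares recon_data before the first optimizer step) keeps that gradient, whereas Option A silences the error by dropping it.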

Answer 3 · 2024-01-02T06:23:06.000Z

I solved it by changing line 202 in the training loop.

output = D(recon_data.detach())
Here, recon_data is the output of the VAE's decoder, and detaching it from the computation graph is done to prevent gradients from being computed with respect to recon_data.

Thank you very much