I am learning to use fourier-deeponet-fwi. I generated the fvb data with data_gen_loc, changed num_dataset in data.py to 8, and then ran train.py for training, but it raises an error.
The code was modified as described above. Environment: Windows, torch 2.0.0+cu118, numpy 1.26.4, DeepXDE 1.12.2.
I compared the data1 I generated with the corresponding data1 downloaded from OpenFWI and they match, so I used the data generated by data_gen_loc.
I also changed the file paths to Windows-style paths.
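To make the setup concrete, here is a hypothetical illustration of the two changes described above; only num_dataset is an actual name from data.py, the directory variable and both values are placeholders rather than the repo's real code:

num_dataset = 8  # number of data files to load, changed from the repo default
data_dir = r"I:\ai4Science\data\fvb"  # Windows-style path instead of a POSIX one (illustrative value)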
The error message is as follows:
Using backend: pytorch
Other supported backends: tensorflow.compat.v1, tensorflow, jax, paddle.
paddle supports more examples now and is recommended.
X_train_branch shape: (4000, 5, 1000, 70), X_train_trunk shape: (4000, 5)
X_test_branch shape: (50, 5, 1000, 70), X_test_trunk shape: (50, 5)
y_train shape: (4000, 1, 70, 70), y_test shape: (50, 1, 70, 70)
Compiling model...
'compile' took 0.000319 s
Training model...
Step Train loss Test loss Test metric
0 [5.16e-01] [4.72e-01] [4.72e-01, 5.62e-01]
Traceback (most recent call last):
File "i:\ai4Science\models\fourier-deeponet-fwi-main\src\train.py", line 72, in
main(dataset='fvb', task='loc')
File "i:\ai4Science\models\fourier-deeponet-fwi-main\src\train.py", line 65, in main
losshistory, train_state = model.train(iterations=100000, batch_size=32, display_every=100, callbacks=[checker])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\deepxde\utils\internal.py", line 22, in wrapper
result = f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\deepxde\model.py", line 657, in train
self._train_sgd(iterations, display_every)
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\deepxde\model.py", line 675, in _train_sgd
self._train_step(
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\deepxde\model.py", line 567, in _train_step
self.train_step(inputs, targets, auxiliary_vars)
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\deepxde\model.py", line 364, in train_step
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\deepxde\model.py", line 364, in train_step
self.opt.step(closure)
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\torch\optim\lr_scheduler.py", line 69, in wrapper
return wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\torch\optim\optimizer.py", line 280, in wrapper
out = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\torch\optim\optimizer.py", line 33, in _use_grad
ret = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\torch\optim\adam.py", line 141, in step
adam(
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\torch\optim\adam.py", line 281, in adam
func(params,
File "D:\Users\23905\anaconda3\envs\pytorch\Lib\site-packages\torch\optim\adam.py", line 442, in _multi_tensor_adam
device_grads = torch._foreach_add(device_grads, device_params, alpha=weight_decay)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The size of tensor a (2) must match the size of tensor b (20) at non-singleton dimension 4
Since the error is raised inside the library code right after training starts, I would like to ask how to resolve it. Many thanks.
After changing num_dataset to 48, training still fails with the same error. I wonder whether this is related to the library versions.
Please ask the question at https://github.com/lu-group/fourier-deeponet-fwi
Sorry, it was due to a version mismatch between different libraries; it is solved now, thanks.
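For anyone hitting the same RuntimeError, a quick first check is to print the installed versions and compare them with whatever combination the repo was tested against (a minimal sketch; the exact compatible version set is not stated in this thread):

import deepxde
import numpy
import torch

# Print the versions that are active in the current environment
print("torch:", torch.__version__)
print("numpy:", numpy.__version__)
print("deepxde:", deepxde.__version__)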