Error - 'sciann.functionals' has no attribute 'mlp_functional'
I am getting the following error when I execute a PDE using sciann.
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)
PDE = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
Input In [63] in <cell line: 9>
PDE = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
File ~\Anaconda3\lib\site-packages\sciann\utils\math.py:1358 in grad
return _gdiff("Grad", f, *args, **kwargs)
File ~\Anaconda3\lib\site-packages\sciann\utils\math.py:1277 in _gdiff
assert is_functional(f), \
File ~\Anaconda3\lib\site-packages\sciann\utils\validations.py:25 in is_functional
if isinstance(f, (sciann.functionals.mlp_functional.MLPFunctional,
AttributeError: module 'sciann.functionals' has no attribute 'mlp_functional'
Please suggest how to solve this.
Hi @ssingh-ipa,
I reproduced your case with the following code, but did not receive any error message 🤔 .
Maybe you can give a bit more context of your problem.
import sciann as sn
act = "tanh"
x = sn.Variable('x')
tao = sn.Variable('tao')
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)
PDE = (sn.diff(C_n, tao)) - ((2*sn.diff(C_n,x) + x*sn.diff(C_n, x, order=2)))
Hi @linuswalter,
I ran this code snippet in Jupyter Notebook and it works without errors. However, after I switched to the new version of sciann and ran the same snippet in Spyder, I started getting this mlp_functional error. I have now switched back to the old version, and the code works again.
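Since the error only appeared after changing both the SciANN version and the IDE, it may help to check which installation the failing interpreter actually sees. The sketch below uses only the Python standard library. `describe_install` is a hypothetical helper name, and the thread does not confirm whether `sciann.functionals.mlp_functional` is expected to exist in every release, so treat the result as a hint rather than a diagnosis.

```python
import importlib.util
import sys
from importlib import metadata


def describe_install(package, submodule):
    """Report the installed version of `package` and whether `submodule` can be located."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # package is not installed in this interpreter

    try:
        # find_spec imports the parent packages, so a missing parent raises ImportError
        spec = importlib.util.find_spec(f"{package}.{submodule}")
    except ImportError:
        spec = None

    return {
        "interpreter": sys.executable,
        "package": package,
        "version": version,
        "submodule_found": spec is not None,
    }


# Run this once in Jupyter and once in Spyder and compare the two reports.
report = describe_install("sciann", "functionals.mlp_functional")
print(report)
```

If the two environments report different interpreters or versions, they are not using the same installation.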
I am now running into another issue related to the size of my dataset. Please help me with this one. Here is my code snippet:
import numpy as np
import sciann as sn
from sciann.utils.math import diff
act = "tanh"
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)
PDE_neg = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn) #norm_Jn is an array of size (26362,)
model_n = sn.SciModel([x, tao], [PDE_neg,IC,BC1,BC2], loss_func="mse", optimizer="adam")
tao_value_neg = Dsneg*Time/(Rneg**2) #tao_value_neg is also an array of size (26362,)
normalized_tao_neg = NormalizeData(tao_value_neg).to_numpy()
tao_input_neg = normalized_tao_neg.flatten()
x_input_n = np.linspace(0, 1, 200)
x_input_n_mesh, tao_input_n_mesh = np.meshgrid(
x_input_n,
tao_input_neg
)
x_in_n = np.reshape(x_input_n_mesh, (-1))
tao_in_n = np.reshape(tao_input_n_mesh, (-1))
h_n = model_n.train([x_in_n,tao_in_n],
4*['zero'],
learning_rate=0.001,
epochs=200,
stop_loss_value=1e-10,
reduce_lr_after=15,
stop_lr_value=1e-8,
verbose=1,
batch_size=256,
shuffle=True,
validation_data=None)
I get the following memory error:
MemoryError: Unable to allocate 1.01 TiB for an array with shape (5272400, 26362) and data type float64.
It is not clear in the documentation how to assign the inputs for training the model. Also, my input files are large files of experimental data. Please suggest how to solve this issue.
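The shape in the error message can be reproduced from the sizes given in the post. This suggests that the flattened mesh of 200 × 26362 points is being broadcast against the 26362-entry norm_Jn array used inside BC2, though the thread does not confirm this. A quick check of the arithmetic:

```python
# Sizes taken from the post above.
n_x = 200        # np.linspace(0, 1, 200)
n_tao = 26362    # length of tao_value_neg and of norm_Jn

n_points = n_x * n_tao            # rows after flattening the meshgrid
n_bytes = n_points * n_tao * 8    # float64 uses 8 bytes per entry

print(n_points)                       # 5272400
print(f"{n_bytes / 2**40:.2f} TiB")   # 1.01 TiB
```

Both numbers match the reported shape (5272400, 26362) and the 1.01 TiB allocation.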
@ssingh-ipa, I think you should prepare your target data differently.
In BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn) you could pass norm_Jn if it is a scalar value, but if you pass an array, it is not clear which target value is assigned at which point of tao for your Functional C_n.
Also, your input data x_input_n needs to be randomly sampled in your domain; don't pass the points to the NN in sorted order.
The best approach is to check the SciANN DataGenerator to understand the exact data structures that you need.
You can follow the application example on Mandel's problem, in which the DataGenerator is applied.
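As a rough illustration of the random-sampling advice, the sketch below draws collocation points with NumPy instead of building the full meshgrid. The time axis is a stand-in for the normalized experimental data, and `n_samples` is a hypothetical budget. Pairing each x with a randomly chosen measured tao is only one possible sampling strategy, not the DataGenerator's own method.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-in for the normalized experimental time axis (26362 values in [0, 1]).
tao_values = np.linspace(0.0, 1.0, 26362)

n_samples = 100_000  # hypothetical budget, far below the 5272400 mesh points

# Draw x uniformly in [0, 1] and pick a random measured tao for each point.
x_col = rng.uniform(0.0, 1.0, size=n_samples).reshape(-1, 1)
tao_idx = rng.integers(0, len(tao_values), size=n_samples)
tao_col = tao_values[tao_idx].reshape(-1, 1)

print(x_col.shape, tao_col.shape)  # (100000, 1) (100000, 1)
```

Each row of `x_col` and `tao_col` is one unsorted collocation point, so the number of inputs no longer grows with the product of the two axes.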
@linuswalter Thank you. x_input_n is a normalized value, so yes, I can try to sample it randomly as done in Mandel's problem.
However, norm_Jn is a function of a parameter (Current), which is a time series input. Hence the physical meaning of BC2 is that at x=1 and tao>0,
dC/dx = I(t)*constants
This is a Neumann boundary condition, and I couldn't find similar examples for reference. What would you suggest for this?
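For reference, the problem can be written out from the residuals in the code above. This is a reading of the posted loss terms, not a statement taken from the thread; the notation $\tau$ for tao and the domain bounds on the first line are assumptions.

```latex
\begin{aligned}
x\,\frac{\partial C_n}{\partial \tau}
  - \left( 2\,\frac{\partial C_n}{\partial x}
  + x\,\frac{\partial^2 C_n}{\partial x^2} \right) &= 0,
  && 0 < x < 1,\ \tau > 0 \\
C_n(x, 0) &= 1, && 0 \le x \le 1 \\
\frac{\partial C_n}{\partial x}(0, \tau) &= 0, && \tau > 0 \\
\frac{\partial C_n}{\partial x}(1, \tau) &= \mathrm{norm\_Jn}(\tau), && \tau > 0
\end{aligned}
```

The last line is the Neumann condition in question: its right-hand side is a measured time series rather than a constant.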
@ssingh-ipa ah, I think I understand your problem better now. If norm_Jn is a function of t, then you can assign a sn.Functional to it and train it on your dataset (the one with size 26362). After this first training step, you call norm_Jn.set_trainable(False) and train your main model model_n.
Something like this:
import numpy as np
import sciann as sn
from sciann.utils.math import diff
act = "tanh"
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)
norm_Jn = sn.Functional('norm_Jn', [tao], 4*[10], "tanh")
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)
PDE_neg = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn) # norm_Jn is now a Functional of tao, not an array
targets_norm_Jn = norm_Jn
mod_norm_Jn = sn.SciModel([tao],[targets_norm_Jn], optimizer="adam",loss_func="mse")
model_n = sn.SciModel([x, tao], [PDE_neg,IC,BC1,BC2], loss_func="mse", optimizer="adam")
tao_value_neg = Dsneg*Time/(Rneg**2) #tao_value_neg is also an array of size (26362,)
normalized_tao_neg = NormalizeData(tao_value_neg).to_numpy()
tao_input_neg = normalized_tao_neg.flatten()
x_input_n = np.linspace(0, 1, 200)
x_input_n_mesh, tao_input_n_mesh = np.meshgrid(
x_input_n,
tao_input_neg
)
x_in_n = np.reshape(x_input_n_mesh, (-1))
tao_in_n = np.reshape(tao_input_n_mesh, (-1))
norm_Jn.set_trainable(True)
mod_norm_Jn.compile()
H_norm_Jn = mod_norm_Jn.train([tao_in_n], target_values_norm_Jn)
norm_Jn.set_trainable(False)
mod_norm_Jn.compile()
H_C_n = model_n.train(...)
Well, the values for target_values_norm_Jn are probably non-zero, right? So you have to pass them in a certain data structure. Example:
target_values_norm_Jn = [(Array_of_indices,Array_of_target_values)]
The shapes of Array_of_indices and Array_of_target_values are identical. Array_of_indices has dtype=int64 and contains the indices of the input dataset tao_in_n at which each corresponding target value in Array_of_target_values is assigned.
The following example is from the data generator in 1D with 4 different targets. Please note that each column of the Input Dataset is in reality a separate array in the input_list and needs to have the shape (-1,1). I have just put it in a Pandas DataFrame for visualization purposes.
In your case, you have to assign values different from zero in Array_of_target_values.
Hope my description helps you a little bit.
I also really recommend that you study the DataGenerator and adapt the code to your needs.
Best, Linus
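The (indices, values) pairing described above can be illustrated with a small NumPy sketch. All numbers are made up for illustration, and the snippet only builds the arrays. How SciModel.train consumes them is taken from the description in this thread, not verified here.

```python
import numpy as np

# Stand-in input: 10 tao values as a column of shape (-1, 1).
tao_in = np.linspace(0.0, 1.0, 10).reshape(-1, 1)

# Suppose only rows 2, 5 and 7 carry a measured, non-zero target (hypothetical values).
array_of_indices = np.array([2, 5, 7], dtype=np.int64)
array_of_target_values = np.array([0.3, -0.1, 0.8])

# Both arrays must have the same shape: one target value per index.
assert array_of_indices.shape == array_of_target_values.shape

# One (indices, values) tuple per constraint of the model.
target_values = [(array_of_indices, array_of_target_values)]

# Each index points at the input row that the target value belongs to.
for i, v in zip(array_of_indices, array_of_target_values):
    print(f"tao={tao_in[i, 0]:.3f} -> target={v}")
```

The index array is what ties a target value to a specific input row, which resolves the ambiguity that a bare array of targets leaves open.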
Hi Linus @linuswalter,
This definitely helps. Thanks a lot.
I have 2 open points:
- You trained mod_norm_Jn with the input data tao_in_n in a separate network called H_norm_Jn. But then how is H_norm_Jn integrated with the final network, H_C_n? Does that happen automatically with respect to tao_in_n?
- Secondly, as you know, tao_in_n is reshaped from a meshgrid. This creates a length discrepancy, i.e., len(Array_of_target_values) is 86596 while len(tao_in_n) is 17319200. Should I use (x_input_n_mesh, tao_input_n_mesh) as the inputs for training the model? To phrase it otherwise, is the 1D reshape not useful anymore?
x_in_n = np.reshape(x_input_n_mesh, (-1))
tao_in_n = np.reshape(tao_input_n_mesh, (-1))
The meshgrid issue will be solved once I integrate the DataGenerator in my code. For reference, are there any examples where the XT data generator is used?
Best,
Soumya
Hi Soumya,
sorry for the late reply.
- Actually, norm_Jn is the neural network. We train it in the model setup mod_norm_Jn. Then we can use the trained NN norm_Jn in the next model setup, model_n, in which we want to train your main NN C_n.
- I am not sure if I understand this correctly. I think the general data structure that you create for (x_input_n_mesh, tao_input_n_mesh) via the 1D reshape is correct. But for the array with len(Array_of_target_values) = 86596, you need to create an extra input array of collocation points.
Please let me know if you were able to solve your problem.
Actually, an alternative to representing norm_Jn by an extra NN is to change the definition of your loss term BC2.
Your current definition subtracts norm_Jn inside the loss term and trains the result towards zero. It is written in the code via
BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn)
Instead, you could define the loss term without norm_Jn, which would be implemented via
BC2 = (x==1)*(tao>0)*(diff(C_n,x))
and assign the array norm_Jn as targets. That means you need to create input arrays/collocation points that respect x=1 and tao>0. It is important that you assign an input value in the arrays x and tao for each target value of norm_Jn. So instead of assigning zeros, you assign the respective values of norm_Jn.
Actually, the application Example of Terzaghi uses the DataGeneratorXT: https://github.com/sciann/sciann-applications/tree/master/SciANN-PoroElasticity
I hope this helps you a bit.
Please let me know if something wasn't clear.
Best,
Linus
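A minimal sketch of how such collocation points could be assembled by hand, assuming the (indices, values) target layout described earlier in the thread. The measured series, the block sizes, and all variable names are hypothetical stand-ins, and the snippet stops before any SciANN call.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-ins for the experimental series: strictly positive tao, made-up flux values.
tao_meas = np.linspace(0.0, 1.0, 501)[1:]
norm_Jn_meas = np.sin(2 * np.pi * tao_meas)

n_dom, n_ic, n_bc1 = 2000, 200, 200
n_bc2 = len(tao_meas)

# One block of (x, tao) points per loss term: PDE, IC, BC1, BC2.
x_dom, t_dom = rng.uniform(0, 1, n_dom), rng.uniform(0, 1, n_dom)
x_ic, t_ic = rng.uniform(0, 1, n_ic), np.zeros(n_ic)
x_bc1, t_bc1 = np.zeros(n_bc1), rng.uniform(0, 1, n_bc1)
x_bc2, t_bc2 = np.ones(n_bc2), tao_meas   # x=1, one point per measurement

# Stack the blocks into the two input columns of shape (-1, 1).
x_all = np.concatenate([x_dom, x_ic, x_bc1, x_bc2]).reshape(-1, 1)
t_all = np.concatenate([t_dom, t_ic, t_bc1, t_bc2]).reshape(-1, 1)

# Index range of each block inside the stacked inputs.
edges = np.cumsum([0, n_dom, n_ic, n_bc1, n_bc2])
ids = [np.arange(a, b, dtype=np.int64) for a, b in zip(edges[:-1], edges[1:])]

# Zero targets everywhere except BC2, which carries the measured flux.
targets = [
    (ids[0], np.zeros(n_dom)),
    (ids[1], np.zeros(n_ic)),
    (ids[2], np.zeros(n_bc1)),
    (ids[3], norm_Jn_meas),
]
```

With this layout every measured flux value has exactly one matching input row at x=1, so the inputs and targets cannot drift out of alignment.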
Hi Linus,
The first approach (a separate network for norm_Jn) has definitely improved my model prediction. However, it is still not accurate enough for predicting C_n. Some context: I am training the PINN to predict the change in concentration across the electrodes of a Li-ion battery. This is done through Fick's second law of diffusion and its respective Neumann boundary conditions. norm_Jn is the flux density.
Please find the updated code below:
# ----------------------- Neural Network Setup -----------------------
sn.reset_session()
sn.set_random_seed(1234)
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)
norm_Jn = sn.Functional('norm_Jn', [tao], 4*[10], 'sigmoid')
C_n = sn.Functional('C_n', [x,tao], 8*[20], act)
PDE_neg = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn)
targets_norm_Jn = norm_Jn
targets_PDE = [sn.PDE(PDE_neg), IC, BC1, BC2]
model_pred = sn.SciModel([x, tao], [norm_Jn, C_n])
mod_norm_Jn = sn.SciModel([tao],[targets_norm_Jn], optimizer="adam",loss_func="mse")
model_n = sn.SciModel([x, tao], targets_PDE, loss_func="mse", optimizer="adam")
tao_value_neg = Dsneg*Time/(Rneg**2)
normalized_tao_neg = NormalizeData(tao_value_neg).to_numpy()
tao_input_neg = normalized_tao_neg
Array_of_indices= np.arange(len(tao_input_neg)).astype(np.int64)
Array_of_target_values=normalized_J_n
target_values_norm_Jn = [(Array_of_indices,Array_of_target_values)]
norm_Jn.set_trainable(True)
mod_norm_Jn.compile()
# In[13]:
#----training parameters-------
NUM_SAMPLES = 100000
BATCH_SIZE = 1000 # higher batch size results in more accuracy
BATCH_SIZE_Jn = 300
EPOCHS_PDE = 500 # make sure (NUM_SAMPLES/BATCH_SIZE)*EPOCHS > 50K (total gradient updates)
EPOCHS_Jn = 500
STOP_AFTER = None
ADAPTIVE_WEIGHTS = {'method': 'GN', 'freq':300, 'use_score':True, 'alpha':1.0}
ADAPTIVE_WEIGHTS_Jn = {'method': 'NTK', 'freq':200}
initial_lr = 1e-3
final_lr = initial_lr/100
learning_rate_PDE = {
"scheduler": "ExponentialDecay",
"initial_learning_rate": initial_lr,
"final_learning_rate": final_lr,
"decay_epochs": EPOCHS_PDE
}
learning_rate_Jn = {
"scheduler": "ExponentialDecay",
"initial_learning_rate": initial_lr,
"final_learning_rate": final_lr,
"decay_epochs": EPOCHS_Jn
}
my_callback_J = tf.keras.callbacks.EarlyStopping(monitor='loss',patience=15,verbose=2)
my_callback = tf.keras.callbacks.EarlyStopping(monitor='loss',patience=30,verbose=2)
H_norm_Jn = mod_norm_Jn.train([tao_input_neg],
target_values_norm_Jn,
learning_rate=learning_rate_Jn,
epochs=EPOCHS_Jn,
callbacks=[my_callback_J],
batch_size=BATCH_SIZE_Jn,
stop_loss_value=1e-10,
#reduce_lr_after=15,
stop_lr_value=1e-8,
adaptive_weights=ADAPTIVE_WEIGHTS_Jn,
verbose=1
)
norm_Jn.set_trainable(False)
mod_norm_Jn.compile()
td_0 = 0.0
td_f = len(tao_input_neg)
xd_min = 0.0
xd_max = 1.0
dg_target = DataGeneratorXT(
X=[xd_min,xd_max], T=[td_0,td_f],
num_sample=NUM_SAMPLES,
targets=['domain', 'ic', 'bc-left', 'bc-right']
)
input_data_target, target_data_PDE = dg_target.get_data()
dg_target.plot_data()
H_C_n = model_n.train(input_data_target,
target_data_PDE,
learning_rate=learning_rate_PDE,
epochs=EPOCHS_PDE,
callbacks=[my_callback],
stop_loss_value=1e-9,
stop_lr_value=1e-8,
verbose=2,
batch_size=BATCH_SIZE,
adaptive_weights=ADAPTIVE_WEIGHTS
)
Now, when I run the NN norm_Jn separately, i.e., without C_n, the prediction is quite good as shown below:
However, when norm_Jn is evaluated with respect to the above-mentioned code, the prediction degrades, adversely affecting my PINN's accuracy. The new prediction is shown below:
What could be the reason for this?
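One thing that may be worth checking, as a guess from reading the posted code rather than a confirmed diagnosis: norm_Jn is trained on normalized_tao_neg, while DataGeneratorXT samples tao between td_0 = 0 and td_f = len(tao_input_neg), which is the number of samples rather than the largest tao value. If NormalizeData maps to [0, 1] (an assumption, since its definition is not shown), most collocation points would lie far outside the range norm_Jn was fitted on. The sketch below illustrates the mismatch with stand-in data.

```python
import numpy as np

# Stand-in for normalized_tao_neg, assuming min-max scaling to [0, 1].
tao_input_neg = np.linspace(0.0, 1.0, 26362)

td_f_as_posted = len(tao_input_neg)           # 26362, a sample count
td_f_from_data = float(tao_input_neg.max())   # 1.0, the largest tao value

# Sample tao the way a generator with T=[0, td_f_as_posted] would (uniform, for illustration).
rng = np.random.default_rng(seed=0)
tao_sampled = rng.uniform(0.0, td_f_as_posted, size=10_000)

# Fraction of sampled points that fall inside the range norm_Jn was trained on.
inside = np.mean(tao_sampled <= td_f_from_data)
print(f"fraction inside the training range: {inside:.4f}")
```

If this applies, setting td_f to the maximum of the normalized tao values would keep the collocation points inside the interval where norm_Jn was trained.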
Alternative:
I also tried the alternative you mentioned, i.e., assigning the respective values of norm_Jn as targets. However, my data generator is 2D with respect to x and t, and when BC2 is trained outside of these collocation points, my code does not execute anymore.
sn.reset_session()
sn.set_random_seed(1234)
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)
C_n = sn.Functional('C_n', [x,tao], 8*[20], act)
PDE_neg = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*diff(C_n,x)
targets_PDE = [sn.PDE(PDE_neg), IC, BC1]
model_n = sn.SciModel([x, tao], [sn.PDE(PDE_neg),IC,BC1,BC2], loss_func="mse", optimizer="adam")
#model_n = sn.SciModel([x, tao], [targets_PDE,BC2], loss_func="mse", optimizer="adam")
td_0 = 0.0
td_f = len(tao_input_neg)
xd_min = 0.0
xd_max = 1.0
dg_target = DataGeneratorXT(
X=[xd_min,xd_max], T=[td_0,td_f],
num_sample=NUM_SAMPLES,
targets=['domain', 'ic', 'bc-left', 'bc-right']
)
input_data_target, target_data_PDE = dg_target.get_data()
dg_target.plot_data()
H_C_n = model_n.train(input_data_target,
[target_data_PDE,Array_of_target_values],
learning_rate=learning_rate_PDE,
epochs=EPOCHS_PDE,
callbacks=[my_callback],
stop_loss_value=1e-9,
stop_lr_value=1e-8,
verbose=2,
batch_size=BATCH_SIZE,
adaptive_weights=ADAPTIVE_WEIGHTS
)
I get the following error:
--> 351 assert len(y_true)==len(self._constraints),
    352 'Miss-match between expected targets (constraints) defined in `SciModel` and '
    353 'the provided `y_true`s - expecting the same number of data points. '
    355 num_sample = x_true[0].shape[0]
    356 assert all([x.shape[0]==num_sample for x in x_true[1:]]),
    357 'Inconsistent sample size among `Xs`. '
AssertionError: Miss-match between expected targets (constraints) defined in `SciModel` and the provided `y_true`s - expecting the same number of data points.
The alternative seems simpler but I am not able to get this working. What do you think?
Best,
Soumya
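The assertion compares the number of target entries with the number of constraints. model_n has four constraints, but [target_data_PDE, Array_of_target_values] is a list with only two entries. The sketch below reproduces that count with stand-in data and shows one possible fix: keep four entries and swap only the values of the bc-right entry. The layout of target_data_PDE (one (indices, values) tuple per constraint, with 'zeros' as placeholder) is assumed from the description earlier in the thread. Interpolating the flux with np.interp is an illustrative choice, not something the thread prescribes.

```python
import numpy as np

# Stand-ins: 1000 sampled tao inputs, split into four index blocks
# (domain, ic, bc-left, bc-right), each with a placeholder zero target.
input_tao = np.random.default_rng(seed=0).uniform(0.0, 1.0, size=1000)
ids = np.array_split(np.arange(1000, dtype=np.int64), 4)
target_data_PDE = [(i, "zeros") for i in ids]

n_constraints = 4  # [sn.PDE(PDE_neg), IC, BC1, BC2]

# Measured flux on its own time axis (hypothetical values).
tao_meas = np.linspace(0.0, 1.0, 200)
flux_meas = np.cos(tao_meas)

# Wrapping the list once more yields 2 entries instead of 4.
nested = [target_data_PDE, flux_meas]
print(len(nested) == n_constraints)    # False

# Keep four entries; replace the bc-right values with the flux
# interpolated at the tao values of the bc-right points.
ids_bc2 = target_data_PDE[3][0]
flux_at_bc2 = np.interp(input_tao[ids_bc2], tao_meas, flux_meas)
targets = target_data_PDE[:3] + [(ids_bc2, flux_at_bc2)]
print(len(targets) == n_constraints)   # True
```

Interpolation is needed here because the generator samples its own tao values, which generally do not coincide with the measurement times.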