learningmatter-mit/NeuralForceField

Energy data set generation

mukui123 opened this issue · 4 comments

Hello, I made my own dataset to check how well the network trains. The energy training result is very bad: it is three orders of magnitude lower than the original ethanol energy. The energy I used is the total energy that LAMMPS outputs at each step of a water-molecule simulation. May I ask what exactly the energy in the ethanol dataset provided by Wu Jie is?
image
image

Hey @mukui123 - sorry to hear that the training isn't going well. My guess is that it has something to do with units. What are the units of energies, forces, and coordinates? The force units should be the energy units divided by the coordinate units, in order for training to work. One way you can test this hypothesis is by setting the energy loss coefficient to 0 and seeing if force training improves. (Because if you're only predicting one quantity then the units don't matter.)
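To make the suggested test concrete, here is a minimal sketch of a weighted energy/force loss in plain NumPy. It is not the repository's actual loss code; the function name `weighted_loss`, the parameters `energy_coef` and `force_coef`, and the toy numbers are all hypothetical and only illustrate the idea.

```python
import numpy as np

def weighted_loss(e_pred, e_true, f_pred, f_true, energy_coef=1.0, force_coef=1.0):
    """Weighted sum of the energy and force mean-squared errors (hypothetical helper)."""
    e_mse = np.mean((np.asarray(e_pred) - np.asarray(e_true)) ** 2)
    f_mse = np.mean((np.asarray(f_pred) - np.asarray(f_true)) ** 2)
    return energy_coef * e_mse + force_coef * f_mse

# Toy data: 2 frames, 3 atoms, 3 Cartesian components per atom.
e_true = np.array([-10.0, -10.5])
e_pred = np.array([-9.0, -11.0])
f_true = np.zeros((2, 3, 3))
f_pred = np.full((2, 3, 3), 0.1)

# Both terms contribute, so mismatched energy and force units would skew the balance.
both = weighted_loss(e_pred, e_true, f_pred, f_true)

# With the energy coefficient at 0, only the force error is left.
force_only = weighted_loss(e_pred, e_true, f_pred, f_true, energy_coef=0.0)
```

If force-only training behaves well but the combined loss does not, inconsistent units are a likely cause. For reference, LAMMPS `units real` reports energies in kcal/mol, distances in Å, and forces in kcal/(mol·Å), while `units metal` uses eV, Å, and eV/Å.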

As for Wujie's dataset, I believe that it's data from the MD17 benchmark.

Let me know if this helps!

Also, you said the energy is three orders of magnitude lower than the true result. On the graph they seem to be similar orders of magnitude? But if they are very different, I'd suggest subtracting the mean energy from the data before training. That can have a big impact on performance.
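As a rough illustration of the mean subtraction, here is a small NumPy sketch. The helper name `center_energies` and the example values are made up for this comment; adapt it to however your dataset stores energies.

```python
import numpy as np

def center_energies(energies):
    """Subtract the dataset mean from the energies; return the shifted values and the offset."""
    energies = np.asarray(energies, dtype=float)
    offset = energies.mean()
    return energies - offset, offset

# Made-up total energies with a large constant offset.
raw = np.array([-1000.2, -1000.0, -999.8])
centered, offset = center_energies(raw)

# Train on `centered`; add `offset` back to predictions to recover absolute energies.
restored = centered + offset
```

The forces need no change, because a constant energy shift does not affect the gradient with respect to the coordinates.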

Thanks for your advice. I'll try again!

No problem!