atomistic-machine-learning/schnetpack

Response properties: error in tensors' size

Closed this issue · 15 comments

Dear developer,
Many thanks for developing this new version. I'm currently trying to replicate the results of the FIeldSchNet paper, but when I run spktrain with the response experiment, I obtain the following error:

v = torch.sum(mu[field] * external_fields[field], dim=1, keepdim=True)
RuntimeError: The size of tensor a (3) must match the size of tensor b (90) at non-singleton dimension 1.

I'm using the dataset ethanol_continuum_evw from the FIeldSchNet paper and I added the dictionary of units as requested. Moreover, it worked with the previous version.
Could you please help me understand what is going on?

Hey @MathiasHilfiker ,
sorry for the late reply! I am not sure what the exact problem is, but it could just be related to the shapes in your database.
Could you print out the shapes of mu[field], external_fields[field] and field? Could you also add the shapes for all properties in the inputs batch?

Hello, thank you for your reply,
This is the output for external field and mu:

Schermata 2023-06-15 alle 10 05 12

An this is the shape of the properties for one example molecule:

Schermata 2023-06-15 alle 10 13 45

The fact that mu and external_field[field] have incompatible sizes seems to come from schnetpack/representation/field_schnet.py where in line 380 the input is unsqueezed while in line 397 mu is defined as:
mu = { field: torch.zeros((qs[0], 3, qs[2]), device=q.device) for field in self.external_fields }

Thanks for the shapes!

The shape of mu seems right to me, but I would expect external_fields to have a different shape. Something like [90, 3, 1] instead of [90, 1].
I assume you made some changes to the config file, since there is no dataset provided for this script. Also the external_fields=[] default value is empty in the config. Did you add anything here? If so: what kind of external field did you add.

I have no experience with field-schnet, so I also need some time to figure out what is wrong.

Is it true that there is no dataset provided for this script, so I took the one from the original FIeldSchNet paper: http://quantum-machine.org/datasets/ ("Molecular response properties in implicit and explicit solvent environments" section, ethanol_continuum_evw.tgz file). The only modification I did was to add the dictionary of units of measurement (I put '1' for every unit). Then I'm running the response experiment as it is, with the command:
spktrain experiment=response data.datapath=path_to_datataset data.num_train=900 data.num_val=100
From what I understood, FieldSchNet creates the external field with just the knowledge of the dielectric constant, so I didn't add anything to external_fields

jnsLs commented

Hi @MathiasHilfiker ,
we looked into this again, and it turned out the dimensions of the old dataset do not fit the design of schnetpack2.0.
Therefore, we adapted the spkconvert.py script (can be found in the lastest commit of the master branch). Now you can specify an additional argument --expand_property_dims.

If you download the old dataset again and run

spkconvert path/to/dataset --distunit Bohr --propunit energy:1,forces:1,dipole_moment:1,polarizability:1,shielding:1,dielectric_constant:1,electric_field:1,magnetic_field:1 --expand_property_dims dipole_moment polarizability electric_field magnetic_field

it should work.

Please let us know if this solves your problem. Best regards,
Jonas

@jnsLs I've tried to train a model has suggest here

spkconvert ethanol_continuum_evw.db --distunit Bohr --propunit energy:1,forces:1,dipole_moment:1,polarizability:1,shielding:1,dielectric_constant:1,electric_field:1,magnetic_field:1 --expand_property_dims dipole_moment polarizability electric_field magnetic_field
spktrain experiment=response data.datapath=ethanol_continuum_evw.db data.num_train=900 data.num_val=100

When I try to run an MD to predict the IR spectrum with the following code, adapted from tutorials

model_path = 'best_model'
mol_path = 'md_ethanol.xyz'

molecule = read(mol_path)
n_replicas = 1

md_system = System()
md_system.load_molecules(
    molecule,
    n_replicas,
    position_unit_input="Bohr"
)

temperature = 300

md_initializer = UniformInit(
    temperature,
    remove_center_of_mass=True,
    remove_translation=True,
    remove_rotation=True,
)

md_initializer.initialize_system(md_system)

time_step = 0.5
md_integrator = VelocityVerlet(time_step)

cutoff = 5.0
cutoff_shell = 2.0

md_neighborlist = NeighborListMD(
    cutoff,
    cutoff_shell,
    ASENeighborList,
)

md_calculator = SchNetPackCalculator(
    model_path,
    "forces",
    "kcal/mol",
    "Bohr", 
    md_neighborlist,
    energy_key="energy",  
    required_properties=[], 
)

md_simulator = Simulator(
    md_system,
    md_integrator,
    md_calculator
)

if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

md_precision = torch.float32

bath_temperature = 300
time_constant = 100

langevin = LangevinThermostat(bath_temperature, time_constant)

simulation_hooks = [
    langevin
]

log_file = os.path.join("long_md.hdf5")

buffer_size = 100

data_streams = [
    callback_hooks.MoleculeStream(store_velocities=True),
    callback_hooks.PropertyStream(target_properties=[properties.energy]),
]

file_logger = callback_hooks.FileLogger(
    log_file,
    buffer_size,
    data_streams=data_streams,
    every_n_steps=1,
    precision=32,
)

simulation_hooks.append(file_logger)

chk_file = os.path.join('simulation.chk')
checkpoint = callback_hooks.Checkpoint(chk_file, every_n_steps=100)
simulation_hooks.append(checkpoint)

md_simulator = Simulator(md_system, md_integrator, md_calculator, simulator_hooks=simulation_hooks)

md_simulator = md_simulator.to(md_precision)
md_simulator = md_simulator.to(device)

n_steps = 20000

md_simulator.simulate(n_steps)

data = HDF5Loader(log_file)
md_atoms = data.convert_to_atoms()

write('md_ethanol.traj', md_atoms, format='traj')

spectrum = PowerSpectrum(data, resolution=2048)
spectrum.compute_spectrum(molecule_idx=0)

frequencies, intensities = spectrum.get_spectrum()

plt.figure()
plt.plot(frequencies, intensities)
plt.xlim(0,4000)
plt.ylim(0,100)
plt.ylabel('I [a.u.]')
plt.xlabel('$\omega$ [cm$^{-1}$]')
plt.show()
plt.savefig('ir.png')

I essentially get a decaying line. I believe this is related to the MD, as it is exploding the structure. I'm attaching the trajectory. Am I missing something?

md_ethanol.zip

Dear Manuel,
it looks like there is something wrong with your units.
The md_ethanol.xyz file should have distance units Angström. You specified Bohr:

md_system.load_molecules(
    molecule,
    n_replicas,
    position_unit_input="Bohr"
)

Furthermore you should check if the units of your model (specified in the hydra config files (response.yaml and "parent" config files) fit the units you specify in the calculator

md_calculator = SchNetPackCalculator(
    model_path,
    "forces",
    "kcal/mol",
    "Bohr", 
    md_neighborlist,
    energy_key="energy",  
    required_properties=[], 
)

Best,
Jonas

dear @jnsLs thanks for the answer, I indeed noticed that I've defined the units of the molecule wrong and changed it, but even in Angstrom the run explodes.
Could you clarify how I can check the units of the model? I've used Bohr because you suggested it in a comment above, I'm using the database ethanol_continuum_evw.db from the website http://quantum-machine.org/datasets/

The directory where your best_model is stored, there is a file called config.yaml.
The latter should contain info about your dataset module. Here an example how it could look like:

data:
  _target_: schnetpack.datasets.QM9
  datapath: ./qm9.db
  data_workdir: null
  batch_size: 100
  num_train: 1001
  num_val: 100
  num_test: null
  num_workers: 8
  num_val_workers: null
  num_test_workers: null
  train_sampler_cls: schnetpack.data.sampler.StratifiedSampler
  remove_uncharacterized: true
  distance_unit: Ang
  property_units:
    energy_U0: eV
    energy_U: eV
    enthalpy_H: eV
    free_energy: eV

Your model is trained on those units and you should specify the same units in your calculator.

@jnsLs mine looks something like this

data:
  _target_: schnetpack.data.AtomsDataModule
  datapath: ethanol_continuum_evw.db
  data_workdir: null
  batch_size: 10
  num_train: 900
  num_val: 100
  num_test: null
  num_workers: 8
  num_val_workers: null
  num_test_workers: null
  distance_unit: 1.0
  transforms:
  - _target_: schnetpack.transform.SubtractCenterOfMass
  - _target_: schnetpack.transform.RemoveOffsets
    property: ${globals.energy_key}
    remove_mean: true
  - _target_: schnetpack.transform.MatScipyNeighborList
    cutoff: ${globals.cutoff}
  - _target_: schnetpack.transform.CastTo32
  - _target_: schnetpack.transform.SplitShielding
    shielding_key: ${globals.shielding_key}
    atomic_numbers: ${globals.shielding_elements}
logger:
  tensorboard:
    _target_: pytorch_lightning.loggers.tensorboard.TensorBoardLogger
    save_dir: tensorboard/
    name: default
print_config: true

This means that there is something wrong with the

spkconvert ethanol_continuum_evw.db --distunit Bohr --propunit energy:1,forces:1,dipole_moment:1,polarizability:1,shielding:1,dielectric_constant:1,electric_field:1,magnetic_field:1 --expand_property_dims dipole_moment polarizability electric_field magnetic_field
spktrain experiment=response data.datapath=ethanol_continuum_evw.db data.num_train=900 data.num_val=100

Imho there is no indication that there is a problem with spkconvert or spktrain.

Your config file shows that you are not converting the units for training. So your model is trained on the units of the dataset, which are atomic units (see documentation http://quantum-machine.org/datasets/).

Hence in your calculator you also need to specify atomic units.

@jnsLs I tried to change it to atomic units, and I keep having not infrared and the trajectory explodes. Am I missing something or doing something wrong?

coff = 5.0
coff_shell = 2.0

md_neighbor = NeighborListMD(
    coff,
    coff_shell,
    ASENeighborList,
)

md_calculator = SchNetPackCalculator(
    model_path,
    force_key     = 'forces',
    energy_unit   = "Hartree",
    position_unit = "Bohr",
    required_properties=['energy', 'dipole_moment', 'polarizability'],
    neighbor_list=md_neighbor
)

Hi @manuelbenavent,
I don't think there is an issue with the code here, but rather an issue with inconsistent units somewhere.
This is not something that we can fix for you, but I suggest you try the following:

  1. Try to reproduce the MD tutorial and see if that works fine.
  2. Delete the database and model. Then start again from scratch.
  3. Try to keep it simple and only use Ang and eV. (database, model and calculator).
  4. Before you train a model, do some sanity checks. E.g. read some sample structures from the database and try to plot them with ase.visualize.view and also compare the interatomic distances to see if they are in the right range for the selected distance unit.
  5. If all of this works fine: try to use other units and see if that works.

I hope this still helps.
Best,
Stefaan

Dear @Stefaanhess
I've tried to perform MD on the md_ethanol.model in the test folder and I can get the IR spectrum. However, I do not have the database used for training this model. I've tried to use the model available at http://quantum-machine.org/datasets/ and converted to the new version of schnetpack using the code provided above

spkconvert ethanol_continuum_evw.db --distunit Bohr --propunit energy:1,forces:1,dipole_moment:1,polarizability:1,shielding:1,dielectric_constant:1,electric_field:1,magnetic_field:1 --expand_property_dims dipole_moment polarizability electric_field magnetic_field

I would like to convert everything to eV and Ang but I don't see that option on the spkconvert, if I understand it correctly this is just to convert from the old schnetpack to the newer version.

I tried to construct a script to train inspired on the tutorials however when I do that, I get the error that AttributeError: module 'ase.units' has no attribute '1'

Here is the code I'm using for training

import torch
import torchmetrics
import schnetpack as spk
import schnetpack.transform as trn
import pytorch_lightning as pl
from pytorch_lightning.callbacks import LearningRateMonitor,EarlyStopping
import os
import time


forcetut = './forcetut'
if not os.path.exists(forcetut):
    os.makedirs(forcetut)

if torch.cuda.is_available():
  acceler = "gpu"
  device  = "cuda"
  num_dev = torch.cuda.device_count()
else:
  acceler = "cpu"
  device  = "cpu"
  num_dev = 2

data = spk.data.AtomsDataModule(
    datapath='ethanol_continuum_evw.db',
    batch_size=20,
    distance_unit='Bohr',
    property_units={'energy':'Ha', 'forces':'Ha/Bohr'},
    num_train=20000,
    num_val=5000,
    num_test = None,
    split_file='split.npz',
    load_properties=['energy', 'forces', 'dipole_moment', 'polarizability'],
    val_batch_size=20,
    test_batch_size=1,
    transforms=[
        trn.ASENeighborList(cutoff=9.448),
        trn.RemoveOffsets("energy", remove_mean=True, remove_atomrefs=False),
        trn.CastTo32()
    ],
    num_workers=1,
)
data.prepare_data()
data.setup() 

cutoff = 9.448
n_atom_basis = 128

pairwise_distance = spk.atomistic.PairwiseDistances()
external_fields = spk.atomistic.StaticExternalFields()
radial_basis = spk.nn.GaussianRBF(n_rbf=20, cutoff=cutoff)

fieldschnet = spk.representation.FieldSchNet(
   n_atom_basis=n_atom_basis,
   n_interactions=3,
   radial_basis=radial_basis,
   response_properties=['forces', 'dipole_moment', 'polarizability'],
   cutoff_fn=spk.nn.CosineCutoff(cutoff)
)

pred_nnenergy = spk.atomistic.Atomwise(n_in=n_atom_basis, 
                                       output_key='nn_energy')
pred_repulsion = spk.atomistic.ZBLRepulsionEnergy(energy_unit   = 'Ha',
                                                   position_unit = 'Bohr',
                                                   output_key    = 'repulsion',
                                                   trainable     = True,
                                                   cutoff_fn     = spk.nn.MollifierCutoff(cutoff))
pred_forces = spk.atomistic.Forces(energy_key='energy', 
                                   force_key='forces', 
                                   stress_key='stress', 
                                   calc_stress=True)
pred_energy = spk.atomistic.Aggregation(keys = ['nn_energy','repulsion'],
                                        output_key = "energy")
pred_dipole = spk.atomistic.DipoleMoment(n_in=n_atom_basis,
                                         dipole_key='dipole_moment')
pred_polar = spk.atomistic.Polarizability(n_in=n_atom_basis,
                                          polar_key='polarizability')

nnpot = spk.model.NeuralNetworkPotential(
    representation=fieldschnet,
    input_modules=[pairwise_distance, external_fields],
    output_modules=[pred_energy, pred_forces, pred_dipole, pred_polar],
)


output_energy = spk.task.ModelOutput(
    name='energy',
    loss_fn=torch.nn.MSELoss(),
    loss_weight=0.99,
    metrics={
        "MAE": torchmetrics.MeanAbsoluteError()
    }
)

output_forces = spk.task.ModelOutput(
    name='forces',
    loss_fn=torch.nn.MSELoss(),
    loss_weight=0.01,
    metrics={
        "MAE": torchmetrics.MeanAbsoluteError()
    }
)

output_dipole = spk.task.ModelOutput(
    name='dipole_moment',
    loss_fn=torch.nn.MSELoss(),
    loss_weight=0.99,
    metrics={
        "MAE": torchmetrics.MeanAbsoluteError()
    }
)

output_polar = spk.task.ModelOutput(
    name='polarizability',
    loss_fn=torch.nn.MSELoss(),
    loss_weight=0.99,
    metrics={
        "MAE": torchmetrics.MeanAbsoluteError()
    }
)

task = spk.task.AtomisticTask(
   model=nnpot,
    outputs=[output_energy, output_forces, output_dipole, output_polar],
    optimizer_cls=torch.optim.AdamW,
    optimizer_args={"lr": 1e-4, "weight_decay": 1e-2},
    scheduler_args = {'mode': 'min',
                      'factor': 0.75,
                      'patience': 15,
                      'min_lr': 1E-6,
                      'smoothing_factor': 0.9},
    scheduler_monitor = "val_loss")


logger = pl.loggers.TensorBoardLogger(save_dir=forcetut)
callbacks = [
    spk.train.ModelCheckpoint(
        model_path=os.path.join(forcetut, "fieldschnet.model"),
        save_top_k=1,
        monitor="val_loss"),
    LearningRateMonitor(),
    EarlyStopping(monitor="val_loss",
                  mode="min",
                  patience=30)]

trainer = pl.Trainer(
    devices=num_dev,
    accelerator=acceler,
    callbacks=callbacks,
    logger=logger,
    default_root_dir=forcetut,
    max_epochs=1000,
)

This error probably occurs because you are not specifying property units for all the properties you are loading in spk.data.AtomsDataModule.

To avoid such problems it might be favorable to work with our predefined config files and the CLI.
You could for example train a field schnet model using the following command:

spktrain experiment=response data.datapath=path/to/ethanol_continuum_evw.db data.num_train=500 data.num_val=100