new get_primitive_cell() is not working anymore in some cases
ahmedabdelkawy opened this issue ยท 7 comments
Before migration to structuretoolkit get primitive cell had this form:
def get_primitive_cell_old(
structure, standardize=False, use_elements=None, use_magmoms=None
):
ret = spglib.standardize_cell(
get_spglib_cell(structure, use_elements=use_elements, use_magmoms=use_magmoms),
to_primitive=not standardize,
)
if ret is None:
raise SymmetryError(spglib.spglib.spglib_error.message)
cell, positions, indices = ret
positions = (cell.T @ positions.T).T
new_structure = structure.copy()
new_structure.cell = cell
new_structure.indices[: len(indices)] = indices
new_structure = new_structure[: len(indices)]
new_structure.positions = positions
return new_structure`
Now it has this form:
def get_primitive_cell(structure, standardize=False, use_elements=None, use_magmoms=None):
ret = spglib.standardize_cell(
get_spglib_cell(structure, use_elements=use_elements, use_magmoms=use_magmoms),
to_primitive=not standardize,)
if ret is None:
raise SymmetryError(spglib.spglib.spglib_error.message)
cell, positions, indices = ret
positions = (cell.T @ positions.T).T
new_structure = structure.copy()
new_structure.cell = cell
new_structure = new_structure[: len(indices)] #-> here is the problem
new_structure = set_indices(structure=new_structure, indices=indices)
new_structure.positions = positions
return new_structure
Now the difference and the problem in the new_structure = set_indices(structure=new_structure, indices=indices) (new)
vs new_structure.indices[: len(indices)] = indices (old).
I will try to explain what is the problem:
Now if the structure objects is composed of one atom type, the function will work, lets say:
Fe
Fe
Fe
Fe
if you have two types but they are arranged in alternating fashion the function will also work:, lets say:
Fe
O
Fe
O
The problem will start arising for structures (imported from cif files) when you have two types let's say Fe and O that are not equal in number and arranged in a block fashion and not alternating. For example, in my case, In the conventional cell, i have 12 Fe folllowed by 18 O atoms in the following fashion:
Fe
Fe...12x
O
O... 18x
and I know that the primitive cell will be made of 4 Fe atoms followed by 6 Oxygen atoms.
I explain the problem here on that example from the code:
ret = spglib.standardize_cell(
get_spglib_cell(structure, use_elements=use_elements, use_magmoms=use_magmoms),
to_primitive=not standardize,)
cell, positions, indices = ret #indices will be 0 0 0 0 1 1 1 1 1 1 -> len = 10
positions = (cell.T @ positions.T).T
new_structure = structure.copy() # new structure at this moment will be Fe12O18
new_structure.cell = cell # copying new cell vectors
new_structure = new_structure[: len(indices)] #-> here is the problem # new structures is now Fe10
new_structure = set_indices(structure=new_structure, indices=indices) # you pass Fe10 and indices = 0 0 0 0 1 1 1 1 1 1 set indices will break down
new_structure.positions = positions
return new_structure
The problem in this case is that you are passing a structure of one species type (Fe10) and two indices with two types to set indices, which confuses set_indices() function.
def set_indices(structure, indices):
indices_dict = {
v: k for k, v in get_species_indices_dict(structure=structure).items()
}
structure.symbols = [indices_dict[i] for i in indices]
return structure
The return of set_indices will be {0: 'Fe'} in this case
Here is a code to reproduce the problem and attached in the cif file (just change the format from .txt to .cif (github wont upload cif files))
from pyiron import Project
from pyiron.atomistics.structure.atoms import ase_to_pyiron
import ase
pr = Project('test')
filename_fe2o3 = 'fe2o3_conv_strucs.9000139.cif'
hex_structure_fe2o3 = ase_to_pyiron(ase.io.read(filename_fe2o3,format='cif'));
hex_structure_fe2o3_sym = hex_structure_fe2o3.get_symmetry()
prim_structure_fe2o3 = hex_structure_fe2o3_sym.get_primitive_cell()
- Proposed solutions:
- Get back to the previous way of setting indices (the quick fix)
- The function will have to be rewritten as it is very case sensitive
I can try to make a pull request by reverting back to the old code, otherwise, I am not sure how the code should look like.
- Get back to the previous way of setting indices (the quick fix)
Unfortunately, this is not possible as the ASE atoms class does not have indices
defined as a property. So I guess we have to rewrite the functions.
We dont define the indices through the ASE atoms class. We define it subsequently. The problem is how we define it: doing it explicitly: new_structure.indices[: len(indices)] = indices (indices we take from ase) (working) vs. doing it with structuretoolkit.common.helper.set_indices(). I tested by changing the code locally as follows:
+ new_structure.indices[: len(indices)] = indices
new_structure = new_structure[: len(indices)]
- new_structure = structuretoolkit.common.helper.set_indices(
- structure=new_structure, indices=indices
- )
+ #new_structure = structuretoolkit.common.helper.set_indices(
+ # structure=new_structure, indices=indices
+ #)
and it is working.
Yes, the pyiron Atoms class has indices as defined in https://github.com/pyiron/pyiron_atomistics/blob/main/pyiron_atomistics/atomistics/structure/atoms.py#L208 but the structuretoolkit is purely based on the ASE Atoms class. So for the functionality to be usable with the ASE Atoms class we have to rewrite the function.
I was able to fix the structuretoolkit command with pyiron/structuretoolkit#57 now you can use:
import ase
import structuretoolkit as stk
filename_fe2o3 = 'fe2o3_conv_strucs.9000139.cif'
stk.analyse.get_symmetry(structure=ase.io.read(filename_fe2o3,format='cif')).get_primitive_cell()
This returns the right structure:
Atoms(symbols='Fe4O6', pbc=True, cell=[[2.519, 1.4543453280886673, 4.590666666666666], [-2.519, 1.4543453280886673, 4.590666666666666], [0.0, -2.9086906561773347, 4.590666666666666]], spacegroup_kinds=...)
Unfortunately, there is still something wrong with the compatibility of the pyiron Atoms class and the ASE Atoms class. My assumption is that this is also related to #981 as discussed in https://github.com/orgs/pyiron/discussions/191
In combination with pyiron/structuretoolkit#58 the example can be written even more elegantly:
import structuretoolkit as stk
import ase
filename_fe2o3 = 'fe2o3_conv_strucs.9000139.cif'
stk.analyse.get_primitive_cell(structure=ase.io.read(filename_fe2o3,format='cif'))
Thank you very much, Jan, for your quick help.
I tested pulling changes from pyiron/structuretoolkit#57 and it is working if I do this:
import ase
import ase_to_pyiron
import structuretoolkit as stk
filename_fe2o3 = 'fe2o3_conv_strucs.9000139.cif'
struc_conv = ase.io.read(filename_fe2o3,format='cif')
struc_prim = ase_to_pyiron(stk.analyse.get_symmetry(struc_conv).get_primitive_cell())
struc.get_chemical_formula()
-> Fe4O6
However, weirdly, when I do this:
filename_fe2o3 = 'fe2o3_conv_strucs.9000139.cif'
hex_structure_fe2o3 = ase_to_pyiron(ase.io.read(filename_fe2o3,format='cif'));
hex_structure_fe2o3_sym = hex_structure_fe2o3.get_symmetry()
prim_structure_fe2o3 = hex_structure_fe2o3_sym.get_primitive_cell()
prim_structure_fe2o3.get_chemical_formula()
-> Fe10
Positions and lattice vectors are correct. It is only the symbols that are wrong. Anyway, I can work with it as it is for the moment. Thanks, again!
I haven't read the full discussion yet, but I have used this snippet in the past
import spglib
from ase import Atoms
cell, scaled_positions, numbers = spglib.standardize_cell(structure)
structure = Atoms(scaled_positions=scaled_positions, cell=cell, numbers=numbers, pbc=structure.pbc)
structure = pr.create.structure.from_ase(structure)
without issue, though I suppose you'll have to add any custom tag arrays defined on the original structure manually. That should be straightforward by looping through structure.arrays
, though.