txie-93/cgcnn

atom_init.json

felentovic opened this issue · 2 comments

I was inspecting the atom_init.json file and found several discrepancies with the documentation of the atom features in Table S2 (Supplemental Material).

  • Since every feature is one-hot encoded, the length of the vector should be 93 according to the table, but it is actually 92.

  • Also, the Group number feature has 18 categories in the table, but 19 in the file. Actinide and lanthanide elements are represented by a 1 at index 0, except lutetium, which is labeled as group 3. Every other element has a 1 at the index equal to its group number; for hydrogen, for example, the entry at index 1 is 1.

  • The Period number feature has 9 categories in the table, but only 7 in the file, since lanthanide and actinide elements are assigned periods 6 and 7 respectively, not 8 and 9 as stated in Table S2.

  • Could you also explain why hydrogen's electronegativity is in the bin [1.9, 2.25) instead of [2.25, 2.6)? According to the Sanderson scale, its electronegativity is 2.59.
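For anyone who wants to reproduce these checks, here is a minimal sketch. It assumes the file is a JSON object mapping atomic numbers (as strings) to lists of 0/1 values, that the group-number block occupies indices 0 to 18 (as the second bullet implies), and that the electronegativity bins have width 0.35 starting at 0.5 (an assumption that is consistent with the two bin edges quoted above). The helper names and the toy data are hypothetical; the toy vectors are synthetic stand-ins, not values read from the real file.

```python
import json
from bisect import bisect_right
from collections import Counter

# Assumption taken from the issue text: group-number block is indices 0..18.
GROUP_BLOCK = slice(0, 19)


def load_atom_init(path):
    """Load a JSON file assumed to map atomic numbers (strings) to 0/1 lists."""
    with open(path) as f:
        return {int(k): v for k, v in json.load(f).items()}


def vector_lengths(atom_init):
    """Count how many elements have each vector length."""
    return Counter(len(v) for v in atom_init.values())


def hot_index(vector, block):
    """Return the position of the single 1 inside a one-hot block, else None."""
    hot = [i for i, x in enumerate(vector[block]) if x == 1]
    return hot[0] if len(hot) == 1 else None


def bin_of(value, start=0.5, width=0.35, n_bins=10):
    """Return the half-open bin [lo, hi) containing value (assumed bin edges)."""
    edges = [start + width * i for i in range(n_bins + 1)]
    i = bisect_right(edges, value) - 1
    if not 0 <= i < n_bins:
        raise ValueError(f"{value} is outside the binned range")
    return round(edges[i], 2), round(edges[i + 1], 2)


def toy_vector(group_index, length=92):
    """Synthetic vector with only the group block filled in (not real data)."""
    vec = [0] * length
    vec[group_index] = 1
    return vec


# Toy data mirroring the issue's description of H, La and Lu.
toy_init = {1: toy_vector(1), 57: toy_vector(0), 71: toy_vector(3)}

print(vector_lengths(toy_init))
for z, vec in toy_init.items():
    print(z, hot_index(vec, GROUP_BLOCK))
print(bin_of(2.59), bin_of(2.20))
```

With the real file, `load_atom_init("atom_init.json")` would replace `toy_init`. Under the assumed edges, 2.59 falls in [2.25, 2.6) while 2.20, hydrogen's Pauling electronegativity, falls in [1.9, 2.25). Whether the file was built from the Pauling scale is not confirmed in this thread.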

I used the mendeleev package to generate the atom initialization vectors. You can find the definition of each atom property on the package's website. The definitions in Table S2 come from an older version.

Actually, the atom initialization vectors are not so important when the data size is large (Fig. S3), but they can be helpful when you only have ~10^3 data points. Feel free to experiment with your own atom initialization vectors.
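As a rough illustration of experimenting with custom vectors, the sketch below builds a small initialization table from two properties: period number (7 categories) and electronegativity (10 equal-width bins over an assumed range of 0.5 to 4.0). The function names, the choice of properties, and the output schema (atomic number as a string key, list of 0/1 values) are assumptions made for illustration, not the procedure used to produce atom_init.json. The three electronegativities are standard Pauling values.

```python
import json


def one_hot_category(value, categories):
    """One-hot encode value against an ordered collection of categories."""
    categories = list(categories)
    vec = [0] * len(categories)
    vec[categories.index(value)] = 1  # raises ValueError for unknown values
    return vec


def one_hot_bin(value, lo, hi, n_bins):
    """One-hot encode value into n_bins equal-width bins spanning [lo, hi]."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    width = (hi - lo) / n_bins
    idx = min(int((value - lo) / width), n_bins - 1)  # hi goes in the last bin
    vec = [0] * n_bins
    vec[idx] = 1
    return vec


# Illustrative inputs: atomic number -> (period, Pauling electronegativity).
elements = {1: (1, 2.20), 6: (2, 2.55), 8: (2, 3.44)}


def build_atom_init(elements):
    """Concatenate the per-property one-hot blocks for every element."""
    table = {}
    for z, (period, chi) in elements.items():
        table[str(z)] = (
            one_hot_category(period, range(1, 8))
            + one_hot_bin(chi, 0.5, 4.0, 10)
        )
    return table


atom_init = build_atom_init(elements)
# json.dump(atom_init, open_file) would write the same content to disk.
print(json.dumps(atom_init))
```

Each vector here has 17 entries (7 + 10). Adding further properties means appending more blocks in `build_atom_init`, as long as every element ends up with a vector of the same length.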

Hi, which version of the mendeleev package did you use to create the features?