txie-93/cgcnn

Description for Materials Project .csv files


Hi Tian,

I'm trying to figure out what each of the mp-ids-*.csv files corresponds to.

The paper mentions these dataset sizes:

After removing ill-converged crystals, the full database has 46744 materials covering 87 elements, 7 lattice systems, and 216 space groups.
The database [34] we use includes the energy above hull of 18928 perovskite crystals.
Figures 2(b) and 2(c) show the performance of the two models on 9350 test crystals.

Table 1 has a "# train data" column with values of 28046, 16458, and 2041.

So "46744" seems pretty straightforward: it corresponds to mp-ids-46744.csv.

But what do mp-ids-3402.csv and mp-ids-27430.csv correspond to in the paper?

Thanks!

Sterling

Hi Sterling,

They are for Table 1. Different properties have different numbers of data points. The numbers should match the ones in Table 1.

Gotcha. From what I can tell the numbers don't match exactly. But for now I'm just going to be using the 46744 set. Thanks for all the help!

Oops, I forgot to mention that the numbers listed in Table 1 are the training data sizes, while the numbers in the file names are the total data sizes. I used 60% as training data, so they should match after multiplying by 60%.
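
For anyone checking the arithmetic, here is a minimal sketch of the 60% rule applied to the three file sizes. The variable names are made up for illustration, and rounding down to a whole number is an assumption (the thread does not say how fractional counts were handled); for these three values, rounding to nearest gives the same result.

```python
# Total data sizes, taken from the mp-ids-*.csv file names in this thread.
total_sizes = {
    "mp-ids-46744.csv": 46744,
    "mp-ids-27430.csv": 27430,
    "mp-ids-3402.csv": 3402,
}

# "# train data" values quoted from Table 1 in this thread.
table1_train_sizes = {28046, 16458, 2041}

# 60% training split, done in integer arithmetic to avoid float rounding.
# Assumption: fractional counts are rounded down.
TRAIN_PERCENT = 60
train_sizes = {
    name: total * TRAIN_PERCENT // 100 for name, total in total_sizes.items()
}

for name, n_train in train_sizes.items():
    matches = n_train in table1_train_sizes
    print(f"{name}: {n_train} training points, in Table 1: {matches}")
```

With this split, 46744 gives 28046, 27430 gives 16458, and 3402 gives 2041, which are the three Table 1 values.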

Ah, perfect! Thank you!