rcsb/py-rcsb_db

Failed int cast for None in DataTransformFactory

Closed this issue · 1 comment

mcs07 commented

Using the MySQL loader, I am seeing an error that seems to be mostly harmless:

ERROR:rcsb.db.processors.DataTransformFactory:Failing for '5TA4' table PDBX_STRUCT_ASSEMBLY atName oligomeric_count with int() argument must be a string, a bytes-like object or a number, not 'NoneType'

I think it occurs due to the separate 'deposited' assembly that the DictMethodRunnerHelper creates:

tObj = dataContainer.getObj('pdbx_struct_assembly')
rowIdx = tObj.getRowCount()
tObj.setValue('deposited', 'id', rowIdx)
tObj.setValue('deposited_coordinates', 'details', rowIdx)
logger.debug("Full row is %r" % tObj.getRow(rowIdx))

It seems the problem is that oligomeric_count (and oligomeric_details) are not set on the new row, so the DataTransformFactory attempts to cast None to an integer. Would it be more correct to set oligomeric_count and oligomeric_details on the 'deposited' assembly to '?' (or '.' or '')? They would then be handled correctly as null values by DataTransformFactory.
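To illustrate the failure mode, here is a minimal sketch in plain Python. It does not use rcsb.db; the helper name cast_int_or_none and the dict standing in for the new row are hypothetical, and treating '?', '.' and '' as nulls is an assumption about how a transform could behave, not a description of what DataTransformFactory actually does.

```python
# Hypothetical stand-in for the generated 'deposited' row: the two
# attributes set by the helper are present, oligomeric_count is unset.
row = {
    "id": "deposited",
    "details": "deposited_coordinates",
    "oligomeric_count": None,
}

# A naive cast raises TypeError on None, which matches the logged error.
try:
    int(row["oligomeric_count"])
    naive_error = None
except TypeError as exc:
    naive_error = type(exc).__name__


def cast_int_or_none(value):
    """Return int(value), or None when value is None or a null marker.

    Assumption: '?', '.' and the empty string are all treated as null.
    """
    if value is None:
        return None
    if isinstance(value, str) and value.strip() in ("", "?", "."):
        return None
    return int(value)


safe_value = cast_int_or_none(row["oligomeric_count"])
```

With a guard like this, an unset cell and an explicit '?' placeholder both end up as a null instead of an error, while a real value such as "4" is still cast to an integer.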

(I see there is also the dropEmpty flag that might fix this, but I don't know what the wider implications of setting it are.)

Thank you for the report. The oligomeric_count, oligomeric_details, and method_details are
now properly initialized for the case of the generated deposited coordinates assembly. This
change was introduced in V0.52.
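The actual V0.52 change is not shown in this thread. As a rough sketch of what initializing those attributes on the generated row could look like, the following uses a toy stand-in class that mimics only the three calls quoted in the report (getRowCount, setValue, getRow). StandInCategory, add_deposited_assembly, and the choice of '?' as the placeholder are all hypothetical and are not taken from the library.

```python
class StandInCategory:
    """Toy table that mimics only the calls quoted in the report."""

    def __init__(self, attribute_names, rows):
        self._names = list(attribute_names)
        self._rows = [list(r) for r in rows]

    def getRowCount(self):
        return len(self._rows)

    def setValue(self, value, attributeName, rowIndex):
        # Grow the table with None-filled rows so unset cells stay None
        # (an assumption of this stand-in).
        while rowIndex >= len(self._rows):
            self._rows.append([None] * len(self._names))
        self._rows[rowIndex][self._names.index(attributeName)] = value

    def getRow(self, rowIndex):
        return list(self._rows[rowIndex])


def add_deposited_assembly(tObj, placeholder="?"):
    """Append the 'deposited' row and fill the otherwise unset attributes."""
    rowIdx = tObj.getRowCount()
    tObj.setValue("deposited", "id", rowIdx)
    tObj.setValue("deposited_coordinates", "details", rowIdx)
    # Assumption: an explicit placeholder marks these values as unknown.
    for name in ("oligomeric_count", "oligomeric_details", "method_details"):
        tObj.setValue(placeholder, name, rowIdx)
    return rowIdx


names = ["id", "details", "method_details", "oligomeric_details", "oligomeric_count"]
tObj = StandInCategory(names, [["1", "author_defined_assembly", "?", "monomeric", "1"]])
rowIdx = add_deposited_assembly(tObj)
new_row = tObj.getRow(rowIdx)
```

In this sketch no cell of the new row is left as None, so a downstream integer cast sees an explicit placeholder rather than a missing value.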