lmu-bioinformatics/xmlpipedb

Processing Doesn't Occur with new Obo-xml files

Closed this issue · 4 comments

After fixing the issue with the obo-xml files (Issue #34), I found that GO processing no longer occurs. It ends almost instantly, whereas it previously took 3-4 minutes.

[Screenshot: err3]

Oddly enough, GO processing throws an error on a Windows 10 machine but not on a Windows 7 machine, and I'm not sure why. The errors shown below are thrown during the import of the obo-xml file and appear to be related to missing tables.

[Screenshot: err1]
[Screenshot: err2]

I've tried several things to fix this, including editing gmbuilder.sql to add the missing table referenced in the errors, but none of them worked.

I think it would be beneficial to create documentation on how to update the obo-xml processor beyond just running xsd2db and the godbpostprocessor tool. I suspect this issue stems from my not knowing which steps to take; I'm still not sure whether what I did in Issue #34 was the correct procedure.

dondi commented

@NAnguiano and I looked at this on Thursday afternoon and found that the godb.sql file generated by xsd2db from her modified DTD had not been integrated into the gmbuilder.sql file that initializes GenMAPP Builder’s database. As a result, the HoldsOverChain table added by the new DTD never made it into the database.
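A quick way to catch this kind of mismatch before importing is to compare the tables each SQL file creates. The following is only a rough sketch, not part of xsd2db or GenMAPP Builder: the function names are hypothetical, and it assumes both files declare tables with plain `CREATE TABLE name` statements (schema-qualified names are not handled).

```python
import re

# Matches "CREATE TABLE name", allowing an optional IF NOT EXISTS clause
# and optional double quotes around the table name.
CREATE_TABLE = re.compile(
    r'create\s+table\s+(?:if\s+not\s+exists\s+)?"?([A-Za-z_][A-Za-z0-9_]*)"?',
    re.IGNORECASE,
)


def table_names(sql_text):
    """Return the lower-cased set of table names created by the SQL text."""
    return {name.lower() for name in CREATE_TABLE.findall(sql_text)}


def missing_tables(generated_sql, builder_sql):
    """Return tables created in generated_sql but absent from builder_sql."""
    return sorted(table_names(generated_sql) - table_names(builder_sql))
```

Passing the contents of godb.sql as `generated_sql` and the contents of gmbuilder.sql as `builder_sql` would list any table, such as HoldsOverChain, that still needs to be merged in.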

I walked @NAnguiano through this process and she will start with a fresh database using the newly constituted SQL file when she next gets a chance.

Processing is occurring now. Yay! No errors are thrown when the obo-xml file is imported.

However, when the processing begins, the following error is thrown:

[Screenshot: processingerr]

Processing completes successfully even with that error, but I think it's worth noting just in case.

dondi commented

Yes, this error is actually normal. The geneontologystage table is an intermediate table that is created during GO processing. There is code that first tries to delete a pre-existing geneontologystage table before running the processing routine, and on a “fresh” database that has never been processed, that table does not exist. The error you are seeing is the failure of the deletion, and is totally expected for a freshly-initialized database.
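For illustration only, one common way to avoid this kind of failure is to make the deletion conditional with `DROP TABLE IF EXISTS`, which both PostgreSQL and SQLite accept. The sketch below is not the GenMAPP Builder code: it uses Python's built-in `sqlite3` module as a stand-in database, and the function name and staging-table columns are hypothetical.

```python
import sqlite3


def reset_stage_table(conn):
    """Recreate the staging table, tolerating a database where it is absent."""
    # IF EXISTS turns the drop into a no-op on a freshly initialized database.
    conn.execute("DROP TABLE IF EXISTS geneontologystage")
    # Hypothetical columns; the real staging table's schema is not shown here.
    conn.execute(
        "CREATE TABLE geneontologystage (id INTEGER PRIMARY KEY, name TEXT)"
    )
```

Calling `reset_stage_table` on a fresh database and again on a processed one behaves the same way in both cases, so no error is reported for the expected missing table.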

I have periodically looked at ways to squelch this exception because it is actually quite normal. However, my recollection from the last time I visited this code is that the exception-handling structure needed some reorganization: the place where this error is reported is the same place where other, legitimate errors need to be announced to the user. The code reorganization effort required to isolate the capturing of just this exception has so far not matched its priority.
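The kind of reorganization described here could look roughly like the sketch below, which wraps only the cleanup step in its own handler so that the expected "table does not exist" failure is absorbed while every other error still propagates to the shared reporting code. This is an assumption-laden illustration, not the actual processor: it uses Python's `sqlite3` as a stand-in, the function name is hypothetical, and matching on the error message is specific to SQLite.

```python
import sqlite3

STAGE_TABLE = "geneontologystage"


def drop_stage_table(conn):
    """Drop the staging table; return False if it did not exist."""
    try:
        conn.execute(f"DROP TABLE {STAGE_TABLE}")
        return True
    except sqlite3.OperationalError as err:
        # Only the expected "missing table" case is swallowed here.
        if "no such table" in str(err):
            return False
        # Anything else is a legitimate error and is re-raised for the caller.
        raise
```

The processing routine would call `drop_stage_table` first and leave its existing error reporting untouched for everything that follows.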

So I am inclined to close this issue pending this clarification, but per our process I’ll mark it as review requested first before doing so in case anyone has further ideas regarding the issue.

I've tagged this with our "backlog" tag so we can resurrect it sometime in the future. I think that if we document this issue, then we'll know to expect it when the processor gets run again in the future. I've put a comment on Issue #42.