byrokrat/giroapp

xml files are not identified as xml

Closed this issue · 7 comments

I just tested importing some xml files with the import command, and got these errors:

[2019-05-06 21:48:26] [INFO] Importing file medgivande-via-hemsida.xml ()
[2019-05-06 21:48:26] [ERROR] Invalid autogiro file: Parser: Syntax error, expecting "\r", ALPHA-NUMERIC, MSG1 on line 1 (code:210)

Here is an anonymized copy of the xml I tried to import:
https://hastebin.com/kipetiqiqu.xml
If you need the actual file, I can send it to you encrypted...

Side note: I'm not actually using this function, so there's no rush. I have a separate xml parser that I'm still using.

Thanks. Will look into it when time allows.

Digging into this, it's an error from SimpleXML. At line 44 of XmlObject.php, the SimpleXMLElement constructor fails with "String could not be parsed as XML".

The file you posted actually had a few errors in it. Firstly, it wasn't well formed, as the <DocumentElement> didn't have a closing tag, but I guess that was just a copy error on your part.

Then there were multiple errors with mismatched tags.

<Betalarens_x0020_adress_1>donor address 1</Betalares_x0020_adress_1>

is invalid because the opening tag has an extra n (Betalarens versus Betalares). The correct name is the one without the extra n. Not sure how this happened though. Was it you editing to remove sensitive data?
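
To illustrate why a mismatch like this makes the whole file unparseable, here is a minimal Python sketch. It is not giroapp code (giroapp uses PHP's SimpleXML); the helper name `well_formedness_error` and the trimmed-down fragment are made up for the example. Any conforming XML parser should reject the fragment in the same way.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Fragment with the same mismatch as in the posted file:
# the opening tag says "Betalarens", the closing tag says "Betalares".
BROKEN = (
    "<DocumentElement>"
    "<Betalarens_x0020_adress_1>donor address 1</Betalares_x0020_adress_1>"
    "</DocumentElement>"
)

# Same fragment with the extra "n" removed from the opening tag.
FIXED = BROKEN.replace("<Betalarens_", "<Betalares_")


def well_formedness_error(xml_text: str) -> Optional[str]:
    """Return the parser's error message, or None if the text is well formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print("broken:", well_formedness_error(BROKEN))
    print("fixed: ", well_formedness_error(FIXED))
```

The broken fragment yields a parse error, while the fixed one parses cleanly, so a single mistyped tag name is enough to fail the whole document.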

Apart from these errors, I found a separate bug. On invalid content (for example an invalid payer number) the code would flag the XML as invalid (as in not well formed), assume that the file was not an XML file at all, and try to import it as a regular autogiro file, hence the strange error message. This bug is fixed in the referenced PR above. If the file contains invalid data (for example empty payer numbers) the import will still fail, but with a more reasonable error message...
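
The distinction described here, between "not XML at all" and "XML with bad data", can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual fix in giroapp: the names `InvalidContentError`, `looks_like_xml`, `validate_payer_numbers`, and `import_file`, as well as the `payer_number` element, are all hypothetical.

```python
import xml.etree.ElementTree as ET


class InvalidContentError(Exception):
    """Well-formed XML whose data is not acceptable (hypothetical name)."""


def looks_like_xml(text: str) -> bool:
    """True only if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def validate_payer_numbers(text: str) -> int:
    """Count payer numbers; raise InvalidContentError on an empty one.

    The element name 'payer_number' is invented for this sketch.
    """
    root = ET.fromstring(text)
    count = 0
    for node in root.iter("payer_number"):
        if not (node.text or "").strip():
            raise InvalidContentError("empty payer number")
        count += 1
    return count


def import_file(text: str) -> str:
    """Choose the parser by well-formedness only.

    Content errors propagate as InvalidContentError instead of
    triggering the fallback to the plain-text parser.
    """
    if not looks_like_xml(text):
        return "plain"  # would be handed to the regular autogiro parser
    validate_payer_numbers(text)  # may raise InvalidContentError
    return "xml"
```

With this split, a file with an empty payer number fails with a content error instead of being passed to the plain-text parser, which is what produced the misleading syntax error in the original report.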

Found some other issues in how xml-forms are handled that are not bugs, but should still be considered IMHO. Started a different issue to track these ideas.

Can you validate that your original file is importable, or at least results in an understandable error message, with the patched code from #186? In that case I'll try to close this and release alpha5 shortly...

Sorry for my delay, I seem to have lost my github notifications somehow.
Will look into this shortly, but I don't think I edited anything in the XML Tags, just the data.
But it's the most likely explanation, so that's probably what happened anyway

Yes, my mistakes. I must have mistyped when anonymizing the file. Also, the closing tag was lost when I copied the file in the first place.

I do now get more informative error messages. However, every import of an invalid file prints a full stack trace, which might not be ideal.
But yes, actual errors in the XML data are now reported.

Closing this as fixed.