kahst/BirdCLEF2017

About sort data

igo312 opened this issue · 2 comments

Hello kahst.
I get the error 'gbk' codec can't decode bytes in position 19566-19567: illegal multibyte sequence when I read the xml file. Did you get this error too? Or is it only because I wrote comments in Chinese?
Thank you

kahst commented

To me, this does sound like some Unicode confusion. I did not experience that kind of error when I processed the xml data. Have you tried removing the comment? Did it work?

@kahst thank you for the answer. I did not try removing the comment, but I set the encoding to utf-8 and it works.
I also found that the xml file itself already declares utf-8 encoding.
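
For anyone hitting the same error, here is a minimal sketch of the fix described above. It assumes the xml files are UTF-8, as their declaration states. When `open()` is called without an `encoding` argument, Python falls back to the locale's preferred encoding, which is typically gbk on a Chinese-locale Windows system, so the error probably does not come from the Chinese comments. The function names, the sample file and its element names (`Audio`, `Species`) are hypothetical and only serve the illustration; they are not taken from the repository.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real BirdCLEF xml layout may differ.
SAMPLE_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<Audio><Species>Turdus merula 乌鸫</Species></Audio>\n"
)


def read_xml_text(path):
    """Read the file as text, naming the encoding explicitly.

    Without encoding=..., open() uses the locale's preferred encoding,
    which may be gbk rather than utf-8.
    """
    with open(path, encoding="utf-8") as f:
        return f.read()


def parse_xml(path):
    """Alternative: let ElementTree open the file itself.

    ET.parse() reads raw bytes and follows the encoding named in the
    xml declaration, so the locale default is not involved.
    """
    return ET.parse(path).getroot()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.xml")
        # Write the sample as UTF-8 so it matches its own declaration.
        with open(path, "w", encoding="utf-8") as f:
            f.write(SAMPLE_XML)

        print(read_xml_text(path))
        print(parse_xml(path).find("Species").text)
```

Either approach should avoid the decode error as long as the file really is UTF-8; passing a path to `ET.parse()` has the advantage of not hard-coding the encoding in the script.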