gnsync has a problem with mutated vowels
niklassemmler opened this issue · 0 comments
geeknote version: 2.0.15
Hi there,
I tried uploading a bunch of documents that contained mutated vowels (Umlaute in German), and they did not end up in Evernote.
The problem is that gnsync first converts the file content to ASCII:
@log
def _get_file_content(self, path):
    """
    Get file content.
    """
    with codecs.open(path, 'r', encoding='utf-8') as f:
        content = f.read()
    # strip unprintable characters
    content = content.encode('ascii', errors='xmlcharrefreplace')  # <--- HERE!
    content = Editor.textToENML(content=content, raise_ex=True, format=self.format)
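
To illustrate what that encode step does to an umlaut, here is a small sketch. It is written for Python 3 (geeknote 2.0.15 itself targets Python 2), and the sample string is made up for the example rather than taken from gnsync:

```python
# Illustration only: the sample text is hypothetical, not taken from gnsync.
text = "Käse für Öl"

# With errors='xmlcharrefreplace', every character that does not fit into
# ASCII is replaced by a numeric XML character reference.
ascii_bytes = text.encode('ascii', errors='xmlcharrefreplace')

print(ascii_bytes)  # Python 3 prints: b'K&#228;se f&#252;r &#214;l'
```

So after this step the content no longer contains the umlauts themselves, only their numeric references.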
It then converts the content back to Unicode:
@staticmethod
def textToENML(content, raise_ex=False, format='markdown', rawmd=False):
    """
    Transform formatted text to ENML
    """
    if not isinstance(content, str):  # <--- does not allow unicode
        content = ""  # <--- same
    try:
        content = unicode(content, "utf-8")  # <--- breaks mutated vowels
        # add 2 space before new line in paragraph for creating br tags
        content = re.sub(r'([^\r\n])([\r\n])([^\r\n])', r'\1  \n\3', content)
        # content = re.sub(r'\r\n', '\n', content)
Commenting out the marked lines solved the issue for me, but I did not dig deeper, so this may have introduced other problems.
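
A possible alternative to commenting the lines out would be to accept both byte strings and text and normalize them in one place. The following is only a sketch of that idea: `to_text` is a hypothetical helper name that does not exist in geeknote, and I have not tested it against the rest of gnsync.

```python
def to_text(content, encoding="utf-8"):
    """Return content as a text string, or an empty string for other types.

    Hypothetical helper, not part of geeknote.
    """
    text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3
    if isinstance(content, text_type):
        # Already text: keep it as-is so umlauts survive untouched.
        return content
    if isinstance(content, bytes):
        # Byte string: decode it with the given encoding.
        return content.decode(encoding)
    # Anything else (None, numbers, ...) is treated as empty content.
    return u""


# Usage: text and UTF-8 bytes both end up as the same text string.
sample = u"K\u00e4se"
print(to_text(sample) == to_text(sample.encode("utf-8")))
```

With something like this, `textToENML` would no longer need to reject unicode input, and the ASCII conversion in `_get_file_content` might become unnecessary.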
Cheers,
Niklas