chrislit/usfm2osis

usfm2osis.py can fail on non-BMP text

chrislit opened this issue · 0 comments

Cf. http://www.crosswire.org/tracker/browse/MODTOOLS-34

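On CPython before 3.3, UCS-2 (a "narrow" build) is the default internal representation of Unicode, and non-BMP characters are stored as surrogate pairs, so such characters in the input may cause problems. usfm2osis.py should print a warning when it runs on a narrow build.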
Use the following to check for UCS-4 vs. UCS-2 in the compiled interpreter:
import sys
# sys.maxunicode is 0xFFFF on a UCS-2 (narrow) build and 0x10FFFF on a UCS-4 (wide) build
'UCS4' if sys.maxunicode > 0xFFFF else 'UCS2'
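
One way the proposed warning could look is sketched below. The function names (`is_narrow_build`, `has_non_bmp`, `warn_if_risky`) and the warning text are hypothetical and not part of usfm2osis.py; the `maxunicode` parameter exists only so that the check can be exercised on a wide build.

```python
import sys


def is_narrow_build(maxunicode=None):
    """Return True on a UCS-2 (narrow) interpreter, where sys.maxunicode is 0xFFFF."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return maxunicode <= 0xFFFF


def has_non_bmp(text):
    """Return True if text holds a code point above U+FFFF or a surrogate code unit.

    On a narrow build a non-BMP character shows up as two surrogates
    (U+D800..U+DFFF), so both cases are checked.
    """
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF or 0xD800 <= cp <= 0xDFFF:
            return True
    return False


def warn_if_risky(text, maxunicode=None):
    """Write a warning to stderr and return True for non-BMP input on a narrow build."""
    if is_narrow_build(maxunicode) and has_non_bmp(text):
        # Hypothetical message wording, chosen for illustration only.
        sys.stderr.write('WARNING: input contains non-BMP characters, but this '
                         'Python build uses UCS-2; output may be incorrect.\n')
        return True
    return False
```

Calling `warn_if_risky(text)` once per input file, after decoding it to Unicode, would be enough to alert the user without changing the conversion itself.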