Character encoding problem
thomaz-yuji opened this issue · 3 comments
While trying to use the interbase Python package, I got several errors related to character encoding, probably because the data is stored as latin1. The error below is just one of many:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc7 in position 23: invalid continuation byte
0xC7 represents the Ç character (capital letter C with cedilla).
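To illustrate why this byte trips the UTF-8 codec, here is a minimal sketch. The sample bytes and the `try_decode` helper are made up for illustration and do not come from the database in question. It only shows that a Latin-1 encoded `Ç` followed by a byte that is not a UTF-8 continuation byte fails under UTF-8 but decodes cleanly as Latin-1:

```python
# Hypothetical sample: the word "CORAÇÃO" as it would be stored in a
# Latin-1 (ISO 8859-1) column. 0xC7 is "Ç" and 0xC3 is "Ã" in Latin-1.
raw = b"CORA\xc7\xc3O"


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed: {exc.reason} at position {exc.start}"


# In UTF-8, 0xC7 starts a two-byte sequence, so the next byte must be a
# continuation byte (0x80-0xBF). 0xC3 is not, hence the decode error.
utf8_result = try_decode(raw, "utf-8")

# Latin-1 maps every byte to exactly one character, so this always succeeds.
latin1_result = try_decode(raw, "latin-1")

print(utf8_result)
print(latin1_result)
```

If the second decode yields readable text for your own data, that is a strong hint (though not proof) that the column is stored in Latin-1 or a close relative.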
Also, some other errors:
  File "C:\dev\.venv\Lib\site-packages\pandas\io\sql.py", line 2079, in read_query
    columns = [col_desc[0] for col_desc in cursor.description]
                                           ^^^^^^^^^^^^^^^^^^
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3290, in __get_description
    return self._ps.description
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 2202, in __get_description
    precision = (self.cursor._connection._determine_field_precision(sqlvar))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 1096, in _determine_field_precision
    self.__ic.execute("SELECT FIELD_SPEC.RDB$FIELD_PRECISION"
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3411, in execute
    self._ps._execute(parameters)
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3130, in _execute
    raise exception_from_status(DatabaseError, self._isc_status,
interbase.ibcore.DatabaseError: ("Error while executing SQL statement:\n- SQLCODE: -804\n- b'Dynamic SQL Error'\n- b'SQL error code = -804'\n- b'Incorrect values within SQLDA structure'", -804, 335544569)
Exception ignored in: <function Connection.__del__ at 0x00000181BCEB3380>
Traceback (most recent call last):
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 1638, in __del__
    self.__close()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 975, in __close
    self.__ic.close()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3376, in close
    self._ps.close()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3205, in close
    self._free_handle()
  File "C:\dev\.venv\Lib\site-packages\interbase\ibcore.py", line 3061, in _free_handle
    raise exception_from_status(DatabaseError, self._isc_status,
interbase.ibcore.DatabaseError: ("Error while releasing SQL statement handle:\n- SQLCODE: -501\n- b'Dynamic SQL Error'\n- b'SQL error code = -501'\n- b'Attempt to reclose a closed cursor'", -501, 335544569)
Exception ignored in: <function PreparedStatement.__del__ at 0x00000181BCEBD440>
That one is probably caused by an incompatibility between interbase and read_sql from pandas?
My code was based on pyodbc, and now I'm trying to switch to interbase.
Any hints?
I have a similar issue. If I set the charset of the connection to "ISO8859_1", which is correct in my case:
self.conn = interbase.connect(
    host=host,
    user=username,
    password=passwd,
    charset="ISO8859_1",
    database="c:/dbs/mydb.ib",
    ib_library_name="/opt/interbase/lib/libgds.so"
)
I get:
interbase.ibcore.DatabaseError: ("Cursor.fetchone:\n- SQLCODE: -802\n- b'arithmetic exception, numeric overflow, or string truncation'\n- b'Cannot transliterate character between character sets'", -802, 335544321)
If I set the charset to None, it fails on line 365 in ibcore.py:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 18: invalid continuation byte
because the str is encoded in ISO8859_1.
For now I have to use the following workaround:
self.conn = interbase.connect(
    host=host,
    user=username,
    password=passwd,
    charset="ISO8859_1",
    database="c:/dbs/mydb.ib",
    ib_library_name="/opt/interbase/lib/libgds.so"
)
# monkey patch interbase to return bytes instead of strings
def b2u(st, charset):
    "Replacement for interbase.ibcore.b2u: skip decoding and return the raw bytes of result set data."
    return st
interbase.ibcore.b2u = b2u
and decode the column values on my side.
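For anyone following along, the "decode on my side" step could look roughly like the sketch below. The `decode_row` helper and the sample row are hypothetical and not part of the interbase package. It assumes the patched driver hands back `bytes` for text columns and that those bytes are really Latin-1:

```python
def decode_row(row, encoding="latin-1"):
    """Decode every bytes value in a fetched row; leave other values untouched.

    Hypothetical helper: `row` is any sequence of column values, such as the
    tuple returned by a DB-API cursor's fetchone().
    """
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )


# Made-up row standing in for cursor.fetchone() after the monkey patch.
sample_row = (1, b"Jos\xe9", None, 3.5)
decoded = decode_row(sample_row)
print(decoded)
```

Note that this decodes every `bytes` value, so genuinely binary columns (BLOBs, for example) would need to be excluded by the caller.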
Hey 👋 @lmbelo, maybe you can help with this?
Thanks