CHAR(X) fields have invalid length if using UTF8 charset and collations [ODBC184]
firebird-automations opened this issue · 19 comments
Submitted by: Arthur Hoornweg (bobokonijn)
Attachments:
utf8.jpg
odbc_query_test_tool.png
database.rar
OdbcFb.zip
Firebird_ODBC_2.0.4.155_Win32.exe
My 32-bit Delphi application uses ADO (with the microsoft OleDB provider for ODBC) to access a Firebird Embedded database. The Firebird Embedded DLL has version 2.5.3.26780. The database has default character set UTF8.
I noticed that the database is case sensitive. This is undesirable for my application, therefore I re-created the database with character set UTF8, default collation unicode_ci to make it case-insensitive.
Using this new collation, I observe a strange phenomenon.
I have a table that contains a field declared as CHAR(14). When I query the table in my application, the returned strings have length 56 !!! The first 14 characters are OK, the rest is padded with blanks. This does not happen with the default UTF8 collation, only with the case-insensitive ones.
If I query the same table with FlameRobin (which doesn't use ODBC), the field is perfectly OK no matter what collation I use. So I suspect that the bug is in the ODBC driver and not in Firebird itself.
I would be very grateful for a fix, it is a real show-stopper for me.
Commented by: @alexpotapchenko
I can not reproduce it via IBExpert (ODBC viewer) or MS Access (ODBC link), maybe problem in ADO.
It seems that length 56 is normal if you use UTF8 charset in connection settings, see my attached screenshot in IBExpert.
Modified by: @alexpotapchenko
Attachment: utf8.jpg [ 12673 ]
Commented by: Arthur Hoornweg (bobokonijn)
Steps to reproduce:
- Create a new, completely empty firebird embedded database with default character set UTF8.
- Execute the following queries:
CREATE TABLE one(
akey varchar(30) character set ASCII not null primary key,
avalue1 char(14),
avalue2 char(14) CHARACTER SET UTF8 COLLATE UNICODE_CI)'
Note that fields avalue1 and avalue2 are both going to be UTF8 but only avalue2 is case-insensitive.
Now execute:
insert into one (akey,avalue1,avalue2) VALUES ('test','12345678901234','ABCDEFGHIJKLMN')
For evaluating the bug, you need an application that connects to the database using OleDB+ODBC.
Open this database and execute "SELECT (*) from one".
The result:
Field avalue1 is returned as a unicode string with a length of 14.
Field avalue2 is returned as a unicode string with a length of 56.
I can supply a working test case for this (database, Delphi source and compiled binary).
EDIT:
The ADO/OLEDB connection string I use is the following:
Provider=MSDASQL.1;Persist Security Info=False;Extended Properties="Driver={Firebird/Interbase(r) driver}; UID=SYSDBA; PWD=; DBNAME=d:\temp\mydatabase.fdb; CHARSET=UTF8; DIALECT=3"
Commented by: @alexpotapchenko
Well, it seems that this bug can reproduce with ADO only.
Attach your example please (but I don't have Delphi and it is better to reproduce it via standart software (Open Office and etc).
Can you use other charset (WIN1252 for example) in ADO/OLEDB connection settings?
Commented by: Arthur Hoornweg (bobokonijn)
Screenshot of the freeware "odbc query tool".
Please notice how the data fields have different column widths even though they're all declared as CHAR(14).
Modified by: Arthur Hoornweg (bobokonijn)
Attachment: odbc_query_test_tool.png [ 12675 ]
Commented by: Arthur Hoornweg (bobokonijn)
Please find attached the database file.
Modified by: Arthur Hoornweg (bobokonijn)
Attachment: database.rar [ 12676 ]
Commented by: Arthur Hoornweg (bobokonijn)
I think I can reproduce the issue with ODBC alone, no OleDB required.
Please see the attached screenshot and database file.
- First I've installed a 32-bit freeware called "odbc query tool" on my PC.
- Then I've copied the Firebird Embedded drivers (I re-named fbclient.dll into gds32.dll) into its installation directory... just to make sure that ODBC will load the correct drivers and not something else.
- Then I've created a 32-bit ODBC DSN which connects to the aforementioned Firebird database. I did it by executing c:\windows\syswow64\odbccad32.exe. I set the character set of the connection to UTF8.
- I now run ODBC query tool end open the database. It connects successfully.
- It shows me that both columns avalue1 and avalue2 in table "ONE" are declared as CHAR(14). So far, so good.
- Then I performed a " SELECT (akey,avalue1,avalue2,avalue1,avalue2,avalue1,avalue2) FROM ONE ".
- The result is displayed in a grid at the bottom of the page and an anomaly shows up:
- As you can see in the screen shot, the columns displaying avalue1/avalue2 have vastly DIFFERENT WIDTHS.
- If I double-click field "avalue1" to edit it, the cursor is placed at character pos 15.
- If I double-click field "avalue2" to edit it, the cursor is placed at character pos 57.
So it appears that ODBC somehow believes that the two columns have different widths. Something is really wrong here and it has nothing to do with ADO/OLEDB.
Commented by: @alexpotapchenko
Please test attached driver with fix.
Modified by: @alexpotapchenko
Attachment: OdbcFb.zip [ 12677 ]
Commented by: Arthur Hoornweg (bobokonijn)
It works now, as far as I can see! Thanks!
Can we expect a new installer with updated build number ?
Commented by: @alexpotapchenko
The new official release will be not in the near future. If you need installer with this fix I can build it and attach here.
Modified by: @alexpotapchenko
status: Open [ 1 ] => Resolved [ 5 ]
resolution: Fixed [ 1 ]
Fix Version: 2.0.4 [ 10662 ]
Commented by: Arthur Hoornweg (bobokonijn)
That would be very nice!
Main thing, please update the build number so the original dll gets replaced upon installation :-)
Again, many thanks!
Modified by: @alexpotapchenko
summary: CHAR(X) fields have invalid length if using UTF8 collation unicode_ci => CHAR(X) fields have invalid length if using UTF8 charset and collations
Commented by: @alexpotapchenko
Win32 installer in attachment
Modified by: @alexpotapchenko
Attachment: Firebird_ODBC_2.0.4.155_Win32.exe [ 12678 ]
Modified by: @alexpotapchenko
status: Resolved [ 5 ] => Closed [ 6 ]