esnme/ultramysql

Unicode tests failing

Closed this issue · 4 comments

Hi, I'm running the tests on the current head (05166cc) and the two unicode tests are failing. Can you reproduce it or am I doing something wrong?

The tests pass on my machine and database.
Could you give me a "DESCRIBE" of your test database's tables or any error
message you get?

//JT

On 26 February 2012 at 22:33, George Sakkis <reply@reply.github.com> wrote:

> Hi, I'm running the tests on the current head (05166cc) and the two unicode tests are failing. Can you reproduce it or am I doing something wrong?



Jonas Tärnström
Product Manager
e-mail: jonas.tarnstrom@esn.me
skype: full name "Jonas Tärnström"
phone: +46 (0)734 231 552

ESN Social Software AB
www.esn.me

Sure:

mysql> show create table tblutf\G
*************************** 1. row ***************************
       Table: tblutf
Create Table: CREATE TABLE `tblutf` (
  `test_id` int(11) DEFAULT NULL,
  `test_string` varchar(32) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> show create table tbltest\G
*************************** 1. row ***************************
       Table: tbltest
Create Table: CREATE TABLE `tbltest` (
  `test_id` int(11) DEFAULT NULL,
  `test_string` varchar(1024) DEFAULT NULL,
  `test_blob` longblob
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)


(.virtualenv)~/projects/.virtualenv/src/ultramysql $ nosetests tests/tests.py 
.....................FF..
======================================================================
FAIL: testSelectUnicode (tests.TestMySQL)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gsk/projects/.virtualenv/src/ultramysql/tests/tests.py", line 420, in testSelectUnicode
    self.assertEquals([(1, u'piet'), (2, s), (3, s)], result)
AssertionError: Lists differ: [(1, u'piet'), (2, u'r\xc3\xa4... != [(1, u'piet'), (2, u'r\xc3\x83...

First differing element 1:
(2, u'r\xc3\xa4ksm\xc3\xb6rg\xc3\xa5s')
(2, u'r\xc3\x83\xc2\xa4ksm\xc3\x83\xc2\xb6rg\xc3\x83\xc2\xa5s')

  [(1, u'piet'),
-  (2, u'r\xc3\xa4ksm\xc3\xb6rg\xc3\xa5s'),
+  (2, u'r\xc3\x83\xc2\xa4ksm\xc3\x83\xc2\xb6rg\xc3\x83\xc2\xa5s'),
?             ++++++++             ++++++++          ++++++++

-  (3, u'r\xc3\xa4ksm\xc3\xb6rg\xc3\xa5s')]
+  (3, u'r\xc3\x83\xc2\xa4ksm\xc3\x83\xc2\xb6rg\xc3\x83\xc2\xa5s')]
?             ++++++++             ++++++++          ++++++++


======================================================================
FAIL: testUnicodeUTF8 (tests.TestMySQL)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gsk/projects/.virtualenv/src/ultramysql/tests/tests.py", line 589, in testUnicodeUTF8
    self.assertEquals([(1, peacesign_unicode), (2, peacesign_unicode)], result)
AssertionError: Lists differ: [(1, u'\u262e'), (2, u'\u262e'... != [(1, u'\xe2\x98\xae'), (2, u'\...

First differing element 0:
(1, u'\u262e')
(1, u'\xe2\x98\xae')

- [(1, u'\u262e'), (2, u'\u262e')]
+ [(1, u'\xe2\x98\xae'), (2, u'\xe2\x98\xae')]

----------------------------------------------------------------------
Ran 25 tests in 13.501s

FAILED (failures=2)

Hello--

I'm getting the same error, with the exact same error messages/differing elements/show create tables output.

The MySQL distribution is just the one that comes with Ubuntu via apt-get install mysql-server:

Server version: 5.1.41-3ubuntu12.10 (Ubuntu)

The Ubuntu version I'm on is 10.04 for i686:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 10.04.4 LTS
Release: 10.04
Codename: lucid

Haven't had a chance to check it on an x86_64 architecture.

Note that the results returned from umysql look like the UTF-8 encoding of the expected unicode string, with each encoded byte coming back as a separate character. In other words, check this out:

>>> '\xe2\x98\xae'.decode('utf8')
u'\u262e'

and

>>> 'r\xc3\x83\xc2\xa4ksm\xc3\x83\xc2\xb6rg\xc3\x83\xc2\xa5s'.decode('utf8')
u'r\xc3\xa4ksm\xc3\xb6rg\xc3\xa5s'

Not sure if this helps track down something, but I found it interesting :)
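
For anyone who wants to check the pattern without a database, here is a minimal sketch in Python 3 syntax (the session above is Python 2). The helper names `misdecode` and `repair` are made up for illustration and are not part of umysql. It assumes the extra step is "encode as UTF-8, then read the bytes back as Latin-1", which matches both failing assertions but says nothing about where in umysql or the MySQL connection that step actually happens.

```python
def misdecode(text):
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text):
    """Undo one round of the UTF-8-read-as-Latin-1 mix-up."""
    return text.encode("latin-1").decode("utf-8")


# Expected and observed values copied from the two failing assertions.
peace_expected = "\u262e"
peace_observed = "\xe2\x98\xae"
shrimp_expected = "r\xc3\xa4ksm\xc3\xb6rg\xc3\xa5s"
shrimp_observed = "r\xc3\x83\xc2\xa4ksm\xc3\x83\xc2\xb6rg\xc3\x83\xc2\xa5s"

# Both failures fit the same single extra encode/decode round.
print(misdecode(peace_expected) == peace_observed)
print(misdecode(shrimp_expected) == shrimp_observed)

# Reversing that round recovers what the tests expected.
print(repair(peace_observed) == peace_expected)
print(repair(shrimp_observed) == shrimp_expected)
```

If all four comparisons hold, the two test failures are the same symptom: the values come back with one more UTF-8 encoding round than the tests expect.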

Try with the latest release. Changes have been made to how BLOBs are interpreted.