Feature request: support unicode characters.
parro-it opened this issue · 2 comments
If I try to build an update that set a a field to a string containing unicode characters (in this example I put an emoji) I get this error:
(node:29099) UnhandledPromiseRejectionWarning: Error: ER_TRUNCATED_WRONG_VALUE_FOR_FIELD: Incorrect string value: '\xF0\x9F\x98\x98' for column 'title' at row 1
I'm using mysql, should I use an extension for mysql or something?
I get the same error if I try to run the generated query directly in Sequel Pro
, so I think the emoji symbol has to be escaped to unicode, eg. '\\u00F0\\u009F\\u0098\\u0098'
I'm getting this results using toParams method:
{ text: 'UPDATE recipes SET id = ?, title = ?, body = ? WHERE id = ?',
values:
[ '1',
'fagioli alla messicana 😗',
'Apri i fagioli, aggiungi la salsa e l\'aglio, cuoci. \r\nA fuoco lento o rovini tutto. \r\nPepa tanto.',
'1' ] }
@parro-it, sorry for the delay responding.
I'm not sure if this is the issue you're running into, but apparently MySQL's "utf8" encoding doesn't handle 4-byte characters (like your emoji example); to get it to support four-byte characters, you have to switch it from "utf8" to "utf8mb4" (which supports all utf8 characters): In MySQL, never use “utf8”. Use “utf8mb4”.
I don't think sql-bricks should escape unicode characters, since it's often unnecessary and it's DB/connection-specific when it is, so it should usually be done in the DB driver (node-mysql
, etc), or if not there, in application code.
If you do end up needing to do it in application code and you need a centralized, convenient place, one option would be for your application to wrap SQLBrick's toString()
or toParams()
. If you need help figuring out how to do that, or if it seems like it's not possible to do that (I'm not 100% sure off the top of my head), let me know & I should be able to help with that.
@parro-it, quick follow-up, it looks like node-mysql
will allow you to specify {charset: 'utf8mb4'}
in the connection options & that they're planning to make this the default in v3.0: Changes default encoding from UTF8_GENERAL_CI to UTF8MB4_UNICODE_CI.
However, it sounds like that alone isn't enough -- you also need to convert your database to this encoding: How to support full Unicode in MySQL databases.