Duplicate entries for tags with spÉcial characters
benbonnet opened this issue · 27 comments
Updated and all, but the problem remains.
If for example the word "Numérique" is saved, and if it already exists, acts_as_taggable breaks.
Same will go for "numerique".
Is there a way to handle it ?
You have to provide either a test case or a backtrace.
If you are using MySQL, check your encoding.
Sorry for that
Here is a gist of what occurs : https://gist.github.com/bbnnt/7aadc59bea0edb24630d
"Numérique" exists and saves, but "Numerique" neither exists nor saves; it breaks instead. As far as I can understand, it is because of the key, but I really don't know how to solve it
The table's encoding is UTF-8 Unicode and its collation is utf8_general_ci
I can't reproduce this bug.
Can you reproduce it with a fresh database/app ?
Yes :/
I've done a rake db:drop db:create db:migrate
Then tried again, starting with only the command that you'll see in the gist / capture
https://gist.github.com/bbnnt/d5a5489182ac7bfe9be8
( I've highlighted the commands to make it clearer here http://cl.ly/image/1F391b0n3O1x?_ga=1.72731347.1360774124.1418737841 )
I checked again, the table has the encoding specified above, by default
Could reproduce this bug. It also happens with the Japanese space -> " "
Too bad it is not such an issue for the maintainer, and the logs provided pretty much show that it does occur
@bbnnt : Too bad it is not such an issue for the maintainer, and the logs provided pretty much show that it does occur
That's not nice of you. This is an open-source project; you can step up, fix it, and send a PR.
Your argument would be valid if I had refused to merge a fix or ignored one.
@seuros that was not an argument; but more or less to get a reply from you (:
I'll get into this
👍
@bbnnt This project is really hard to maintain. It's a really old code base with lots of bugs, frequently reported issues, and none of the maintainers, as far as I know, use it anymore.
@bf4 damn i'm getting old-schooled ! would you recommend another gem that has similar functionalities ?
@bbnnt The fact that we don't use this gem doesn't mean we are using a better alternative. It simply means we don't need it.
@bbnnt the problem is related to the COLLATION actually applied to the 'name' column when a new tag name is about to be stored.
A tag 'name' is stored by the gem as a binary-encoded string, but if the collation of that column is not 'utf8_bin', comparisons are not made properly, and so the uniqueness constraint expressed by the index 'index_tags_on_name' generates the error you experienced.
As a quick workaround, you could alter the 'name' column of the 'tags' table, e.g. in MySQL:
ALTER TABLE tags MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;
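To make the mismatch concrete, here is a rough Ruby sketch of the two comparison modes. It is only a model: general_ci_key is a hypothetical stand-in that approximates utf8_general_ci by stripping accents and folding case, which is not MySQL's actual algorithm, and bin_key stands in for utf8_bin by comparing raw bytes. The assumption is that the gem's lookup behaves like the byte comparison while the unique index behaves like the folded one.

```ruby
# Rough stand-in for utf8_general_ci (an approximation, not MySQL's real
# algorithm): decompose, strip combining accents, and fold case.
def general_ci_key(name)
  name.unicode_normalize(:nfd).gsub(/\p{Mn}/, '').downcase
end

# Stand-in for utf8_bin: compare the raw bytes of the string.
def bin_key(name)
  name.bytes
end

# True when two names would be treated as equal under the given key function.
def collide?(a, b, key)
  send(key, a) == send(key, b)
end

existing = "Numérique"
incoming = "Numerique"

puts collide?(existing, incoming, :bin_key)        # => false
puts collide?(existing, incoming, :general_ci_key) # => true
```

Under this model the byte-wise lookup does not find "Numerique", so an insert is attempted, and the accent-insensitive unique index then rejects it as a duplicate of "Numérique". With utf8_bin on the column, both sides compare bytes and agree.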
@rikettsie wow it seems to have solved the problem (:
thx a lot
You are welcome ;-)
@rikettsie : Could you update the readme and send a pr ?
@seuros @rikettsie @bbnnt Since the change_collation_for_tag_names now gets copied over together with the other migrations when calling 'rake acts_as_taggable_on_engine:install:migrations', is it still necessary to set the initializer 'ActsAsTaggableOn.force_binary_collation = true' as described in the readme?
Because if I do so, it will throw a MySQL error when running 'rake db:migrate', which doesn't happen when I do not set the initializer.
@Morred it seems that it's already included in the migrations. After running ./bin/rake db:migrate
part of the output was:
== 20150427105348 ChangeCollationForTagNames: migrating =======================
-- execute("ALTER TABLE tags MODIFY name varchar(255) CHARACTER SET utf8 COLLATE utf8_bin;")
-> 0.0512s
== 20150427105348 ChangeCollationForTagNames: migrated (0.0522s) ==============
@Morred can you provide the error you obtain while migrating?
@carlosescri, yes the behaviour is the same as setting force_binary_collation = true. Having the parameter exposed is useful because one can switch it to false/true as preferred independently of migration.
@carlosescri Yes, it's in the migrations that get copied over when you run "rake acts_as_taggable_on_engine:install:migrations".
@rikettsie If I set ActsAsTaggableOn.force_binary_collation = true in the initializer, I get this error when running "rake db:migrate":
-- execute("ALTER TABLE tags MODIFY name varchar(255) CHARACTER SET utf8 COLLATE utf8_bin;")
rake aborted!
ActiveRecord::StatementInvalid: Mysql2::Error: Table 'foo_development.tags' doesn't exist: ALTER TABLE tags MODIFY name varchar(255) CHARACTER SET utf8 COLLATE utf8_bin;
/Users/laura/foo/bar/config/initializers/acts_as_taggable_on.rb:1:in `<top (required)>'
/Users/laura/foo/bar/config/environment.rb:5:in `<top (required)>'
Mysql2::Error: Table 'foo_development.tags' doesn't exist
/Users/laura/foo/bar/config/initializers/acts_as_taggable_on.rb:1:in `<top (required)>'
/Users/laura/foo/bar/config/environment.rb:5:in `<top (required)>'
Tasks: TOP => db:migrate => environment
This happens only if the migrations that come from "rake acts_as_taggable_on_engine:install:migrations" haven't run yet, and therefore the tags table doesn't exist yet. If I remove the line in the initializer, the migrations run without a problem, and if I re-add the line after that, I can also run it without issues once the table exists. So I guess it tries to run the initializer before the actual migrations for some reason?
Yes @Morred, you should install and run migrations before adding the force_binary_collation parameter to the initializer file (otherwise the parameter gets executed when the application environment loads the first time and the table does not exist yet).
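Until a release with the fix is available, one possible workaround is to guard the setting so it is only applied once the table exists. This is a sketch under assumptions: enable_binary_collation_if_ready is a hypothetical helper, not part of the gem, and it expects a connection object that responds to table_exists? (as ActiveRecord connection adapters do) plus a config object that responds to force_binary_collation=.

```ruby
# Hypothetical helper (not part of acts-as-taggable-on): apply the setting
# only when the 'tags' table already exists, so that loading the environment
# for "rake db:migrate" on a fresh database does not run the ALTER TABLE.
def enable_binary_collation_if_ready(connection, taggable_config)
  return false unless connection.table_exists?('tags')

  taggable_config.force_binary_collation = true
  true
end

# In config/initializers/acts_as_taggable_on.rb this might be called as
# (assumption, shown as a comment so the sketch runs without Rails):
#
#   enable_binary_collation_if_ready(ActiveRecord::Base.connection, ActsAsTaggableOn)
```

On a fresh clone the helper returns false and leaves the setting alone, so the migrations can create the table first; on the next boot the setting is applied as usual. It does not cover the case where the database itself does not exist yet.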
I can fix this side case.
@rikettsie That would be great. If it's only myself, I can run the migrations and then add that line, but if there are more people working on a project and somebody git clones the thing and then tries to run the migrations, they would have to remove the line, run the migrations and then add the line back into the initializer, which is somewhat inconvenient. A fix would be very much appreciated!
@Morred, I fixed it yesterday and the diff was merged into master. It is ready for the next version.
@rikettsie Awesome, thanks!
For anyone using the utf8mb4 character set, the migration should be something like:
ALTER TABLE tags MODIFY name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
And in the case of MySQL 5.6:
ALTER TABLE tags MODIFY name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
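The 191 most likely comes from InnoDB's index key prefix limit, which is 767 bytes with the default row format on MySQL 5.6. A quick arithmetic sketch in Ruby, assuming that limit and the maximum bytes per character of each character set (3 for utf8, 4 for utf8mb4); the constant and helper names are made up for illustration:

```ruby
# Assumed InnoDB index key prefix limit for MySQL 5.6 defaults, in bytes.
INNODB_MAX_KEY_BYTES = 767

# Longest VARCHAR length (in characters) that can be fully indexed.
def max_indexable_chars(bytes_per_char, limit = INNODB_MAX_KEY_BYTES)
  limit / bytes_per_char
end

# Whether an indexed VARCHAR(length) column stays within the limit.
def fits_in_index?(length, bytes_per_char, limit = INNODB_MAX_KEY_BYTES)
  length * bytes_per_char <= limit
end

puts max_indexable_chars(3)  # utf8    => 255
puts max_indexable_chars(4)  # utf8mb4 => 191
puts fits_in_index?(255, 4)  # => false (255 * 4 = 1020 bytes)
puts fits_in_index?(191, 4)  # => true  (191 * 4 = 764 bytes)
```

So VARCHAR(255) fits a unique index with utf8 but not with utf8mb4 under that limit, which is why the column is shortened to 191.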