Fonetica implements the BuscaBR algorithm to match misspelled or ambiguous names in Brazil.
I once had to run a phonetic search on a database of people using the Soundex algorithm, but it didn't work for Brazilian names such as “Wagner Batista” and “Vagner Baptista”.
A Google search then pointed me to the BuscaBR algorithm.
    require 'fonetica'

    'Wagner Batista'.foneticalize  #=> "VM BT"
    'Vagner Baptista'.foneticalize #=> "VM BT"
You can use Fonetica to search with ActiveRecord like this:
    class Person < ActiveRecord::Base
      scope :search, lambda { |name|
        where("#{quoted_table_name}.fonetica LIKE ?", "#{name.foneticalize}%")
      }

      before_save :foneticalize

      protected

      def foneticalize
        self.fonetica = name.foneticalize
      end
    end
If you want to match any part of the name, change the scope to:
    scope :search, lambda { |name|
      where("#{quoted_table_name}.fonetica LIKE ?", "%#{name.foneticalize}%")
    }
Remember to add an index on the fonetica column.
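To make the difference between the two scopes concrete without a database, here is a minimal plain-Ruby sketch that roughly mimics the two LIKE patterns on strings. The helper names `prefix_match?` and `any_part_match?` are hypothetical and not part of Fonetica, and the keys are assumed to be already encoded; the stored value is the one shown above for “Wagner Batista”.

```ruby
# Hypothetical helpers (not part of Fonetica) that mimic the two SQL
# patterns on plain strings. Keys are assumed to be already encoded.

# Mimics `fonetica LIKE 'key%'`: the stored value must start with the key.
def prefix_match?(stored, key)
  stored.start_with?(key)
end

# Mimics `fonetica LIKE '%key%'`: the key may appear anywhere in the value.
def any_part_match?(stored, key)
  stored.include?(key)
end

stored = "VM BT" # value documented above for "Wagner Batista"

puts prefix_match?(stored, "VM")   # first word of the key: matches
puts prefix_match?(stored, "BT")   # second word only: no prefix match
puts any_part_match?(stored, "BT") # second word only: matches anywhere
```

One design note: in most databases an ordinary index can serve the prefix pattern, while a pattern with a leading `%` usually forces a full scan, so the "any part" scope may be noticeably slower on large tables.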
Please provide appropriate test coverage and keep the documentation up to date. Bonus points if you make your changes in a clean topic branch rather than master, and if you create a pull request so your changes can be discussed and reviewed.
Please also keep your commits atomic so that they are more likely to apply cleanly. That means that each commit should contain the smallest possible logical change. Don’t commit two features at once, don’t update the gemspec at the same time you add a feature, don’t fix a whole bunch of whitespace in a file at the same time you change a few lines, etc, etc.
    $ git clone http://github.com/sobrinho/fonetica
    $ cd fonetica
    $ bundle install
    $ rake test
Fonetica is hosted on GitHub: http://github.com/sobrinho/fonetica, where your contributions, forks, comments and feedback are very welcome.
Copyright © 2010 Gabriel Sobrinho, released under the MIT license.