/fonetica

BuscaBR algorithm which allow the comparison of words based on their phonetic likeness

Primary LanguageRubyMIT LicenseMIT

Fonetica

Fonetica implements the BuscaBR algorithm to match misspelled or ambiguous names at Brazil.

The Story

One day I had to perform a phonetic search on a people database using the Soundex alghoritm but didn’t work for names at Brazil like “Wagner Batista” and “Vagner Baptista”.

Then Google suggested me to read the BuscaBR algorithm.

Usage

  require 'fonetica'

  'Wagner Batista'.foneticalize #=> "VM BT"
  'Vagner Baptista'.foneticalize #=> "VM BT"

Using with ActiveRecord

You can use the fonetica to search on ActiveRecord like this:

  class Person < ActiveRecord::Base
    scope :search, lambda { |name| where("#{quoted_table_name}.fonetica LIKE ?", "#{name.foneticalize}%") }

    before_save :foneticalize

    protected

    def foneticalize
      self.fonetica = name.foneticalize
    end
  end

If you want to match any part, you should change scope to:

  scope :search, lambda { |name| where("#{quoted_table_name}.fonetica LIKE ?", "%#{name.foneticalize}%") }

Remember to add a index on fonetica column.

How to contribute

Please ensure that you provide appropriate test coverage and ensure the documentation is up-to-date. Bonus points if you perform your changes in a clean topic branch rather than master, and if you create a pull request for your changes to be discussed and reviewed.

Please also keep your commits atomic so that they are more likely to apply cleanly. That means that each commit should contain the smallest possible logical change. Don’t commit two features at once, don’t update the gemspec at the same time you add a feature, don’t fix a whole bunch of whitespace in a file at the same time you change a few lines, etc, etc.

Development environment

  $ git clone http://github.com/sobrinho/fonetica
  $ cd fonetica
  $ bundle install
  $ rake test

Project info

Fonetica is hosted on Github: http://github.com/sobrinho/fonetica, where your contributions, forkings, comments and feedback are greatly welcomed.

Copyright © 2010 Gabriel Sobrinho, released under the MIT license.