textacular

DESCRIPTION:

Textacular exposes full text search capabilities from PostgreSQL, extending ActiveRecord with scopes making search easy and fun!

FEATURES/PROBLEMS:

  • Only works with PostgreSQL
  • Anything that mucks with the SELECT statement (notably pluck) is likely to cause problems.

SYNOPSIS:

Quick Start

In the project's Gemfile add

gem 'textacular', '~> 5.0'

Rails 3, Rails 4

In the project's Gemfile add

gem 'textacular', '~> 4.0'

ActiveRecord outside of Rails

require 'textacular'

ActiveRecord::Base.extend(Textacular)

Usage

Your models now have access to search methods:

The #basic_search method is what you might expect: it looks literally for what you send to it, doing nothing fancy with the input:

Game.basic_search('Sonic') # will search through the model's :string columns
Game.basic_search(title: 'Mario', system: 'Nintendo')

The #advanced_search method lets you use Postgres's search syntax like '|', '&' and '!' ('or', 'and', and 'not') as well as some other craziness. The ideal use for advanced_search is to take a search DSL you make up for your users and translate it to PG's syntax. If for some reason you want to put user input directly into an advanced search, you should be sure to catch exceptions from syntax errors. Check [the Postgres docs](http://www.postgresql.org/docs/9.2/static/datatype-textsearch.html) for more:

Game.advanced_search(title: 'Street|Fantasy')
Game.advanced_search(system: '!PS2')
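
As a rough illustration of the "make up a DSL and translate it" idea, the sketch below converts a tiny, hypothetical query language into tsquery syntax. The method name to_tsquery_syntax and the DSL rules (whitespace means 'and', the keyword OR means 'or', a leading - means 'not') are assumptions made for this example and are not part of textacular:

```ruby
# Hypothetical mini-DSL, not provided by textacular:
#   words are ANDed, the keyword OR means "or", a leading "-" negates a word.
def to_tsquery_syntax(input)
  parts = []
  pending_operator = nil

  input.split.each do |token|
    if token == 'OR'
      # Remember the "or", but only if there is a term to its left.
      pending_operator = '|' unless parts.empty?
      next
    end

    negated = token.start_with?('-')
    # Keep only letters and digits so stray punctuation cannot produce
    # a tsquery syntax error.
    term = token.gsub(/[^[:alnum:]]/, '')
    next if term.empty?

    parts << (pending_operator || '&') unless parts.empty?
    parts << (negated ? "!#{term}" : term)
    pending_operator = nil
  end

  parts.join
end

puts to_tsquery_syntax('Street OR Fantasy') # prints: Street|Fantasy
puts to_tsquery_syntax('mario kart -wii')   # prints: mario&kart&!wii
```

The result could then be passed along, for example Game.advanced_search(title: to_tsquery_syntax(user_input)). Because the sketch drops punctuation, it can return an empty string, so a caller would probably want to skip the search in that case. If raw input is passed through instead, syntax errors from Postgres typically surface in Rails as ActiveRecord::StatementInvalid.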

The #web_search method lets you use the websearch_to_tsquery function (available in Postgres 11 and later), which supports a web-search-like syntax:

  • unquoted text: text not inside quote marks will be converted to terms separated by & operators, as if processed by plainto_tsquery.
  • "quoted text": text inside quote marks will be converted to terms separated by <-> operators, as if processed by phraseto_tsquery.
  • OR: logical or will be converted to the | operator.
  • -: the logical not operator, converted to the ! operator.

Game.web_search(title: '"Street Fantasy"')
Game.web_search(title: 'Street OR Fantasy')
Game.web_search(system: '-PS2')

Finally, the #fuzzy_search method lets you use Postgres's trigram search functionality.

In order to use this, you'll need to make sure your database has the pg_trgm module installed. Create and run a migration to install the module:

rake textacular:create_trigram_migration
rake db:migrate

Once that's installed, you can use it like this:

Comic.fuzzy_search(title: 'Questio') # matches Questionable Content

Note that fuzzy searches are subject to a similarity threshold imposed by the pg_trgm module. The default is 0.3, meaning that the trigram similarity between your search content and the stored string must exceed 0.3 (roughly, the two must share about 30% of their three-character sequences). For example:

Comic.fuzzy_search(title: 'Pearls') # matches Pearls Before Swine
Comic.fuzzy_search(title: 'Pear') # does not match Pearls Before Swine
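
To see why 'Pearls' matches and 'Pear' does not, here is a minimal pure-Ruby sketch that approximates how pg_trgm scores two strings. The helper names trigrams and trigram_similarity are made up for this example, and the real module may differ in details (for instance in how it treats non-ASCII characters), so treat the numbers as an illustration rather than a guarantee:

```ruby
require 'set'

# Approximates pg_trgm's trigram extraction: lowercase the text, split it
# into alphanumeric words, pad each word with two leading spaces and one
# trailing space, then collect every run of three characters.
def trigrams(text)
  text.downcase.scan(/[[:alnum:]]+/).each_with_object(Set.new) do |word, set|
    padded = "  #{word} "
    (0..padded.length - 3).each { |i| set << padded[i, 3] }
  end
end

# Similarity = shared trigrams / all distinct trigrams across both strings.
def trigram_similarity(a, b)
  left = trigrams(a)
  right = trigrams(b)
  union = left | right
  return 0.0 if union.empty?

  (left & right).size.to_f / union.size
end

puts trigram_similarity('Pearls', 'Pearls Before Swine').round(2)   # prints: 0.35
puts trigram_similarity('Pear', 'Pearls Before Swine').round(2)     # prints: 0.19
puts trigram_similarity('Questio', 'Questionable Content').round(2) # prints: 0.32
```

Under this approximation, 'Pearls' and 'Questio' land just above the default 0.3 threshold, while 'Pear' falls below it.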

The default similarity threshold is set by PostgreSQL, but it can be changed on a per-connection basis, for example:

ActiveRecord::Base.connection.execute("SELECT set_limit(0.9);")

For more info, view the pg_trgm documentation, specifically F.35.2. Functions and Operators.

Searches are also chainable:

Game.fuzzy_search(title: 'tree').basic_search(system: 'SNES')

If you want to search on two or more fields with the OR operator, use a hash for the conditions and pass false as the second parameter:

Game.basic_search({name: 'Mario', nickname: 'Mario'}, false)

Setting Language

To set the proper search dictionary, override the searchable_language class method on your model:

def self.searchable_language
  'russian'
end

Your queries will then use that dictionary. Don't forget to use the same language in your index migrations, as shown under Creating Indexes for Super Speed.

Setting Searchable Columns

To change the default behavior of searching all text and string columns, override the searchable_columns class method on your model:

def self.searchable_columns
  [:column1, :column2]
end

Creating Indexes for Super Speed

You can have PostgreSQL use an index for the full-text search. To declare a full-text index, add code like the following to a migration:

For basic_search

add_index :email_logs, %{to_tsvector('english', subject)}, using: :gin
add_index :email_logs, %{to_tsvector('english', email_address)}, using: :gin

For fuzzy_search

add_index :email_logs, :subject, using: :gist, opclass: :gist_trgm_ops
add_index :email_logs, :email_address, using: :gist, opclass: :gist_trgm_ops

In the above example, the table email_logs has two text columns that we search against, subject and email_address. You will need to add an index for every text/string column you query against, or else PostgreSQL will fall back to a full table scan instead of using the indexes.
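
If a model overrides searchable_language, the dictionary named in the index expression has to change with it. One way to keep the two in step is a small helper that builds the expression string. The sketch below is only an illustration: tsvector_index_expression is a hypothetical name, not something textacular provides, and it assumes the same expression form as the add_index lines above:

```ruby
# Hypothetical helper, not provided by textacular: builds the expression for
# a full-text index so its dictionary matches the model's searchable_language.
def tsvector_index_expression(column, language = 'english')
  [column, language].each do |name|
    # Only allow plain identifiers, since the values are interpolated into SQL.
    unless name.to_s.match?(/\A[a-z_][a-z0-9_]*\z/)
      raise ArgumentError, "unexpected name: #{name.inspect}"
    end
  end

  "to_tsvector('#{language}', #{column})"
end

puts tsvector_index_expression(:subject, 'russian')
# prints: to_tsvector('russian', subject)

# In a migration this might be used as follows (assumption, not run here):
#   add_index :email_logs, tsvector_index_expression(:subject, 'russian'), using: :gin
```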

REQUIREMENTS:

  • ActiveRecord
  • Ruby 1.9.2

INSTALL:

$ gem install textacular

Contributing

If you'd like to contribute, please see the contribution guidelines.

Releasing

Maintainers: Please make sure to follow the release steps when it's time to cut a new release.

LICENSE:

(The MIT License)

Copyright (c) 2011 Aaron Patterson

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the 'Software'), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.