
Acts As Indexed is a plugin that provides a pain-free way to add fulltext search to your Ruby on Rails app.


acts_as_indexed

If you find this plugin useful, please consider a donation to show your support!

www.paypal.com/cgi-bin/webscr?cmd=_send-money

Paypal address: dougal.s@gmail.com

Instructions

This plugin allows boolean-queried fulltext search to be added to any Rails app with no dependencies and minimal setup.


Installation

Add to your Gemfile

gem 'acts_as_indexed'

Run bundle install. Done.

Still on Rails 2.x.x without Bundler?

./script/plugin install git://github.com/dougal/acts_as_indexed.git

If you don’t have git installed but still want the plugin, you can download it from the GitHub page (github.com/dougal/acts_as_indexed) and unpack it into the vendor/plugins directory of your Rails app.

Upgrading

When upgrading to a new version of acts_as_indexed it is recommended you delete the index directory and allow it to be rebuilt.
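As a minimal sketch of the step above, the index directory can be removed with a few lines of Ruby before the app is restarted. The helper name clear_index and the example path are assumptions for illustration; use whatever index location your own app is configured with.

```ruby
require 'fileutils'

# Hypothetical helper, not part of acts_as_indexed.
# Deletes the given index directory (and everything in it) so that the
# index is rebuilt from scratch. Returns true once the directory is gone.
def clear_index(index_dir)
  FileUtils.rm_rf(index_dir)
  !File.exist?(index_dir)
end

# Example call (the path is an assumption, adjust it to your app):
# clear_index(File.join(Dir.pwd, 'tmp', 'index'))
```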

Usage

Setup

Add acts_as_indexed to the top of any models you want to index, along with a list of the fields you wish to be indexed.

class Post < ActiveRecord::Base
  acts_as_indexed :fields => [:title, :body]

  ...
end

The fields are not limited to model fields, but can be any instance method of the current model.

class User < ActiveRecord::Base
  acts_as_indexed :fields => [:address, :fullname]

  def fullname
    self.firstname + ' ' + self.lastname
  end

  ...
end
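Because an indexed field can be any instance method, it is worth making such methods tolerant of missing data. The following is a sketch using a plain Ruby Struct as a stand-in for the model (DemoUser is a hypothetical name, and no Rails is required to run it). It shows a fullname that does not raise when firstname or lastname is nil.

```ruby
# Plain Ruby stand-in for the User model above (hypothetical).
DemoUser = Struct.new(:firstname, :lastname) do
  # Joins the name parts, skipping any that are nil or blank.
  def fullname
    [firstname, lastname].compact.map(&:strip).reject(&:empty?).join(' ')
  end
end

DemoUser.new('Ada', 'Lovelace').fullname # => "Ada Lovelace"
DemoUser.new('Ada', nil).fullname        # => "Ada"
```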

Any of the configuration options in the Further Configuration section can be passed to the acts_as_indexed method call. These will override any defaults or global configuration.

You can specify a Proc that must evaluate to true before an item is indexed. This is useful if you only want items with a certain state to be included. The Proc is passed the current object’s instance so you are able to test against that.

For example, if you have a visible column that is false if the post is hidden, or true if it is visible, you can filter the index by doing:

class Post < ActiveRecord::Base
  acts_as_indexed :fields => [:title, :body], :if => Proc.new { |post| post.visible? }
  ...
end
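To make the behaviour of the :if condition concrete, here is a sketch in plain Ruby. DemoPost and indexable_records are hypothetical names used only for illustration. They mimic the idea that a record is indexed only when the Proc returns true for it, and they are not the plugin's implementation.

```ruby
# Plain Ruby stand-in for the Post model (hypothetical; no Rails required).
DemoPost = Struct.new(:title, :visible) do
  def visible?
    visible == true
  end
end

# Hypothetical helper: keeps only the records for which the condition is true.
def indexable_records(records, condition)
  records.select { |record| condition.call(record) }
end

# Same shape as the Proc passed to :if above.
only_visible = Proc.new { |post| post.visible? }
posts = [DemoPost.new('Public post', true), DemoPost.new('Hidden post', false)]

indexable_records(posts, only_visible).map(&:title) # => ["Public post"]
```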

Searching

With Relevance

To search with the most relevant matches appearing first, call the find_with_index method on your model, passing a query as the first argument. The optional ids_only parameter, when set to true, will return only the IDs of any matching records.

# Returns array of Post objects ordered by relevance.
my_search_results = Post.find_with_index('my search query')

# Pass any of the ActiveRecord find options to the search.
my_search_results = Post.find_with_index('my search query',{:limit => 10}) # return the first 10 matches.

# Returns array of IDs ordered by relevance.
my_search_results = Post.find_with_index('my search query',{},{:ids_only => true}) # =>  [12,19,33...

Without Relevance (Scope)

If the relevance of the results is not important, call the with_query named scope on your model, passing a query as an argument.

# Returns array of Post objects.
my_search_results = Post.with_query('my search query')

# Chain it with any number of ActiveRecord methods and named_scopes.
my_search_results = Post.public.with_query('my search query').find(:all, :limit => 10) # return the first 10 matches which are public.

Query Options

The following query operators are supported:

AND

This is the default option. ‘cat dog’ will find records matching ‘cat’ AND ‘dog’.

NOT

‘cat -dog’ will find records matching ‘cat’ AND NOT ‘dog’.

INCLUDE

‘cat +me’ will find records matching ‘cat’ and ‘me’, even if ‘me’ is smaller than the min_word_size.

“”

Quoted terms are matched as phrases. ‘“cat dog”’ will find records matching the whole phrase. Quoted terms can be preceded by the NOT operator; ‘cat -“big dog”’ etc. Quoted terms can include words shorter than the min_word_size.

^

Terms that begin with ^ will match records that contain a word starting with the term. ‘^cat’ will find matches containing ‘cat’, ‘catapult’, ‘caterpillar’ etc.

^“”

A quoted term that begins with ^ matches any phrase that begins with this phrase. ‘^“cat d”’ will find records matching the whole phrases “cat dog” and “cat dinner”. This type of search is useful for autocomplete inputs.
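The operators above can be illustrated with a small query classifier. This is a sketch only: parse_query and QUERY_PREFIXES are hypothetical names, the code is not the plugin's own parser, and it assumes queries use plain ASCII double quotes for phrases.

```ruby
# Hypothetical mapping from a leading character to the operator it denotes.
QUERY_PREFIXES = { '-' => :not, '+' => :include, '^' => :starts_with }.freeze

# Hypothetical helper, not part of acts_as_indexed.
# Splits a query into terms, recording each term's operator and whether it
# was written as a quoted phrase. Terms without a prefix default to :and.
def parse_query(query)
  # Each match: optional prefix, then either a "quoted phrase" or a bare word.
  query.scan(/([-+^]?)(?:"([^"]*)"|([^\s"]+))/).map do |prefix, phrase, word|
    {
      operator: QUERY_PREFIXES.fetch(prefix, :and),
      term: phrase || word,
      phrase: !phrase.nil?
    }
  end
end

parse_query('cat -dog ^"cat d"')
# => [{ operator: :and,         term: "cat",   phrase: false },
#     { operator: :not,         term: "dog",   phrase: false },
#     { operator: :starts_with, term: "cat d", phrase: true }]
```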

Pagination

With Relevance

Pagination is supported via the paginate_search method whose first argument is the search query, followed by all the standard will_paginate arguments.

@images = Image.paginate_search('girl', :page => 1, :per_page => 5)

Without Relevance (Scope)

Since with_query is a named scope, WillPaginate can be used in the normal fashion.

@images = Image.with_query('girl').paginate(:page => 1, :per_page => 5)
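For readers unfamiliar with the :page and :per_page arguments, the following sketch shows the arithmetic they imply, applied to an array of IDs such as one returned with ids_only. The helper page_of is a hypothetical name; it illustrates the slicing and is not how will_paginate or the plugin implement pagination.

```ruby
# Hypothetical helper: returns the IDs that fall on the given 1-based page.
def page_of(ids, page:, per_page:)
  offset = (page - 1) * per_page
  # Array#[] returns nil when the offset is past the end, so fall back to [].
  ids[offset, per_page] || []
end

ranked_ids = [12, 19, 33, 40, 51, 62]
page_of(ranked_ids, page: 1, per_page: 5) # => [12, 19, 33, 40, 51]
page_of(ranked_ids, page: 2, per_page: 5) # => [62]
```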

Further Configuration

A config block can be provided in your environment files or initializers. Example showing changing the min word size:

ActsAsIndexed.configure do |config|
  config.min_word_size = 3
  # More config as required...
end

A full rundown of the available configuration options can be found in lib/acts_as_indexed/configuration.rb.
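To give a feel for what min_word_size controls, here is a sketch of a tokenizer that drops words shorter than the configured size. The helper index_tokens is hypothetical and is not the plugin's tokenizer; the lower-casing and the split on non-alphanumeric characters are assumptions made for the example.

```ruby
# Hypothetical tokenizer, not part of acts_as_indexed.
# Lower-cases the text, extracts runs of ASCII letters and digits, and keeps
# only the words that are at least min_word_size characters long.
def index_tokens(text, min_word_size:)
  text.downcase.scan(/[a-z0-9]+/).select { |word| word.length >= min_word_size }
end

index_tokens('A cat sat on me', min_word_size: 3) # => ["cat", "sat"]
index_tokens('A cat sat on me', min_word_size: 2) # => ["cat", "sat", "on", "me"]
```

Under these assumptions a short word such as ‘me’ never reaches the index with a min_word_size of 3, which is why the INCLUDE operator and quoted phrases are described separately in Query Options.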

Heroku Support

Acts As Indexed supports Heroku out of the box. The index is created in the tmp directory, which is the only writable part of the Heroku dyno filesystem. Please read Heroku’s documentation (devcenter.heroku.com/articles/read-only-filesystem) regarding their filesystem.

RDoc Documentation

View the rdoc documentation online.

Problems, Comments, Suggestions?

All of the above are most welcome. dougal.s@gmail.com

Contributors

A huge thanks to all the contributors to this library. Without them many bugfixes and features wouldn’t have happened.

  • Douglas F Shearer - douglasfshearer.com

  • Thomas Pomfret

  • Philip Arndt

  • Fernanda Lopes

  • Alex Coles

  • Myles Eftos

  • Edward Anderson

  • Florent Guilleux

  • Ben Anderson

  • Theron Toomey

  • Uģis Ozols

  • Gabriel Namiman

  • Roman Samoilov

  • David Turner

  • Pascal Hurni

  • Ryan Kopf

Unicode (UTF8) Support

At the moment acts_as_indexed only works with Unicode characters when used in the following way:

https://gist.github.com/193903bb4e0d6e5debe1

I have rewritten the tokenization process to allow easier handling of this in the future.