Apache Solr


Apache Solr is a search engine built on the Lucene project, both from the Apache Software Foundation. Main features:

  • Full-text search.

  • Optimized for high-traffic sites.

  • Response formats: XML and JSON.

  • Administration through a web UI.

  • Scales out across a farm of nodes.

  • Fast, near-real-time indexing.

  • In a cluster configuration:

    • Central configuration for all nodes.
    • Automatic load balancing and fail-over for queries.

Some concepts before starting

Query The terms used to filter indexed content and build the result set.

Core An index installation with related files (configuration and transaction logs). A single Solr instance can handle multiple cores at the same time.

Facet An arrangement of search results into categories based on indexed terms. Facets can be built from field values, from ranges, or from queries written with the same terms used to search the content.

Getting started

When Solr is configured as a cluster of servers, it is called SolrCloud.

Starting instances

~/solr-6.3.0$ ./bin/solr start -e cloud
  • The web UI is then available at http://localhost:8983/solr.
  • See cloud architecture here.

Indexing a directory

~/solr-6.3.0$ ./bin/post -c gettingstarted docs/

Indexing formatted content

~/solr-6.3.0$ ./bin/post -c gettingstarted example/exampledocs/mem.xml
~/solr-6.3.0$ ./bin/post -c gettingstarted example/exampledocs/books.json

Each item in those files will be treated as a single document.
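
To make the "one item, one document" point concrete, here is a minimal sketch that writes a JSON file with two items. The file path and the field names (`id`, `title`) are assumptions chosen for illustration; they are not taken from the example files shipped with Solr.

```shell
# Hypothetical file location; any writable path works.
DOCS_FILE="${TMPDIR:-/tmp}/two-books.json"

# A JSON array with two items (field names are illustrative assumptions).
cat > "$DOCS_FILE" <<'EOF'
[
  {"id": "book-1", "title": "First example title"},
  {"id": "book-2", "title": "Second example title"}
]
EOF

# Count the items in the file: one line per item carries an "id" key.
ITEM_COUNT=$(grep -c '"id"' "$DOCS_FILE")
echo "items in file: $ITEM_COUNT"

# To index it, run from the Solr directory against a running instance:
#   ./bin/post -c gettingstarted "$DOCS_FILE"
```

After posting, each of the two array items should show up as its own document in the gettingstarted collection.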

Querying content through CLI

~/solr-6.3.0$ curl "http://localhost:8983/solr/gettingstarted/select?indent=on&q=%22in%20Action%22&wt=json"
~/solr-6.3.0$ curl "http://localhost:8983/solr/gettingstarted/select?indent=on&q=%22in%20Action%22&wt=xml"
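
In the URLs above, `%22in%20Action%22` is the URL-encoded form of the phrase query `"in Action"`. As a sketch, curl can do that encoding itself with `-G` and `--data-urlencode`. The helper name `solr_phrase_query` is hypothetical and not part of Solr.

```shell
# Select endpoint of the collection used in this walkthrough.
SOLR_SELECT="http://localhost:8983/solr/gettingstarted/select"

# Hypothetical helper: runs a phrase query and returns JSON.
solr_phrase_query() {
  # -G turns the --data-urlencode pairs into a GET query string,
  # so the quotes and the space in the phrase are encoded by curl.
  curl -sS -G "$SOLR_SELECT" \
    --data-urlencode "q=\"$1\"" \
    --data-urlencode "wt=json" \
    --data-urlencode "indent=on"
}

# Usage against a running instance:
#   solr_phrase_query "in Action"
```

Letting curl encode the parameters avoids hand-writing escapes when a phrase contains spaces, quotes, or other reserved characters.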

Querying facet fields through CLI

~/solr-6.3.0$ curl 'http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=*:*&rows=0'\
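
A field-facet request adds `facet=true` and at least one `facet.field` parameter to a query like the one above. The sketch below assumes a field named `manu_id_s` exists in the index. The field name is an assumption, so replace it with a field from your own schema.

```shell
SOLR_SELECT="http://localhost:8983/solr/gettingstarted/select"

# Assumed field name; use any indexed field from your collection.
FACET_FIELD="manu_id_s"

# rows=0 suppresses the documents so only the facet counts come back.
FACET_URL="${SOLR_SELECT}?wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=${FACET_FIELD}"

# Print the command; paste it into a shell with Solr running to execute it.
echo "curl '${FACET_URL}'"
```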

Querying facet range through CLI

~/solr-6.3.0$ curl 'http://localhost:8983/solr/gettingstarted/select?q=*:*&wt=json&indent=on&rows=0'\
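
A range-facet request buckets a numeric or date field using `facet.range` together with `facet.range.start`, `facet.range.end` and `facet.range.gap`. The sketch below assumes a numeric field named `price`. The field name and the 0 to 600 range in steps of 50 are illustrative assumptions.

```shell
SOLR_SELECT="http://localhost:8983/solr/gettingstarted/select"

# Assumed field and bucket boundaries; adjust them to your data.
RANGE_FIELD="price"
RANGE_START=0
RANGE_END=600
RANGE_GAP=50

RANGE_URL="${SOLR_SELECT}?q=*:*&wt=json&indent=on&rows=0&facet=true"
RANGE_URL="${RANGE_URL}&facet.range=${RANGE_FIELD}"
RANGE_URL="${RANGE_URL}&facet.range.start=${RANGE_START}"
RANGE_URL="${RANGE_URL}&facet.range.end=${RANGE_END}"
RANGE_URL="${RANGE_URL}&facet.range.gap=${RANGE_GAP}"

# Print the command; paste it into a shell with Solr running to execute it.
echo "curl '${RANGE_URL}'"
```

With these values the response would report one count per 50-wide bucket between 0 and 600.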

Solr vs Elasticsearch?

See the comparison table.