As of the 2.* versions of django-haystack, the newest features in the 5.* versions of the Solr backend get django-haystack a little bit confused. But the two can still happily work together with a few bits of extra configuration.
You just need to tell Solr to lay out the config for your Solr index in the schema-based way, without the data-driven schemaless mode that is pushed on new Solr users by default. Schemaless mode has some considerable downsides anyway: it won't let you add, for example, a solr.PhoneticFilterFactory filter to the analyzer of one of your field types as explicitly, if you ever decide that you need one. Not to mention that, in schemaless mode, all fields are suddenly multi-valued.
If you are a Solr 5.5+ user and Solr is having trouble with its schema, you could try the following after completing the setup steps.
It seems that a new "managed schema" feature has been introduced. I'm not sure exactly what the implications are (it seems to allow schemas to be modified through an API), but it needs to be disabled; otherwise the regular schema.xml will not be detected, causing the schema trouble mentioned above.
To disable it, open the Solr config /usr/local/Cellar/solr/5.5.0/server/solr/MYCORE/conf/solrconfig.xml and remove the following:
<!--
...
-->
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
Alternatively, it seems that you could just replace it with:
<schemaFactory class="ClassicIndexSchemaFactory"/>
Optionally, you can also delete the "managed schema" config file:
rm -r /usr/local/Cellar/solr/5.5.0/server/solr/MYCORE/conf/managed-schema
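To double-check which schema factory a core's config declares after editing it, a small script can help. The following is only a minimal sketch and is not part of django-haystack or Solr: the function name `schema_factory_class` and the shortened sample configs are assumptions made for illustration. It parses the text of a solrconfig.xml and returns the `class` attribute of its `<schemaFactory>` element, or `None` if no such element is declared. `ClassicIndexSchemaFactory` is the Solr factory that reads a hand-maintained schema.xml.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def schema_factory_class(solrconfig_text: str) -> Optional[str]:
    """Return the class attribute of <schemaFactory>, or None if it is absent."""
    root = ET.fromstring(solrconfig_text)
    # <schemaFactory> is a direct child of the <config> root element.
    element = root.find("schemaFactory")
    if element is None:
        return None
    return element.get("class")


# Hypothetical, heavily shortened solrconfig.xml contents for illustration.
MANAGED = """<config>
  <schemaFactory class="ManagedIndexSchemaFactory">
    <bool name="mutable">true</bool>
    <str name="managedSchemaResourceName">managed-schema</str>
  </schemaFactory>
</config>"""

CLASSIC = """<config>
  <schemaFactory class="ClassicIndexSchemaFactory"/>
</config>"""

if __name__ == "__main__":
    print(schema_factory_class(MANAGED))
    print(schema_factory_class(CLASSIC))
```

For a real core, you would read the contents of its solrconfig.xml and pass them to the function instead of the sample strings.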
You can skip the first step if you already have Solr installed as a service.
- Install Java if needed.
- Download solr-<version>.tgz from Solr Releases.
- Extract the installation script from the archive and run it:
tar xzf solr-<version>.tgz solr-<version>/bin/install_solr_service.sh --strip-components=2
sudo bash ./install_solr_service.sh solr-<version>.tgz
- Installing Solr as a service lets it start automatically on every system boot; if required, you can restart the server with sudo service solr restart.
django-haystack currently only supports Solr indexes based on a schema, which is stored in the schema.xml file of a Solr core's config, so create a core from the schema-based configuration called "basic_configs", which comes predefined with the Solr installation:
sudo su - solr -c '/opt/solr/bin/solr create -c <core_name> -d basic_configs'
- In the command above, core creation needs to be performed as the solr user because Solr would run into permission problems otherwise.
- Grab the solr.xml file from this repository and put it into your Django project where django-haystack will be able to locate the template (more on this here), such as:
<project_name>/templates/search_configuration/solr.xml
- solr.xml is what django-haystack likes to call schema.xml templates.
- Now you can use analyzers of your preference by modifying the template.
- You can see the django-haystack-specific config near the top of the template, so if you ever need to use another initial template, make sure to remove the declarations for <field name="id" ... and for <uniqueKey> from your template, as these will be declared by the django-haystack-specific config, and copy the django-haystack-specific config into the same spot in your template.
- Modify the settings for django-haystack in your Django settings to make it talk to a specific core:
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        "URL": "http://127.0.0.1:8983/solr/<core_name>"
    },
    # ... other settings ...
}
- There is no need to copy schema.xml manually and restart Solr whenever you rebuild your schema for django-haystack: the schema can be written directly into the core's config and the core can then be reloaded, which is much faster than restarting the entire server, all with a single command:
python manage.py build_solr_schema --filename=/var/solr/data/<core_name>/conf/schema.xml && curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=<core_name>&wt=json&indent=true'
- As before, change <core_name> to your Solr core's name.
- You could also add && python manage.py rebuild_index to the command if you've modified fields in your SearchIndex classes and want the changes to be reflected in the index for all model instances, regardless of when they were or will be indexed.
- You could place your other Solr config files, such as solrconfig.xml or synonyms.txt, under search_configuration and sync it all together (with e.g. rsync --exclude=solr.xml ...).
- You will likely need to change the permissions on /var/solr/data/<core_name> to more liberal ones for the above line to execute.
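
The rebuild-and-reload command above can also be scripted from Python, which may be handier inside a deploy script. This is only a minimal sketch under a few assumptions: the helper names `reload_url` and `rebuild_schema_and_reload` and the default base URL are hypothetical, the script is assumed to run from the Django project directory (so that manage.py is found), and the user running it is assumed to have write access to the core's conf directory.

```python
import subprocess
import sys
import urllib.request
from urllib.parse import urlencode


def reload_url(core_name: str, base_url: str = "http://localhost:8983/solr") -> str:
    """Build the Core Admin URL that reloads a single core."""
    query = urlencode({"action": "RELOAD", "core": core_name, "wt": "json"})
    return f"{base_url}/admin/cores?{query}"


def rebuild_schema_and_reload(core_name: str) -> None:
    """Write the schema into the core's config, then reload that core."""
    schema_path = f"/var/solr/data/{core_name}/conf/schema.xml"
    # Same management command as in the shell one-liner; check=True aborts
    # before the reload if the schema could not be written.
    subprocess.run(
        [sys.executable, "manage.py", "build_solr_schema", f"--filename={schema_path}"],
        check=True,
    )
    # Ask Solr to reload only this core and print its JSON response.
    with urllib.request.urlopen(reload_url(core_name)) as response:
        print(response.read().decode("utf-8"))


# Only act when a core name is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    rebuild_schema_and_reload(sys.argv[1])
```

Saved under a name of your choosing, it would be run with the core name as its only argument. Stopping before the reload when the schema build fails mirrors the `&&` in the shell version.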