Solr Index for The Movie Database.
This repository is part of the Think Like a Relevancy Engineer training provided by OpenSource Connections.
- Download this repo
- Install Solr search engine and configuration (using either Docker or installing manually)
- Index the TMDB movie data
- Confirm Solr has the data
- Install Postman (optional)
Download the zip from https://github.com/o19s/solr-tmdb/archive/master.zip, and you will get the file solr-tmdb-master.zip. Unzip this file, resulting in the directory solr-tmdb-master.
Once the archive is unpacked, change into the newly created directory.
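For those who prefer the command line, the download and unzip steps can be sketched roughly as follows. This assumes curl and unzip are installed; the function name fetch_repo is hypothetical and not part of this repo.

```shell
#!/bin/sh
# Sketch only: fetch and unpack the repository archive from the command line.
REPO_ZIP_URL="https://github.com/o19s/solr-tmdb/archive/master.zip"
ZIP_NAME="solr-tmdb-master.zip"
# Directory produced by unzipping (the archive name minus ".zip").
REPO_DIR="${ZIP_NAME%.zip}"

# Hypothetical helper: download, unpack, and change into the directory.
fetch_repo() {
  # -L follows the redirect to the actual archive; -f fails on HTTP errors.
  curl -fL -o "$ZIP_NAME" "$REPO_ZIP_URL" &&
    unzip -q "$ZIP_NAME" &&
    cd "$REPO_DIR"
}
```

Call fetch_repo from the directory where you want the files to live.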
There are two options for running Solr locally. If neither works for you, a public version of this dataset is deployed at http://quepid-solr.dev.o19s.com:8985/solr/ that you can use during the class, so don't fret if your environment won't let you set up Solr!
If you have Docker installed and running, run the script for your platform.
Linux/OSX:
./docker.sh
Windows:
powershell docker.ps1
- Download and unpack Solr 8.11.1.
- Navigate into the newly unzipped directory.
- Open /path/to/solr-tmdb-master/solr_home/tmdb/conf/solrconfig.xml and change the path to include the extra libraries located in /path/to/solr-tmdb-master/docker/lib.
- Run Solr pointing at the TMDB Solr Home directory included in this repo.
Linux/OSX:
bin/solr start -f -s /path/to/solr-tmdb-master/solr_home/
Windows:
bin\solr start -f -s \path\to\solr-tmdb-master\solr_home\
Regardless of the option you choose, navigate to http://localhost:8983/solr/ to confirm Solr is running.
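If you would rather check from a terminal than a browser, something like the following may work. It is a minimal sketch assuming curl is installed; the helper names solr_url and solr_is_up are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the Solr base URL for a host:port pair.
# Defaults to the local instance described above.
solr_url() {
  printf 'http://%s/solr/' "${1:-localhost:8983}"
}

# Hypothetical helper: succeed only if Solr answers with HTTP 200.
# If nothing is listening, curl reports status 000 and the check fails.
solr_is_up() {
  status=$(curl -s -L -o /dev/null -w '%{http_code}' "$(solr_url "$1")")
  [ "$status" = "200" ]
}
```

For example, `solr_is_up && echo "Solr is running"` checks the local instance, and `solr_is_up quepid-solr.dev.o19s.com:8985` checks the public one.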
The movie data corpus comes from The Movie Database (TMDB), which holds data similar to IMDB (Internet Movie Database). Index it with the script for your platform.
Linux/OSX:
./index.sh
Windows:
powershell index.ps1
If you get a permissions error, just open the index.ps1 file and copy and paste the contents into your PowerShell console.
You are indexing a 12 MB JSON file, so this will take a minute!
Navigate here and confirm you get results.
If you don't see any results, trigger a manual commit.
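Both checks can also be done with curl. The sketch below assumes the core is named tmdb (inferred from the solr_home/tmdb/ directory in this repo) and that Solr is on localhost:8983; the function names are hypothetical.

```shell
#!/bin/sh
# Assumption: the core is named "tmdb", inferred from solr_home/tmdb/.
SOLR_CORE_URL="http://localhost:8983/solr/tmdb"

# Match-all query with rows=0: the response carries numFound but no documents.
count_docs_url() {
  printf '%s/select?q=*:*&rows=0' "$SOLR_CORE_URL"
}

# Asking the update handler to commit makes indexed documents searchable.
commit_url() {
  printf '%s/update?commit=true' "$SOLR_CORE_URL"
}

# Hypothetical helpers that send the requests.
check_index() {
  curl -s "$(count_docs_url)"
}

manual_commit() {
  curl -s "$(commit_url)"
}
```

Run check_index and look at numFound in the response; if it is 0, run manual_commit and check again.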
Postman is an API development tool that helps build, run and manage API requests. The examples from the TLRE slides exist here too as a Postman Collection (solr-postman_collection.json). We like using Postman because it makes tinkering with query parameters nicer, and we think it is a useful way to follow along as you learn about tuning search relevance.
If you want to use Postman during the TLRE class:
- Download Postman for your OS
- Open Postman and Import (top-menu >> File) solr-postman-collection.json
- Define a global variable (grey eye icon in the upper-right) solr_host to point to your running Solr instance (default is localhost:8983)
- Tinker with the base URL, Params or JSON Body (optional)
- Press 'Send' (blue rectangle button right of URL bar)
- Press 'Send' (blue rectangle button right of URL bar)
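If you skip Postman, the same kind of parameter tinkering can be approximated with curl. This is a rough sketch, not the contents of the Postman Collection: the SOLR_HOST environment variable mirrors the solr_host global described above, the core name tmdb is an assumption based on the solr_home/tmdb/ directory, and the helper names are hypothetical.

```shell
#!/bin/sh
# Mirrors the Postman global variable solr_host; defaults to localhost:8983.
SOLR_HOST="${SOLR_HOST:-localhost:8983}"

# Assumption: the core is named "tmdb" (from solr_home/tmdb/ in this repo).
select_url() {
  printf 'http://%s/solr/tmdb/select' "$SOLR_HOST"
}

# Hypothetical helper: each argument is one key=value query parameter.
# curl URL-encodes the values and appends them to the URL (--get).
solr_select() {
  url=$(select_url)
  # Rewrite the argument list as: --data-urlencode p1 --data-urlencode p2 ...
  for param in "$@"; do
    set -- "$@" --data-urlencode "$param"
    shift
  done
  curl -s --get "$url" "$@"
}
```

For example, `solr_select 'q=*:*' 'rows=3'` returns the first three documents. The example sticks to a match-all query because the field names depend on the schema in this repo.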