RediSearch demo based on data from discogs.com

Rediscogs

Run the demo

git clone https://github.com/Redislabs-Solution-Architects/rediscogs.git
cd rediscogs
docker-compose up

Access the demo at http://localhost

💡 You will need a Discogs API token to have album covers displayed.
Use the following environment variable to pass it to Rediscogs:
export DISCOGS_API_TOKEN=<token>
docker-compose up

Demo Steps

RediSearch

  1. Launch redis-cli

  2. Show number of documents in RediSearch index:

    FT.INFO masters

  3. Run simple keyword search:

    FT.SEARCH masters java

    💡 title is a phonetic text field, so the results also include words that sound similar to the search term.

  4. Run prefix search:

    FT.SEARCH masters spring*

  5. Open http://localhost

  6. Enter some characters in the Artist field to retrieve suggestions from RediSearch (e.g. Dusty)

  7. Select an artist from the auto-complete options and click on the Submit button

  8. Refine the search by adding a numeric filter on release year in Query field:

    @year:[1960 1970]

  9. Refine the search further by adding a filter on release genres:

    @year:[1960 1970] @genres:{pop | rock}
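
The filters in steps 8 and 9 follow a regular pattern, so they can be composed programmatically. The sketch below is only an illustration: `build_query` and its parameters are hypothetical names, and how the demo UI actually combines the selected artist with the Query field is not shown here, so the `artist_id` clause is an assumption.

```python
from typing import Iterable, Optional, Tuple


def build_query(
    artist_id: Optional[str] = None,
    year_range: Optional[Tuple[int, int]] = None,
    genres: Iterable[str] = (),
) -> str:
    """Compose a RediSearch query string from optional filters."""
    parts = []
    if artist_id is not None:
        # Tag fields match values listed inside curly braces.
        # Assumption: the artist is filtered through the artistId tag field.
        parts.append(f"@artistId:{{{artist_id}}}")
    if year_range is not None:
        low, high = year_range
        # Numeric fields take an inclusive [min max] range.
        parts.append(f"@year:[{low} {high}]")
    genre_list = list(genres)
    if genre_list:
        # Several tag values separated by | are OR-ed together.
        # Values containing spaces or punctuation would need escaping,
        # which this sketch leaves out.
        parts.append("@genres:{" + " | ".join(genre_list) + "}")
    # Space-separated clauses are AND-ed; "*" matches every document.
    return " ".join(parts) if parts else "*"


print(build_query(year_range=(1960, 1970), genres=["pop", "rock"]))
```

Called with the year range and genres from step 9, the function produces the same query string that is typed into the Query field.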

Caching

  1. Select a different artist and hit Submit

  2. Notice how long it takes to load images from the Discogs API

  3. After all images have been loaded, click on the Submit button again

  4. Notice how much faster the images load this time around

  5. In redis-cli show cached images:

    KEYS "images::*"

  6. Show type of a cached image:

    TYPE "images::319832"

  7. Display the image bytes stored in the String data structure:

    GET "images::319832"

Session Store

  1. Enter your name in the top right section of the page

  2. Choose an artist and hit Submit

  3. Click like on some of the returned albums

  4. Hit Submit again to refresh the list of albums

  5. Notice how your likes are kept in the current session

  6. In redis-cli show session-related keys:

    KEYS "spring:session:*"

  7. Choose a session entry and show its content:

    HGETALL "spring:session:sessions:d1e08957-6cee-49b6-81af-b21720d3c372"

Redis Streams

  1. Open http://localhost/#/likes in another browser window, side-by-side with the previous one

  2. In the search page click like on any album. Notice the likes showing up in real-time in the other browser window

  3. In a terminal window listen for messages on the stream:

    $ while true; do redis-cli XREAD BLOCK 0 STREAMS likes:stream $; done
    ...
    5) 1) "1557884829631-0"
       2)  1) "_class"
           2) "com.redislabs.rediscogs.model.LikeMessage"
           3) "album.id"
           4) "171410"
           5) "album.artist"
           6) "Lalo Schifrin"
           7) "album.artistId"
           8) "23165"
           9) "album.title"
          10) "Bullitt (Original Motion Picture Soundtrack)"
          11) "album.year"
          12) "1968"
          13) "album.like"
          14) "0"
          15) "album.genres.[0]"
          16) "Jazz"
          17) "album.genres.[1]"
          18) "Stage & Screen"
          19) "album.genres.[2]"
          20) "Soundtrack"
          21) "album.genres.[3]"
          22) "Smooth Jazz"
          23) "album.genres.[4]"
          24) "Jazz-Funk"
          25) "user.name"
          26) "Julien"
          27) "userAgent"
          28) "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1 Safari/605.1.15"
          29) "time"
          30) "2019-05-15T01:47:09.629678Z"

  4. In redis-cli show the stats being maintained off the stream:

    127.0.0.1:6379> zrevrange stats:album 0 3 WITHSCORES
    1) "You Don't Love Me"
    2) "3"
    3) "No. 1 In Your Heart"
    4) "2"
    5) "Bullitt (Original Motion Picture Soundtrack)"
    6) "1"
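
To make the link between the stream and the stats more concrete, here is a minimal sketch of how like messages could be folded into a per-album ranking. How Rediscogs actually maintains `stats:album` is not shown in this README, so the one-increment-per-like rule is an assumption. A `Counter` stands in for the Redis sorted set, the function names are hypothetical, and the sample entries are trimmed, made-up messages that reuse the titles above.

```python
from collections import Counter
from typing import Dict, List, Tuple


def fields_to_dict(flat: List[str]) -> Dict[str, str]:
    """Pair up the alternating field/value list of a stream entry."""
    return dict(zip(flat[0::2], flat[1::2]))


def update_stats(stats: Counter, entry: Dict[str, str]) -> None:
    """Count one like for the album.

    Assumption: this mirrors a ZINCRBY stats:album 1 <title> call.
    """
    stats[entry["album.title"]] += 1


def top_albums(stats: Counter, count: int) -> List[Tuple[str, int]]:
    """Return the highest counts first, similar to ZREVRANGE ... WITHSCORES."""
    return stats.most_common(count)


# Trimmed sample entries in the flat field/value layout printed by redis-cli.
likes = [
    ["album.title", "You Don't Love Me", "user.name", "Julien"],
    ["album.title", "No. 1 In Your Heart", "user.name", "Julien"],
    ["album.title", "You Don't Love Me", "user.name", "Julien"],
    ["album.title", "Bullitt (Original Motion Picture Soundtrack)", "user.name", "Julien"],
    ["album.title", "No. 1 In Your Heart", "user.name", "Julien"],
    ["album.title", "You Don't Love Me", "user.name", "Julien"],
]

stats: Counter = Counter()
for flat in likes:
    update_stats(stats, fields_to_dict(flat))

for title, score in top_albums(stats, 4):
    print(score, title)
```

Unlike a sorted set, `Counter.most_common` breaks ties by insertion order, so the ordering of albums with equal scores may differ from what Redis returns.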

Architecture

Getting Data In™

Discogs.com publishes monthly dumps of its entire database for download at data.discogs.com. The dumps are XML files structured according to the discogs.com API spec.

For example, the masters XML file looks like this:

<masters>
	<master id="12345">
		...
	</master>
	<master id="15786">
		<artists>
			<artist>
				<id>8887</id>
				<name>Parliament</name>
			</artist>
		</artists>
		<genres>
			<genre>Funk / Soul</genre>
		</genres>
		<styles>
			<style>P.Funk</style>
		</styles>
		<year>1977</year>
		<title>Funkentelechy Vs. The Placebo Syndrome</title>
		<data_quality>Correct</data_quality>
	</master>
	...
</masters>

The Rediscogs app streams in that masters XML file using Spring Batch:


[Diagram: Getting Data In]
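
The app itself does this with Spring Batch in Java. As a language-neutral illustration of the same idea, the Python sketch below reads the masters file one `<master>` element at a time instead of loading the whole dump into memory. The function name, the output field names and the choice to keep only the first artist are assumptions made for the example, not the app's actual mapping.

```python
import io
import xml.etree.ElementTree as ET
from typing import IO, Dict, Iterator

SAMPLE = b"""<masters>
  <master id="15786">
    <artists>
      <artist>
        <id>8887</id>
        <name>Parliament</name>
      </artist>
    </artists>
    <genres>
      <genre>Funk / Soul</genre>
    </genres>
    <styles>
      <style>P.Funk</style>
    </styles>
    <year>1977</year>
    <title>Funkentelechy Vs. The Placebo Syndrome</title>
    <data_quality>Correct</data_quality>
  </master>
</masters>"""


def iter_masters(source: IO[bytes]) -> Iterator[Dict[str, object]]:
    """Yield one dict per <master> element as soon as it has been parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "master":
            continue
        yield {
            "id": elem.get("id"),
            # Assumption: only the first listed artist is kept.
            "artist": elem.findtext("artists/artist/name"),
            "artistId": elem.findtext("artists/artist/id"),
            "genres": [g.text for g in elem.findall("genres/genre")],
            "title": elem.findtext("title"),
            "year": elem.findtext("year"),
        }
        # Release the subtree that has just been processed.
        elem.clear()


for master in iter_masters(io.BytesIO(SAMPLE)):
    print(master)
```

For a real dump, `source` would be the opened XML file. The cleared `<master>` elements stay attached to the root as empty shells, so a production reader would also detach them from the root.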


On the RediSearch side, the masters index has the following schema created using the FT.CREATE command:

  • artist: Text field

  • artistId: Tag field

  • genres: Tag field

  • title: Phonetic Text field

  • year: Numeric field

Each master entry (i.e. album) is stored in RediSearch under that index using the FT.ADD command.
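
As a rough sketch of what those two commands could look like, the snippet below builds the argument lists for `FT.CREATE` and `FT.ADD` from the schema above. The exact options used by Rediscogs are not listed here, so several details are assumptions: the `dm:en` phonetic matcher, the master id as document id, a score of `1.0`, and comma-joined genres (the default tag separator). Note also that `FT.ADD` is deprecated in RediSearch 2.x, where documents are indexed from Redis hashes instead.

```python
from typing import Dict, List

INDEX = "masters"

# Schema mirroring the field list above.
# Assumption: "dm:en" (Double Metaphone, English) is the phonetic matcher.
CREATE_ARGS: List[str] = [
    "FT.CREATE", INDEX, "SCHEMA",
    "artist", "TEXT",
    "artistId", "TAG",
    "genres", "TAG",
    "title", "TEXT", "PHONETIC", "dm:en",
    "year", "NUMERIC",
]


def add_args(master: Dict[str, object]) -> List[str]:
    """Build the FT.ADD arguments for one master (hypothetical helper)."""
    fields = {
        "artist": master["artist"],
        "artistId": master["artistId"],
        # Tag fields split on commas by default, so genres are comma-joined.
        "genres": ",".join(master["genres"]),
        "title": master["title"],
        "year": master["year"],
    }
    # Assumption: the master id doubles as document id, with a neutral score.
    args = ["FT.ADD", INDEX, str(master["id"]), "1.0", "FIELDS"]
    for name, value in fields.items():
        args += [name, str(value)]
    return args


example = {
    "id": "15786",
    "artist": "Parliament",
    "artistId": "8887",
    "genres": ["Funk / Soul"],
    "title": "Funkentelechy Vs. The Placebo Syndrome",
    "year": "1977",
}
print(CREATE_ARGS)
print(add_args(example))
```

The lists are kept as separate arguments rather than joined into one string because values such as `Funk / Soul` contain spaces. A Redis client would send each element as its own argument.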

The data loaded previously is searchable via an Angular front-end accessing Spring Web services:


[Diagram: Search]


Queries submitted by the user translate into a REST API call that in turn calls the FT.SEARCH command.

For each master returned by the search, Rediscogs fetches the corresponding album cover image from the Discogs API and caches it in Redis using Spring Cache. Any album returned again by a later search has its image served from the cache instead of the API, which makes access much faster and cheaper (the Discogs API is throttled at 60 calls per minute).
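
The caching behaviour described above is the classic cache-aside pattern. The sketch below illustrates it with a plain dictionary standing in for Redis and a fake fetch function standing in for the Discogs API. `ImageCache` and `fake_discogs_fetch` are hypothetical names, and the `images::<id>` key layout is inferred from the keys shown in the Caching demo steps. The real app delegates all of this to Spring Cache.

```python
from typing import Callable, Dict


class ImageCache:
    """Cache-aside lookup: try the cache first, call the API only on a miss."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._store: Dict[str, bytes] = {}  # stand-in for Redis strings
        self.api_calls = 0

    def get(self, image_id: str) -> bytes:
        # Assumption: keys follow the images::<id> layout seen in redis-cli.
        key = f"images::{image_id}"
        cached = self._store.get(key)
        if cached is not None:
            return cached  # cache hit: no API call, no throttling cost
        self.api_calls += 1  # cache miss: one throttled API call
        image = self._fetch(image_id)
        self._store[key] = image
        return image


def fake_discogs_fetch(image_id: str) -> bytes:
    """Placeholder for the real HTTP request to the Discogs API."""
    return f"cover-for-{image_id}".encode()


cache = ImageCache(fake_discogs_fetch)
first = cache.get("319832")   # miss: fetched through the API stand-in
second = cache.get("319832")  # hit: served from the in-memory store
print(cache.api_calls)  # prints 1
```

With a limit of 60 API calls per minute, every cache hit saves one call from that budget, which is why the second Submit in the demo loads so much faster.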