Task description

- Index movie lens content to Elasticsearch

movie content (movies.csv, ratings.csv, tags.csv)

You could use any way to index data:

Logstash (csv input, ... )
Java/Python/C# – Elastic client -> indexing

- Write console application which search movies

match phrase
fuzzy
filter/sort by average rating
finding top-10 tags for the movie
find movies which userX is put rating of 5).

NB: Try implement it using several approaches for working with hierarchical data and explain which one is the best fit here

The implementation

Tips for speed up ingestion

Build image for ingestion

sudo docker build -t movies_ingestion -f ingestion/Dockerfile .

Map data volume from here movie content (movies.csv, ratings.csv, tags.csv) and run ingestion

NB: ETA 30-40 min

docker run --network=host -it  -v $("pwd")/data:/app/data movies_ingestion

Build docker image

sudo docker build -t movies_searcher -f docker/Dockerfile .

Run image

docker run --network=host -it movies_searcher

Usage within container

For simplicity `movie` alias is used to run the app in container

- match phrase

Example

movie match-title "Toy story"

Output

- fuzzy

Example

movie fuzzy-title "Golang"

Output

- filter/sort by average rating

Example

movie top-movies --genre Adventure

Output

- finding top-10 tags for the movie

Example

movie movie-tags 100

Output

- find movies which userX is put rating of 5).

Example

movie user-top 1001

katridi/ELK-final_test

Task description

- Index movie lens content to Elasticsearch

- Write console application which search movies

NB: Try implement it using several approaches for working with hierarchical data and explain which one is the best fit here

The implementation

Build image for ingestion

Map data volume from here movie content (movies.csv, ratings.csv, tags.csv) and run ingestion

NB: ETA 30-40 min

Build docker image

Run image

Usage within container

For simplicity movie alias is used to run the app in container

- match phrase

Example

Output

- fuzzy

Example

Output

- filter/sort by average rating

Example

Output

- finding top-10 tags for the movie

Example

Output

- find movies which userX is put rating of 5).

Example

Output

For simplicity `movie` alias is used to run the app in container