PedroChaparro/PI202202-alako-backend

This repository contains all the files related to the project's back-end and search algorithm.

PI202202-alako-backend

This repository contains all the files related to the project's data collection, data normalization and cleansing, and database management.

🎨 You can find the front-end repository here
📜 You can find the data repository here

Subjects / topics

This project covers the following college subjects: Web Development, IT Design and Management, and Artificial Intelligence.

Tech Stack

Docker, docker-compose.
Python: sentence-transformers, Bottle Web Framework.
Scala: Play Web Framework, org.apache.spark.sql, org.apache.spark.launcher.SparkLauncher.
Apache Spark.
Apache Hadoop.

Results

Docker images and docker-compose files (See this folder for more details):

Vectorize API: Returns the embeddings of the user's query. Read this for theoretical context and view the code used here.

Search algorithm (See this folder for more details):

The following is an example of the results obtained after running the cosine-similarity algorithm on an Apache Spark cluster:
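As background for the search step, the cosine-similarity ranking can be sketched in plain Python. This is a minimal illustration only, not the repository's Spark/Scala implementation: the function names (`cosine_similarity`, `rank_documents`), the document ids, and the toy 3-dimensional vectors are all assumptions made for this example. Real sentence embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Score every document against the query and return the best top_k."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings, used only to exercise the functions.
docs = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [1.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]

ranking = rank_documents(query, docs, top_k=2)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.4f}")
```

With these toy vectors, `doc-a` points in the same direction as the query and ranks first, followed by `doc-c`, which shares one of its two components with the query. On a cluster the same per-document score would be computed in parallel across partitions and then sorted.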