This repository contains all the files related to the project's back-end and search algorithm.

Primary language: Shell

PI202202-alako-backend

This repository contains all the files related to the project's data collection, data normalization and cleansing, and database management.

  • 🎨 You can find the front-end repository here
  • 📜 You can find the data repository here

Subjects / topics

This project covers the following college subjects: Web Development, IT Design and Management, and Artificial Intelligence.

Tech Stack

  • Docker, docker-compose.
  • Python: sentence-transformers, Bottle Web Framework.
  • Scala: Play Web Framework, org.apache.spark.sql, org.apache.spark.launcher.SparkLauncher.
  • Apache Spark.
  • Apache Hadoop.

Results

  • Docker images and docker-compose files (See this folder for more details):

  • Vectorize API: gets the embeddings of a user's query. Read this for theoretical context and view the code used here.

  • Search algorithm (See this folder for more details):
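
As a rough illustration of what cosine-similarity ranking computes, here is a minimal plain-Python sketch. It is not the repository's Spark implementation: the function names (`cosine_similarity`, `rank`), the document ids and the three-dimensional toy vectors are all hypothetical, and real sentence-transformers embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, documents, top_k=3):
    """Score every document against the query and keep the best top_k."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in documents.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings standing in for stored document vectors.
documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]  # stands in for the embedding of a user query

results = rank(query, documents, top_k=2)
for doc_id, score in results:
    print(doc_id, round(score, 4))
```

With these toy vectors, `doc_a` ranks first because it points in the same direction as the query, and `doc_c` follows because it shares only one of its two non-zero components with it.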

The following is an example of the results obtained after running the cosine-similarity algorithm on an Apache Spark cluster: