luisniebla/spotsummary

Python

source ./api/bin/activate
pip install -r requirements.txt
gcloud auth application-default login
python3 api/init.py

cd nextapp
npm run dev

http://localhost:3000/

curl http://localhost:5000/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "text-embedding-ada-002"
  }'
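The same request can be made from Python. This is a minimal sketch using only the standard library, assuming the local API from the curl example is running on port 5000. The function names are hypothetical, and the response shape is not documented here, so the body is returned as decoded JSON without further parsing.

```python
import json
import urllib.request

# Assumption: the local API from the curl example is listening here.
EMBEDDINGS_URL = "http://localhost:5000/embeddings"


def build_embedding_request(text, model="text-embedding-ada-002"):
    """Build the same POST request as the curl example, without sending it."""
    payload = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding_response(text):
    """Send the request and return the decoded JSON body as-is."""
    with urllib.request.urlopen(build_embedding_request(text)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_embedding_response("Your text string goes here")` only works while the API server is up.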

What are embeddings

Text embeddings measure how related text strings are to each other. They are commonly used for recommendation, clustering, and classification.
Embeddings are vectors of floating-point numbers; the distance between two vectors indicates how related the underlying texts are.
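To make "distance indicates relatedness" concrete, here is a small sketch that compares vectors with cosine similarity, one common choice of measure. The three-dimensional vectors are made-up toy values, not real model output (real embeddings have far more dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings of three pieces of text (assumed values).
coffee = [0.9, 0.1, 0.0]
espresso = [0.8, 0.2, 0.1]
hardware = [0.0, 0.1, 0.9]

# The two coffee-related vectors score higher with each other than with the third.
print(cosine_similarity(coffee, espresso) > cosine_similarity(coffee, hardware))
```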

Price:

Model: Ada v2
Usage: $0.0004 / 1K tokens
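A quick way to sanity-check costs at that rate. This is only arithmetic on the price quoted above; the function name is hypothetical and the token count is an example input.

```python
# Price quoted above: Ada v2 at $0.0004 per 1K tokens.
PRICE_PER_1K_TOKENS_USD = 0.0004


def estimate_embedding_cost(token_count):
    """Return the estimated cost in USD for embedding token_count tokens."""
    return token_count / 1000 * PRICE_PER_1K_TOKENS_USD


# One million tokens is 1,000 blocks of 1K tokens: 1000 * 0.0004 = 0.40 USD.
print(f"${estimate_embedding_cost(1_000_000):.2f}")
```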

The guide at https://platform.openai.com/docs/guides/embeddings/use-cases covers several good use cases. To start, we will try following the guide below to learn how to compare text:

https://github.com/openai/openai-cookbook/blob/main/text_comparison_examples.md

Embeddings may not be that useful here, because they mostly support review-to-review comparison. If you're looking at one review or location and want to find similar ones, they make sense. Otherwise, it's unclear what we would use them for.

Implementation Ideas

Don't store any maps data; pass it on to the model to create the recommendation
Store the embeddings in a vector engine

TODO

Set up embeddings for search:

Split text into chunks smaller than token limit
Embed each chunk
Store embeddings in vector engine
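The three steps above can be sketched as follows. This is a rough outline under stated assumptions: whitespace-separated words stand in for tokens (a real tokenizer would be needed to respect the model's actual limit), the embedding call is passed in as `embed_fn`, and a plain list stands in for the vector engine. All names here are hypothetical.

```python
def split_into_chunks(text, max_tokens):
    """Split text into chunks of at most max_tokens words.

    Assumption: words approximate tokens; swap in a real tokenizer for accuracy.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def index_text(text, embed_fn, store, max_tokens=200):
    """Chunk the text, embed each chunk, and append the results to the store."""
    for chunk in split_into_chunks(text, max_tokens):
        store.append({"text": chunk, "embedding": embed_fn(chunk)})
    return store


def fake_embed(chunk):
    """Placeholder for the real embedding call; NOT a meaningful embedding."""
    return [float(len(chunk)), float(chunk.count(" ") + 1)]


store = index_text("great coffee and friendly staff", fake_embed, [], max_tokens=2)
print([entry["text"] for entry in store])
```

Passing the embedding function in keeps the chunking and storage logic testable without calling the API.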

OpenAI recommends fine-tuning over embeddings for text classification. When we get to tags, we may want to look into that.

Implementation Ideas

Do not store any Google Maps data; feed it into the embedding model as part of the input string.
Store the embeddings in a vector engine
Pass the user's search input to the embedding model, then compute its distance to the existing embeddings within the search area
Some sort of clustering for places.
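The search idea above could look roughly like this. It is a sketch only: it assumes the candidates have already been limited to the search area by some earlier step, that the query has already been embedded, and that cosine similarity is the measure used. The place names and two-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_places(query_embedding, candidates, top_k=3):
    """Return up to top_k candidates, most similar to the query first.

    Assumption: candidates are dicts with "name" and "embedding" keys and are
    already restricted to the search area.
    """
    ranked = sorted(
        candidates,
        key=lambda place: cosine_similarity(query_embedding, place["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical places and toy embeddings.
candidates = [
    {"name": "Bar B", "embedding": [0.0, 1.0]},
    {"name": "Cafe A", "embedding": [1.0, 0.0]},
    {"name": "Diner C", "embedding": [0.6, 0.8]},
]
query = [0.9, 0.1]  # stands in for the embedded user search input
print([place["name"] for place in rank_places(query, candidates, top_k=2)])
```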

Dev stuff

docker-compose
DB setup
Vercel deployment
Hook up the front end and back end