These are some examples of embeddings using OpenAI models.
Examples are here because I find each to be of some interest, but this is not intended as a tutorial for how to use embeddings.
For instructive examples, see the official OpenAI repository openai-cookbook.
This repository, bed, is similar to bedj, but it is written in Python (with Jupyter notebooks) and is more extensive.
Summary forthcoming. For now, look at the descriptions at the top of each notebook.
The examples are written to assume your API key is in a file called .api_key.
Do not commit it to Git! The .gitignore file excludes it, to help avoid that.
One interesting technique shown here is storing the embeddings as rows of a matrix, then finding similarities with matrix multiplication.
The second operand can be a single embedding, in which case we are multiplying a matrix by a column vector. That is the same as taking the dot product of each of the matrix's rows with the vector; those dot products become the coordinates of the resulting vector.
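A minimal sketch of the matrix-vector form, using random unit vectors in place of real OpenAI embeddings (which are likewise length 1):

```python
import numpy as np

rng = np.random.default_rng(0)

def unit_rows(n, dim):
    """Return an (n, dim) matrix whose rows are unit vectors."""
    m = rng.normal(size=(n, dim))
    return m / np.linalg.norm(m, axis=1, keepdims=True)

embeddings = unit_rows(5, 8)   # one embedding per row
query = unit_rows(1, 8)[0]     # a single query embedding

# Matrix-vector product: each entry is the dot product of one row
# with the query, i.e., that row's similarity to the query.
sims = embeddings @ query
best = int(np.argmax(sims))    # index of the most similar stored embedding
```

With real embeddings, `best` would index the stored text most similar to the query.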
If the second operand is a matrix whose columns are embeddings, then each (i, j) entry of the resulting matrix is the dot product of the ith row of the first matrix with the jth column of the second matrix, i.e., the similarity of those two embeddings.
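A sketch of the matrix-matrix form. Here the second operand is the first matrix's transpose, so its columns are embeddings; random unit rows again stand in for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(4, 8))
a /= np.linalg.norm(a, axis=1, keepdims=True)  # rows are unit "embeddings"

# Entry (i, j) is the similarity of embedding i to embedding j.
pairwise = a @ a.T

# The diagonal holds each embedding's similarity with itself,
# which is 1.0 for unit vectors.
```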
With OpenAI embeddings, these dot products are cosine similarities, because all OpenAI embedding models produce vectors that are already normalized (length 1). The same holds for embeddings from some, but not all, other models.
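To see why: cosine similarity is (u · v) / (|u| |v|), so for unit-length vectors the denominator is 1 and the dot product alone suffices. A small check one might run on any model's output (the `is_normalized` helper is hypothetical, not part of any API):

```python
import numpy as np

def is_normalized(v, tol=1e-6):
    """Check whether a vector has length 1 (within a tolerance)."""
    return abs(np.linalg.norm(v) - 1.0) < tol

u = np.array([3.0, 4.0]) / 5.0   # a unit vector
v = np.array([1.0, 0.0])         # another unit vector

# Full cosine-similarity formula vs. the bare dot product:
cosine = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
```

For vectors that pass `is_normalized`, `u @ v` and `cosine` agree.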