Run embedding models locally in Swift using MLTensor.
Inspired by mlx-embeddings.
Some of the supported BERT models on Hugging Face:
- sentence-transformers/all-MiniLM-L6-v2
- sentence-transformers/msmarco-bert-base-dot-v5
- thenlper/gte-base
Some of the supported XLM-RoBERTa models on Hugging Face:
- sentence-transformers/paraphrase-multilingual-mpnet-base-v2
- tomaarsen/xlm-roberta-base-multilingual-en-ar-fr-de-es-tr-it
NOTE: for CLIP models, only text encoding is supported for now.
Add the following to your Package.swift file. In the package dependencies add:

    dependencies: [
        .package(url: "https://github.com/jkrukowski/swift-embeddings", from: "0.0.5")
    ]

In the target dependencies add:

    dependencies: [
        .product(name: "Embeddings", package: "swift-embeddings")
    ]
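For orientation, here is a sketch of how the two fragments above might sit in a complete manifest. The package and target names (`MyApp`) are hypothetical, and the platform minimums are an assumption based on MLTensor being available from macOS 15 and iOS 18; adjust both to your project.

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyApp",  // hypothetical package name
    // Assumed minimums: MLTensor is available from macOS 15 / iOS 18.
    platforms: [.macOS(.v15), .iOS(.v18)],
    dependencies: [
        .package(url: "https://github.com/jkrukowski/swift-embeddings", from: "0.0.5")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",  // hypothetical target name
            dependencies: [
                .product(name: "Embeddings", package: "swift-embeddings")
            ]
        )
    ]
)
```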
To encode a single text:

    import Embeddings

    // load model and tokenizer from Hugging Face
    let modelBundle = try await Bert.loadModelBundle(
        from: "sentence-transformers/all-MiniLM-L6-v2"
    )

    // encode text
    let encoded = modelBundle.encode("The cat is black")
    // convert the resulting tensor into a flat array of Float values
    let result = await encoded.cast(to: Float.self).shapedArray(of: Float.self).scalars

    // print result
    print(result)
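
To encode several texts at once and compute the cosine distance between their embeddings:

    import Embeddings
    import MLTensorUtils

    let texts = [
        "The cat is black",
        "The dog is black",
        "The cat sleeps well"
    ]

    // load model and tokenizer from Hugging Face
    let modelBundle = try await Bert.loadModelBundle(
        from: "sentence-transformers/all-MiniLM-L6-v2"
    )

    // encode all texts in one batch
    let encoded = modelBundle.batchEncode(texts)
    // cosine distance between the encoded texts
    let distance = cosineDistance(encoded, encoded)
    // convert the resulting tensor into a flat array of Float values
    let result = await distance.cast(to: Float.self).shapedArray(of: Float.self).scalars

    // print result
    print(result)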
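
To make the printed numbers easier to interpret, here is a plain-Swift sketch of cosine distance on `[Float]` vectors, using the conventional definition of 1 minus cosine similarity. The helper names `plainCosineDistance` and `pairwiseCosineDistances` are hypothetical and this is not the MLTensorUtils implementation; it only illustrates the quantity, under the assumption that the library follows the same definition.

```swift
/// Cosine distance between two equal-length vectors: 1 - (a·b) / (|a| * |b|).
/// Illustrative helper, not part of swift-embeddings.
func plainCosineDistance(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "vectors must have the same length")
    var dot: Float = 0
    var squaredNormA: Float = 0
    var squaredNormB: Float = 0
    for (x, y) in zip(a, b) {
        dot += x * y
        squaredNormA += x * x
        squaredNormB += y * y
    }
    let denominator = squaredNormA.squareRoot() * squaredNormB.squareRoot()
    // Treat a zero vector as maximally dissimilar instead of dividing by zero.
    guard denominator > 0 else { return 1 }
    return 1 - dot / denominator
}

/// Distance from every vector to every other vector, as a square matrix.
func pairwiseCosineDistances(_ vectors: [[Float]]) -> [[Float]] {
    vectors.map { a in vectors.map { b in plainCosineDistance(a, b) } }
}
```

Under this definition, identical directions give a distance near 0, orthogonal vectors give 1, and opposite directions give 2, so smaller values mean more similar texts.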

To run the command line demo, use the following command:

    swift run embeddings-cli <subcommand> [--model-id <model-id>] [--text <text>] [--max-length <max-length>]

Subcommands:

    bert                        Encode text using BERT model
    clip                        Encode text using CLIP model
    xlm-roberta                 Encode text using XLMRoberta model

Command line options:

    --model-id <model-id>       Id of the model to use
    --text <text>               Text to encode
    --max-length <max-length>   Maximum length of the input
    -h, --help                  Show help information.

This project uses swift-format. To format the code run:

    swift format . -i -r --configuration .swift-format
This project is based on and uses some of the code from: