huggingface/text-embeddings-inference
A blazing-fast inference solution for text embedding models
Rust · Apache-2.0
Issues
Support jinaai/jina-embeddings-v3
#418 opened by luonist - 0
Using CPU Image without ONNX
#388 opened by ramipellumbi - 1
Health check endpoint is also locked behind the API key once API_KEY is set, causing health checks to fail
#427 opened by Jason-CKY - 4
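For #427 above, a minimal probe sketch, assuming the server was started with an API key and that TEI's `/health` route and `Authorization: Bearer` scheme apply (the URL and env var names here are illustrative):

```python
import os
import requests

# Health probe for a TEI instance started with an API key.
# Per issue #427, /health also returns 401 once an API key is set, so a
# plain liveness probe fails; sending the bearer token works around it.
TEI_URL = os.environ.get("TEI_URL", "http://localhost:8080")
API_KEY = os.environ["API_KEY"]

resp = requests.get(
    f"{TEI_URL}/health",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=5,
)
resp.raise_for_status()
print("healthy")
```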
Download of BAAI/bge-m3 fails on 1.5 using ONNX
#417 opened by avvertix - 2
Inconsistent Embeddings with SentenceTransformer
#420 opened by lytning98 - 0
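An illustrative reproduction sketch for reports like #420: embed the same text locally with SentenceTransformer and via a running TEI server, then compare. The model name and TEI URL are placeholders, not taken from the issue:

```python
import numpy as np
import requests
from sentence_transformers import SentenceTransformer

MODEL = "BAAI/bge-base-en-v1.5"  # placeholder model
TEXT = "What is Deep Learning?"

# Local reference embedding, normalized.
local = SentenceTransformer(MODEL).encode(TEXT, normalize_embeddings=True)

# Embedding from the TEI server; /embed returns one vector per input.
served = np.array(
    requests.post(
        "http://localhost:8080/embed", json={"inputs": TEXT}, timeout=30
    ).json()[0]
)
served /= np.linalg.norm(served)

# Cosine similarity should be ~1.0 if the two pipelines agree.
print("cosine similarity:", float(local @ served))
```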
Better HTTP Status Code for Empty Requests
#365 opened by mbbyn - 4
Support Alibaba-NLP/gte-multilingual-base & Alibaba-NLP/gte-multilingual-reranker-base
#366 opened by sigridjineth - 5
C API / C wrapper API
#370 opened by 0110G - 1
How to support a SequenceClassification model
#371 opened by homily707 - 0
/embed_sparse Parameters
#372 opened by dbc-2024 - 3
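A sketch of a call to the `/embed_sparse` route asked about in #372, which serves SPLADE-style sparse embedding models; the response shape noted in the comment is an assumption based on common TEI usage, not taken from the issue:

```python
import requests

resp = requests.post(
    "http://localhost:8080/embed_sparse",
    json={"inputs": "What is Deep Learning?"},
    timeout=30,
)
resp.raise_for_status()

# Expected: one sparse vector per input, as {"index": ..., "value": ...} pairs.
for entry in resp.json()[0]:
    print(entry["index"], entry["value"])
```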
Local install failure
#373 opened by plageon - 1
How do I deploy to Vertex AI?
#380 opened by pulkitmehtaworkmetacube - 1
Question for released docker image
#383 opened by PeterYang12 - 2
TEI fails for Finetuned JinaAI Embeddings models
#384 opened by StefanRaab - 1
Incomplete tutorial
#395 opened by sleepingcat4 - 1
Support NV-Embed-v2 model
#419 opened by jorgeantonio21 - 2
thread 'tokio-runtime-worker' panicked at /usr/src/backends/src/lib.rs:176:14
#424 opened by jackli0127 - 2
Get opentelemetry trace id from request headers instead of creating a new trace
#374 opened by ptanov - 0
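A sketch of the client side of the behaviour requested in #374: sending a W3C `traceparent` header that the router would join instead of starting a new trace. The trace and span IDs below are the illustrative values from the W3C Trace Context spec:

```python
import requests

trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
parent_span_id = "00f067aa0ba902b7"
# W3C traceparent: version-traceid-spanid-flags
headers = {"traceparent": f"00-{trace_id}-{parent_span_id}-01"}

resp = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": "hello"},
    headers=headers,
    timeout=30,
)
print(resp.status_code)
```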
Use Nomic Embed from local weights
#422 opened by sinaayyy - 0
New model: lier007/xiaobu-embedding-v2
#423 opened by Mars-1990 - 0
Feature Request: Multi-GPU inference or the ability to choose a GPU at startup
#411 opened by dangerzone - 1
Support env `HF_ENDPOINT`?
#416 opened by yufeng97 - 0
Different behavior between SentenceTransformer and TEI when using gte-large-en-v1.5
#358 opened by Smityz - 1
Too many router/tokenizer threads
#404 opened by askervin - 0
Paged attention optimization for memory-efficient continuously batched requests
#391 opened by jorgeantonio21 - 2
Unsupported model IR version
#355 opened by netw0rkf10w - 0
Serving bge-reranker-v2-m3
#408 opened by gree2 - 1
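A sketch of querying a TEI instance serving a reranker such as BAAI/bge-reranker-v2-m3 (#408 above). The request and response fields follow TEI's `/rerank` route; treat them as assumptions if your version differs:

```python
import requests

resp = requests.post(
    "http://localhost:8080/rerank",
    json={
        "query": "What is Deep Learning?",
        "texts": ["Deep learning is a subset of ML.", "Cheese is made from milk."],
    },
    timeout=30,
)
resp.raise_for_status()

# Each result pairs the index of an input text with its relevance score.
for item in resp.json():
    print(item["index"], item["score"])
```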
Max Token Mismatch
#406 opened by endomorphosis - 1
unknown flag: --gpus. See 'docker run --help'
#397 opened by sswolf - 0
Inconsistency in how different URL paths are handled (in inference endpoints)
#398 opened by MoritzLaurer - 0
dunzhang/stella_en_1.5B_v5 Maximum Token Limit Set to 512 Despite Model Capabilities
#396 opened by taoari - 0
Input validation error: `inputs` must have less than 32000 characters. Given: 67337
#394 opened by ffalkenberg - 0
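A client-side workaround sketch for #394 above: TEI rejects single inputs over the 32000-character validation limit, so split long documents before sending. The chunk size is an arbitrary illustration value, chosen to stay under the limit:

```python
import requests

MAX_CHARS = 30_000  # stay below the 32000-character validation limit

def chunk(text: str, size: int = MAX_CHARS) -> list[str]:
    """Split a long document into fixed-size character chunks."""
    return [text[i : i + size] for i in range(0, len(text), size)]

long_doc = "word " * 15_000  # stand-in for a ~67k-character document

# /embed also accepts a list of inputs, so all chunks go in one request.
resp = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": chunk(long_doc)},
    timeout=60,
)
resp.raise_for_status()
print(len(resp.json()), "chunks embedded")
```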
Can TEI support bge-visualized?
#393 opened by whybeyoung - 2
Adding support for stella_en_400M_v5
#359 opened by netw0rkf10w - 6
Add support for bge-reranker-v2.5-gemma-lightweight
#368 opened by ziozzang - 0
`bge-reranker-v2-m3` model throughput benchmark
#392 opened by rere950303 - 3
Request support for Llama Prompt Guard
#354 opened by bluenevus - 1
curl: (56) Recv failure: Connection reset by peer
#387 opened by luyu0816 - 0
TEI failed to serve fine-tuned bge-m3 model
#385 opened by KCFindstr - 0
Support Alibaba-NLP/gte-large-en-v1.5 on CPU/MPS
#375 opened by tmostak - 0