한국어 임베딩 모델의 검색 (Information Retrieval) 성능 비교를 위한 리더보드입니다. 현재 Ko-StrategyQA 데이터셋을 사용해 IR 성능을 측정했습니다.
Rank | Model | NDCG@10 | MRR@10 | MAP@10 | Recall@10 | Average | Link |
---|---|---|---|---|---|---|---|
1 | Upstage/solar-embedding-1-large | 83.61 | 84.14 | 80.35 | 88.35 | 84.11 | 📎 |
2 | Cohere/embed-multilingual-v3.0 | 80.79 | 82.33 | 76.80 | 85.96 | 81.47 | 📎 |
3 | intfloat/multilingual-e5-large-instruct | 80.68 | 82.52 | 76.61 | 85.74 | 81.39 | 📎 |
4 | intfloat/multilingual-e5-large | 80.35 | 82.01 | 76.37 | 85.40 | 81.03 | 📎 |
5 | Voyage/voyage-multilingual-2 | 79.69 | 80.53 | 75.24 | 86.43 | 80.47 | 📎 |
6 | BAAI/bge-m3 | 79.40 | 81.12 | 75.07 | 85.20 | 80.20 | 📎 |
7 | intfloat/multilingual-e5-base | 76.35 | 79.04 | 71.60 | 82.06 | 77.26 | 📎 |
8 | intfloat/multilingual-e5-small | 75.16 | 77.55 | 69.78 | 82.17 | 76.16 | 📎 |
9 | Cohere/embed-multilingual-light-v3.0 | 74.22 | 76.31 | 68.91 | 81.26 | 75.17 | 📎 |
10 | OpenAI/text-embedding-3-large | 73.74 | 75.70 | 68.47 | 80.99 | 74.73 | 📎 |
11 | upskyy/kf-deberta-multitask | 68.42 | 70.14 | 61.64 | 78.71 | 69.73 | 📎 |
12 | cateto/longformer-ko-sroberta-multitask-23040 | 66.31 | 68.51 | 59.34 | 76.62 | 67.69 | 📎 |
13 | jhgan/ko-sroberta-multitask | 65.10 | 67.46 | 58.03 | 75.41 | 66.50 | 📎 |
14 | jhgan/ko-sroberta-nli | 63.90 | 66.56 | 56.84 | 74.09 | 65.35 | 📎 |
15 | jhgan/ko-sbert-multitask | 62.86 | 65.60 | 55.61 | 73.34 | 64.35 | 📎 |
16 | BM-K/KoSimCSE-bert-multitask | 56.93 | 59.17 | 49.21 | 69.03 | 58.59 | 📎 |
17 | BM-K/KoSimCSE-roberta-multitask | 56.32 | 58.31 | 48.47 | 68.66 | 57.94 | 📎 |
18 | OpenAI/text-embedding-3-small | 56.33 | 59.75 | 49.64 | 65.53 | 57.81 | 📎 |
19 | sentence-transformers/paraphrase-multilingual-mpnet-base-v2 | 53.03 | 54.49 | 46.52 | 63.25 | 54.32 | 📎 |
20 | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 45.90 | 46.90 | 39.26 | 57.04 | 47.28 | 📎 |
21 | kakaobank/kf-deberta-base | 41.25 | 43.82 | 33.45 | 53.42 | 42.99 | 📎 |
- Website 제작
- PR 가이드라인 제작
Cohere, Upstage, Voyage 결과 추가- 벤치마크용 데이터셋 추가