whybe-choi's Stars
huggingface/smol-course
A course on aligning smol models.
academicpages/academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
wikibook/llm-finetuning
Example code for the book 《한 권으로 끝내는 실전 LLM 파인튜닝》 (Practical LLM Fine-Tuning in One Volume)
DSBA-Lab/Contrastive-Accumulation
philschmid/deep-learning-pytorch-huggingface
beir-cellar/beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
andrewyng/aisuite
Simple, unified interface to multiple Generative AI providers
baeseongsu/KoSAIM2024-Clinical-LLM
[KoSAIM 2024 Summer School] Fine-tuning a clinical domain Large Language Model
ritaranx/BMRetriever
[EMNLP 2024] This is the code for our paper "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers".
cfahlgren1/observers
A Lightweight Library for AI Observability
naver/splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
haon-chen/SPEED
jessevig/bertviz
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
PrithivirajDamodaran/FlashRank
Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & Collaborations.
AIR-Bench/AIR-Bench
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
jxmorris12/cde
code for training & evaluating Contextual Document Embedding models
wandb/llm-kr-eval
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
LeeSureman/E5-Retrieval-Reproduction
Use contrastive learning to train a large language model (LLM) as a retriever
hkust-nlp/SynCSE
This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch"
staoxiao/RetroMAE
Codebase for RetroMAE and beyond.
facebookresearch/mexma
MEXMA: Token-level objectives improve sentence representations
vec2text/vec2text
utilities for decoding deep representations (like sentence embeddings) back to text
mrdbourke/simple-local-rag
Build a RAG (Retrieval Augmented Generation) pipeline from scratch and have it all run locally.
mlfoundations/task_vectors
Editing Models with Task Arithmetic
songys/huggingface_KoreanDataset
Korean datasets available on Hugging Face
MinishLab/model2vec
The Fastest State-of-the-Art Static Embeddings in the World
worldbank/GISTEmbed
GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings
HKUDS/LightRAG
"LightRAG: Simple and Fast Retrieval-Augmented Generation"