taeminlee's Stars
MinishLab/model2vec
Distill a Small Static Model from any Sentence Transformer
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
matheusbach/legen
Uses AI to locally transcribe speech from media files, generate subtitle files, translate the generated subtitles, insert them into the mp4 container, and burn them directly into the video
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
JuergenFleiss/aTrain
A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.
g8a9/ferret
A python package for benchmarking interpretability techniques on Transformers.
superheavytail/pklue
Converts standard Korean datasets into a format suitable for instruction tuning.
Marker-Inc-Korea/AutoRAG
AutoML tool for RAG
pymupdf/RAG
RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
unslothai/unsloth
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
danny-avila/LibreChat
Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. Actively in public development.
rladmstn1714/CLIcK
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
superheavytail/lm-evaluation-by-openai
A framework for benchmarking a model's instruction-following ability
argilla-io/distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
anuraghazra/github-readme-stats
⚡ Dynamically generated stats for your GitHub READMEs
Spico197/Humback
🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.
meta-llama/llama3
The official Meta Llama 3 GitHub site
PygmalionAI/aphrodite-engine
Large-scale LLM inference engine
UpstageAI/evalverse
The Universe of Evaluation. All about the evaluation for LLMs.
Vaibhavs10/insanely-fast-whisper
kamalkraj/e5-mistral-7b-instruct
Finetune mistral-7b-instruct for sentence embeddings
michaelfeil/infinity
Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, CLIP, CLAP, and ColPali
kyegomez/BitNet
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
HeegyuKim/ko-rm-judge
Evaluating language model responses using a reward model
Marker-Inc-Korea/RAGchain
Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on...
for-ai/parameter-efficient-moe
google/seqio
Task-based datasets, preprocessing, and evaluation for sequence models.
Adaxry/Post-Instruction
Anil-matcha/ChatPDF
Chat with any PDF. Easily upload the PDF documents you'd like to chat with. Instant answers. Ask questions, extract information, and summarize documents with AI. Sources included.