ChTauchmann's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps showcasing Meta Llama 3 for WhatsApp & Messenger.
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
weaviate/Verba
Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
arcee-ai/mergekit
Tools for merging pretrained large language models.
dvlab-research/LongLoRA
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
langchain-ai/langserve
LangServe 🦜️🏓
AkariAsai/self-rag
The original implementation of "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection" by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
Tongji-KGLLM/RAG-Survey
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
CStanKonrad/long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
gkamradt/LLMTest_NeedleInAHaystack
Simple retrieval from LLMs at various context lengths to measure accuracy
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.
run-llama/llama-lab
neelnanda-io/TransformerLens
A library for mechanistic interpretability of GPT-style language models
ContextualAI/gritlm
Generative Representational Instruction Tuning
zhijing-jin/Causality4NLP_Papers
A reading list for papers on causality for natural language processing (NLP)
texttron/tevatron
Tevatron - A flexible toolkit for neural retrieval research and development.
epfml/landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
mlfoundations/task_vectors
Editing Models with Task Arithmetic
google-research/distilling-step-by-step
facebookresearch/dpr-scale
Scalable training for dense retrieval models.
lucidrains/soft-moe-pytorch
Implementation of Soft MoE, proposed by Google Brain's Vision team, in PyTorch
ACL2023-Retrieval-LM/ACL2023-Retrieval-LM.github.io
https://acl2023-retrieval-lm.github.io/
OpenMatch/OpenMatch
An Open-Source Package for Information Retrieval
j-min/DallEval
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)
xbmxb/RAG-query-rewriting
ConsequentAI/fneval
Functional Benchmarks and the Reasoning Gap
roeehendel/icl_task_vectors
google/belief-localization
This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Can Be Injected in Language Models."