yty3805595's Stars
meta-llama/llama3
The official Meta Llama 3 GitHub site
unslothai/unsloth
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
QwenLM/Qwen
The official repo of Qwen (通义千问), the chat & pretrained large language models proposed by Alibaba Cloud.
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
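tiktoken implements byte-pair encoding (BPE). As a toy illustration of the core idea — repeatedly merging the most frequent adjacent token pair — here is a minimal sketch in plain Python; it is not tiktoken's API.

```python
from collections import Counter

def bpe_merge_step(tokens):
    """One toy BPE step: find the most frequent adjacent pair and merge
    every occurrence. Illustration only, not tiktoken's implementation."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens, None
    best = max(pairs, key=pairs.get)
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == best:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged, best

tokens, pair = bpe_merge_step(list("aaabdaaabac"))
# merges the most frequent pair ('a', 'a') wherever it occurs
```

Real tokenisers precompute thousands of such merges once on a training corpus, then apply them in rank order at encode time.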
netease-youdao/QAnything
Question and Answer based on Anything.
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
QwenLM/Qwen2
Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.
ymcui/Chinese-LLaMA-Alpaca-2
Chinese LLaMA-2 & Alpaca-2 large models (phase-2 project), including 64K extra-long-context models (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
modelscope/swift
ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Tongji-KGLLM/RAG-Survey
netease-youdao/BCEmbedding
Netease Youdao's open-source embedding and reranker models for RAG products.
jquesnelle/yarn
YaRN: Efficient Context Window Extension of Large Language Models
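Context-window extension methods like YaRN operate on RoPE's rotary frequencies. As a hedged sketch of the simplest baseline — plain position interpolation, which YaRN refines with per-frequency scaling and an attention temperature — under standard RoPE conventions (this is not the yarn repo's code):

```python
def rope_frequencies(dim, base=10000.0, scale=1.0):
    """Standard RoPE inverse frequencies for a head dimension `dim`.
    Dividing every frequency by `scale` is equivalent to dividing positions
    by `scale`: plain position interpolation, the baseline YaRN improves on.
    Sketch only, not the jquesnelle/yarn implementation."""
    return [1.0 / (base ** (2 * i / dim)) / scale for i in range(dim // 2)]

freqs = rope_frequencies(8, scale=4.0)  # 4x context extension via interpolation
```

YaRN's contribution is that high frequencies (local information) should be interpolated less aggressively than low frequencies, rather than scaling all of them uniformly as above.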
peremartra/Large-Language-Model-Notebooks-Course
Practical course about Large Language Models.
thunlp/WebCPM
Official codes for ACL 2023 paper "WebCPM: Interactive Web Search for Chinese Long-form Question Answering"
pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
ymcui/Chinese-Mixtral
Chinese Mixtral mixture-of-experts large models (Chinese Mixtral MoE LLMs)
AviSoori1x/makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
BeachWang/DAIL-SQL
An efficient and effective few-shot NL2SQL method on GPT-4.
lucidrains/local-attention
An implementation of local windowed attention for language modeling
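Local windowed attention restricts each query to a fixed-size neighbourhood of keys instead of the full sequence. A minimal sketch of a causal sliding-window mask (the idea, not lucidrains' API):

```python
def local_attention_mask(seq_len, window):
    """Boolean mask where query i may attend to keys j in
    [max(0, i - window + 1), i]: a causal sliding window.
    Illustration of local attention, not the local-attention repo's code."""
    return [[max(0, i - window + 1) <= j <= i for j in range(seq_len)]
            for i in range(seq_len)]

mask = local_attention_mask(5, 2)  # each position sees itself and one predecessor
```

This drops attention cost from O(n^2) to O(n * window) while keeping causality.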
OpenLMLab/LEval
[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark
X-PLUG/ChatPLUG
A Chinese Open-Domain Dialogue System
lucidrains/st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of mixture-of-experts after years of research at Google Brain, in PyTorch
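The core of a sparse MoE layer is top-k gating: each token is routed to its k highest-scoring experts with softmax-renormalised weights. A toy sketch of top-2 routing (illustrative only, not the st-moe-pytorch implementation):

```python
import math

def top2_gate(logits):
    """Route one token to its top-2 experts, with gate weights obtained by
    softmax-renormalising the two winning logits. Toy sketch of top-k MoE
    gating, not st-moe-pytorch's code (which adds z-loss and balance loss)."""
    top = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:2]
    exp = [math.exp(logits[e]) for e in top]
    total = sum(exp)
    return [(e, w / total) for e, w in zip(top, exp)]

routes = top2_gate([0.1, 2.0, -1.0, 1.0])  # experts 1 and 3 receive the token
```

Production implementations add auxiliary losses (ST-MoE's router z-loss, load-balancing loss) so that tokens spread evenly across experts.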
shmsw25/FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
AutoLLM/AutoAgents
Complex question answering in LLMs with enhanced reasoning and information-seeking capabilities.
asahi417/lmppl
Calculate perplexity on a text with pre-trained language models. Supports MLMs (e.g. DeBERTa), causal LMs (e.g. GPT-3), and encoder-decoder LMs (e.g. Flan-T5).
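Perplexity is the exponential of the average negative log-probability the model assigns to each token. A minimal sketch of the definition such a tool computes (not lmppl's API):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-probability) over the scored tokens.
    Sketch of the standard definition, not lmppl's implementation."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

ppl = perplexity([math.log(0.25)] * 4)
# a uniform per-token probability of 0.25 gives perplexity 4.0
```

Lower perplexity means the model found the text more predictable; a perplexity of k roughly means the model was as uncertain as choosing uniformly among k tokens at each step.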
fkodom/grouped-query-attention-pytorch
(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)
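Grouped-query attention reduces the KV cache by letting several query heads share one key/value head. A sketch of the grouping rule from the GQA paper (not fkodom's code):

```python
def kv_head_for_query(q_head, num_q_heads, num_kv_heads):
    """In grouped-query attention, consecutive query heads share a KV head:
    with 8 query heads and 2 KV heads, query heads 0-3 map to KV head 0 and
    4-7 to KV head 1. Sketch of the grouping rule, not the repo's code."""
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size

mapping = [kv_head_for_query(h, 8, 2) for h in range(8)]
# → [0, 0, 0, 0, 1, 1, 1, 1]
```

Setting num_kv_heads = num_q_heads recovers standard multi-head attention; num_kv_heads = 1 recovers multi-query attention, with GQA interpolating between the two.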
luchangli03/export_llama_to_onnx
Export Llama models to ONNX.