wu-yy's Stars
gin-gonic/gin
Gin is an HTTP web framework written in Go (Golang). It features a Martini-like API with much better performance -- up to 40 times faster. If you need smashing performance, get yourself some Gin.
ggerganov/llama.cpp
LLM inference in C/C++
chatchat-space/Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built with Langchain on LLMs such as ChatGLM, Qwen, and Llama
FlowiseAI/Flowise
Drag & drop UI to build your customized LLM flow
milvus-io/milvus
A cloud-native vector database, storage for next generation AI applications
iperov/DeepFaceLive
Real-time face swap for PC streaming or video calls
uber-go/zap
Blazing fast, structured, leveled logging in Go.
stanfordnlp/dspy
DSPy: The framework for programming—not prompting—language models
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.
LlamaFamily/Llama-Chinese
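Llama Chinese community. The Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.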
eosphoros-ai/DB-GPT
AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
huggingface/tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
togethercomputer/OpenChatKit
openlm-research/open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
ztxz16/fastllm
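A pure C++ cross-platform LLM acceleration library, callable from Python. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile devices.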
qwopqwop200/GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
AetherCortex/Llama-X
Open Academic Research on Improving LLaMA to SOTA LLM
CStanKonrad/long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
tloen/llama-int8
Quantized inference code for LLaMA models
Didnelpsun/Math
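Mathematics for the Chinese graduate entrance exam (Math I), covering advanced mathematics, linear algebra, and probability and statistics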
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
china-ai-law-challenge/CAIL2019
triton-inference-server/pytorch_backend
The Triton backend for the PyTorch TorchScript models.
FittenTech/OpenLLaMA-Chinese
OpenLLaMA-Chinese, permissively licensed open-source instruction-following models based on OpenLLaMA
GemmaLab/GemmaChinese
Chinese Community for Google Gemma LLM
void-main/fastertransformer_backend