minhopark-neubla's Stars
Taytay/slack-langchain
Slackbot that uses Langchain to integrate with LLMs
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
microsoft/autogen
A programming framework for agentic AI 🤖
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
mistralai/mistral-inference
Official inference library for Mistral models
nlpxucan/WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, and WizardMath
f/awesome-chatgpt-prompts
A curated collection of ChatGPT prompts for getting better results from ChatGPT.
msoedov/langcorn
⛓️ Serving LangChain LLM apps and agents automagically with FastAPI. LLMOps
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
OpenPipe/OpenPipe
Turn expensive prompts into cheap fine-tuned models
Noeda/rllama
Rust+OpenCL+AVX2 implementation of LLaMA inference code
rustformers/llm
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
modularml/mojo
The Mojo Programming Language
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
NomaDamas/awesome-korean-llm
Awesome list of Korean Large Language Models.
pola-rs/polars
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
pantsbuild/pants
The Pants Build System
cwpearson/nvidia-performance-tools
Instructions, Docker images, and examples for Nsight Compute and Nsight Systems
cli99/llm-analysis
Latency and Memory Analysis of Transformer Models for Training and Inference
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
karpathy/llama2.c
Inference Llama 2 in one file of pure C
dabeaz-course/python-mastery
Advanced Python Mastery (course by @dabeaz)
huggingface/text-generation-inference
Large Language Model Text Generation Inference
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
pybind/cmake_example
Example pybind11 module built with a CMake-based build system
pybind/python_example
Example pybind11 module built with a Python-based build system
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
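DeepSpeed is driven by a JSON config passed at launch; as a hedged sketch (field values are illustrative, not a recommendation), a minimal ZeRO stage-2 mixed-precision config might look like:

```json
{
  "train_batch_size": 32,
  "gradient_accumulation_steps": 1,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" }
  }
}
```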
Beomi/KoAlpaca
KoAlpaca: an open-source language model that understands Korean instructions
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
cjaques/pybind_examples
A self-contained example of using pybind11