garyfanhku's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.
stas00/ml-engineering
Machine Learning Engineering Open Book
dataelement/bisheng
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SFT, Dataset Management, Enterprise-level System Management, Observability and more.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
lavague-ai/LaVague
Large Action Model framework to develop AI Web Agents
sb2nov/resume
Software developer resume in LaTeX
modelscope/agentscope
Start building LLM-empowered multi-agent applications in an easier way.
tensorchord/Awesome-LLMOps
An awesome & curated list of the best LLMOps tools for developers
openai/weak-to-strong
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
SylphAI-Inc/AdalFlow
AdalFlow: The library to build & auto-optimize any LLM tasks.
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
hao-ai-lab/LookaheadDecoding
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
OpenAdaptAI/OpenAdapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
jxmorris12/vec2text
Utilities for decoding deep representations (like sentence embeddings) back to text
ContextualAI/gritlm
Generative Representational Instruction Tuning
UbiquitousLearning/mllm
Fast Multimodal LLM on Mobile Devices
zeux/calm
CUDA/Metal accelerated language model inference
ZIYU-DEEP/Awesome-Information-Bottleneck
A curated list of resources on the Information Bottleneck principle, in memory of Professor Naftali Tishby.
for-ai/parameter-efficient-moe
NVIDIA/workbench-example-hybrid-rag
An NVIDIA AI Workbench example project for Retrieval Augmented Generation (RAG)
usyd-fsalab/fp6_llm
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).
rvorias/ind_knn_ad
Vanilla torch and timm industrial knn-based anomaly detection for images.
StonyBrookNLP/musique
Repository for MuSiQue: Multi-hop Questions via Single-hop Question Composition, TACL 2022
Open-All-Scale-Causal-Engine/OpenASCE
OpenASCE (Open All-Scale Causal Engine) is a Python package for end-to-end large-scale causal learning. It provides causal discovery, causal effect estimation, and attribution algorithms all in one package.
pliang279/FactorCL
[NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
NSTiwari/Llama3-on-Mobile
This repository quantizes and converts the Llama3-8B-Instruct model weights and deploys the model on Android for on-device inference.
adrienbrault/hermes2pro-proxy
Use Hermes-2-Pro-Mistral-7B function calling with your OpenAI API compatible code.