garyfanhku's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps to showcase Meta Llama 3 for WhatsApp & Messenger.
stas00/ml-engineering
Machine Learning Engineering Open Book
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
lavague-ai/LaVague
Large Action Model framework to develop AI Web Agents
sb2nov/resume
Software developer resume in LaTeX
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
tensorchord/Awesome-LLMOps
An awesome & curated list of best LLMOps tools for developers
modelscope/agentscope
Start building LLM-empowered multi-agent applications in an easier way.
openai/weak-to-strong
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.
hao-ai-lab/LookaheadDecoding
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
jxmorris12/vec2text
Utilities for decoding deep representations (like sentence embeddings) back to text
ContextualAI/gritlm
Generative Representational Instruction Tuning
persimmon-ai-labs/adept-inference
Inference code for Persimmon-8B
likenneth/honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
voidism/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
zeux/calm
CUDA/Metal accelerated language model inference
ZIYU-DEEP/Awesome-Information-Bottleneck
This is a curated list for Information Bottleneck Principle, in memory of Professor Naftali Tishby.
OpenBMB/BMPrinciples
A collection of phenomena observed during the scaling of big foundation models, which may develop into consensus, principles, or laws in the future
for-ai/parameter-efficient-moe
usyd-fsalab/fp6_llm
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5)
rvorias/ind_knn_ad
Industrial kNN-based anomaly detection for images. Visit the Streamlit link to check out the demo.
davidbau/baukit
Open-All-Scale-Causal-Engine/OpenASCE
OpenASCE (Open All-Scale Causal Engine) is a Python package for end-to-end large-scale causal learning. It provides causal discovery, causal effect estimation, and attribution algorithms all in one package.
NVIDIA/workbench-example-hybrid-rag
An NVIDIA AI Workbench example project for Retrieval Augmented Generation (RAG)
pliang279/FactorCL
[NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
adrienbrault/hermes2pro-proxy
Use Hermes-2-Pro-Mistral-7B function calling with your OpenAI API compatible code.