garyfanhku's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps to showcase Meta Llama 3 for WhatsApp & Messenger.
stas00/ml-engineering
Machine Learning Engineering Open Book
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
lavague-ai/LaVague
Large Action Model framework to develop AI Web Agents
sb2nov/resume
Software developer resume in LaTeX
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
tensorchord/Awesome-LLMOps
An awesome & curated list of best LLMOps tools for developers
modelscope/agentscope
Start building LLM-empowered multi-agent applications in an easier way.
openai/weak-to-strong
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.
hao-ai-lab/LookaheadDecoding
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
jxmorris12/vec2text
Utilities for decoding deep representations (like sentence embeddings) back to text
ContextualAI/gritlm
Generative Representational Instruction Tuning
persimmon-ai-labs/adept-inference
Inference code for Persimmon-8B
likenneth/honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
voidism/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
zeux/calm
CUDA/Metal accelerated language model inference
ZIYU-DEEP/Awesome-Information-Bottleneck
This is a curated list for Information Bottleneck Principle, in memory of Professor Naftali Tishby.
OpenBMB/BMPrinciples
A collection of phenomena observed during the scaling of big foundation models, which may develop into consensus, principles, or laws in the future
for-ai/parameter-efficient-moe
usyd-fsalab/fp6_llm
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5)
rvorias/ind_knn_ad
Industrial kNN-based anomaly detection for images. Visit the Streamlit link to check out the demo.
davidbau/baukit
Open-All-Scale-Causal-Engine/OpenASCE
OpenASCE (Open All-Scale Causal Engine) is a Python package for end-to-end large-scale causal learning. It provides causal discovery, causal effect estimation, and attribution algorithms all in one package.
NVIDIA/workbench-example-hybrid-rag
An NVIDIA AI Workbench example project for Retrieval Augmented Generation (RAG)
pliang279/FactorCL
[NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
adrienbrault/hermes2pro-proxy
Use Hermes-2-Pro-Mistral-7B function calling with your OpenAI API compatible code.