ningpengtao-coder's Stars
colinhacks/zod
TypeScript-first schema validation with static type inference
servo/servo
Servo, the embeddable, independent, memory-safe, modular, parallel web rendering engine
taichi-dev/taichi
Productive, portable, and performant GPU programming in Python.
huggingface/text-generation-inference
Large Language Model Text Generation Inference
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
zilliztech/GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
fluxcd/flux2
Open and extensible continuous delivery solution for Kubernetes. Powered by GitOps Toolkit.
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
versotile-org/verso
A web browser that plays old world blues to build new world hope
cdarlint/winutils
winutils.exe, hadoop.dll, and hdfs.dll binaries for Hadoop on Windows
noamgat/lm-format-enforcer
Enforce the output format (JSON Schema, Regex etc) of a language model
datageartech/datagear
DataGear data visualization and analytics platform; freely build any data dashboard you want
itsOwen/CyberScraper-2077
A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama
hao-ai-lab/LookaheadDecoding
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
alibaba/Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud.
IST-DASLab/marlin
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
feifeibear/LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
hemingkx/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
fighting41love/DeepLearning-500-questions
Deep Learning 500 Questions: a Q&A-style treatment of frequently used topics in probability, linear algebra, machine learning, deep learning, computer vision, and other hot areas, written to help the author and interested readers. The book comprises 15 chapters and nearly 200,000 characters. Given the author's limited expertise, readers are kindly asked to point out any errors. To be continued... For collaboration inquiries, contact scutjy2015@163.com. All rights reserved; violations will be pursued. Tan 2018.06
lucidrains/speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
hemingkx/Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
MoonshotAI/moonpalace
MoonPalace (月宫) is an API debugging tool provided by Moonshot AI.
neuralmagic/AutoFP8
conveyordata/data-product-portal
Data product portal created by Dataminded
shreyansh26/Speculative-Sampling
Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind
romsto/Speculative-Decoding
Implementation of the paper "Fast Inference from Transformers via Speculative Decoding" (Leviathan et al., 2023).
uw-mad-dash/decoding-speculative-decoding
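Several of the starred repos above (LLMSpeculativeSampling, lucidrains/speculative-decoding, Spec-Bench, shreyansh26/Speculative-Sampling, romsto/Speculative-Decoding) implement the accept/reject step at the heart of speculative decoding. A minimal sketch of that step, using toy categorical distributions rather than real model logits (function name and setup are illustrative, not taken from any of these repos):

```python
import random

def speculative_step(p, q, draft_token, rng=None):
    """One accept/reject step of speculative sampling (Leviathan et al., 2023).

    p: target-model token distribution (list of probabilities over the vocab)
    q: draft-model token distribution over the same vocab
    draft_token: token index the draft model proposed by sampling from q
    Returns (token, accepted).
    """
    rng = rng or random.Random()
    # Accept the draft token with probability min(1, p[x] / q[x]).
    if rng.random() < min(1.0, p[draft_token] / q[draft_token]):
        return draft_token, True
    # On rejection, resample from the residual max(0, p - q), renormalized.
    # This correction keeps the overall output distribution exactly p.
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(residual)
    weights = [r / total for r in residual]
    return rng.choices(range(len(p)), weights=weights)[0], False
```

In a full system the draft model proposes several tokens per target-model forward pass and this step is applied to each in sequence, stopping at the first rejection; the speedup comes from verifying the whole draft in one batched target pass.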