nghuyong's Stars
krahets/hello-algo
《Hello 算法》("Hello Algo"): a data structures and algorithms tutorial with animated illustrations and one-click-runnable code. Provides code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; an English version is in progress.
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
bentoml/OpenLLM
Run any open-source LLM, such as Llama or Mistral, as an OpenAI-compatible API endpoint in the cloud.
mistralai/mistral-inference
Official inference library for Mistral models
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
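The BPE algorithm that minbpe implements can be summarized as: treat the text as a byte sequence, repeatedly count adjacent token pairs, and merge the most frequent pair into a new token id. The snippet below is an illustrative sketch of that loop, not minbpe's actual API; the function name `train_bpe` is my own.

```python
from collections import Counter

def train_bpe(text: str, num_merges: int):
    """Learn BPE merges over the UTF-8 bytes of `text` (illustrative sketch)."""
    ids = list(text.encode("utf-8"))   # start from raw byte ids 0..255
    merges = {}                        # (a, b) -> new token id
    next_id = 256                      # new tokens get ids above the byte range
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))  # count adjacent pairs
        if not pairs:
            break
        pair = max(pairs, key=pairs.get)    # most frequent adjacent pair
        merges[pair] = next_id
        # replace every occurrence of `pair` with the new token id
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return merges, ids
```

On the classic example string "aaabdaaabac", three merges first fuse the frequent "aa" pair, then build longer tokens from it, shrinking the sequence from 11 byte ids to 5 token ids.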
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
leptonai/search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
deepseek-ai/DeepSeek-Coder
DeepSeek Coder: Let the Code Write Itself
gee1k/uPic
📤 uPic is a native, powerful, beautiful, and simple picture and file upload tool for macOS.
DLLXW/baby-llama2-chinese
A repository for pre-training a small-parameter Chinese LLaMA2 from scratch and then applying SFT; a single 24 GB GPU is enough to train a chat-llama2 with basic Chinese question-answering ability.
TigerResearch/TigerBot
TigerBot: A multi-language multi-task LLM
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
deepseek-ai/DeepSeek-LLM
DeepSeek LLM: Let there be answers
bigscience-workshop/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2 TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.
bigscience-workshop/bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
wangyuxinwhy/uniem
unified embedding model
haonan-li/CMMLU
CMMLU: Measuring massive multitask language understanding in Chinese
IEIT-Yuan/Yuan-2.0
Yuan 2.0 Large Language Model
xverse-ai/XVERSE-13B
XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.
yangjianxin1/LLMPruner
twang2218/vocab-coverage
Analysis of the Chinese comprehension ability of language models.
FudanNLPLAB/CBook-150K
MD5 links for a Chinese book corpus.
liziniu/ReMax
Code for the paper "ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models".
sh0416/llama-classification
Text classification with Foundation Language Model LLaMA
leogao2/lm_dataformat
xsysigma/TencentLLMEval
TencentLLMEval is a comprehensive and extensive benchmark for the human evaluation of large models, including task trees, standards, data verification methods, and more.
gmftbyGMFTBY/Rep-Dropout
[NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective