thsno02's Stars
openai/openai-cookbook
Examples and guides for using the OpenAI API
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
lobehub/lobe-chat
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Vision/TTS/Plugins/Artifacts). One-click FREE deployment of your private ChatGPT/Claude application.
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
microsoft/AI-For-Beginners
12 Weeks, 24 Lessons, AI for All!
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
openai/gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
HqWu-HITCS/Awesome-Chinese-LLM
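A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train. Covers base models, domain-specific fine-tuning and applications, datasets, tutorials, and more.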
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
huggingface/trl
Train transformer language models with reinforcement learning.
deezertidal/shadowrocket-rules
Shadowrocket ("Little Rocket") configuration files, modules, scripts, module, sgmodule, illustrated tutorials, rules, traffic splitting, cracking, unlocking
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
h2oai/h2o-llmstudio
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
esbatmop/MNBVC
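MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.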
net4people/bbs
Forum for discussing Internet censorship circumvention
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
openai/weak-to-strong
Ucas-HaoranWei/Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
fwwdn/sensitive-stop-words
A lexicon of sensitive words and stop words commonly used on the Internet
OpenLMLab/MOSS-RLHF
Secrets of RLHF in Large Language Models Part I: PPO
XiongjieDai/GPU-Benchmarks-on-LLM-Inference
Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?
SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.
corca-ai/awesome-llm-security
A curation of awesome tools, documents and projects about LLM Security.
AILab-CVC/UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
DUOMO/TransGPT
jkiss/sensitive-words
A lexicon of sensitive words commonly used on the Internet
ari-holtzman/degen
Official Repository for "The Curious Case of Neural Text Degeneration"
lxs602/Chinese-Mandarin-Dictionaries
Chinese dictionaries (Simplified and Traditional). Chinese / Chinese-English dictionaries.