chenhehong's Stars
meta-llama/llama
Inference code for Llama models
streamlit/streamlit
Streamlit — A faster way to build and share data apps.
JushBJJ/Mr.-Ranedeer-AI-Tutor
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
RUCAIBox/LLMSurvey
The official GitHub page for the survey paper "A Survey of Large Language Models".
facebookresearch/hydra
Hydra is a framework for elegantly configuring complex applications
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
goto456/stopwords
Commonly used Chinese stopword lists (the HIT stopword list, the Baidu stopword list, etc.)
esbatmop/MNBVC
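MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even text written in "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.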
X-PLUG/MobileAgent
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
modelscope/data-juicer
Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
LC1332/Chat-Haruhi-Suzumiya
Chat-Haruhi-Suzumiya (Chat凉宫春日), an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.
deepseek-ai/DeepSeek-LLM
DeepSeek LLM: Let there be answers
Anil-matcha/Awesome-GPT-Store
Custom GPT Store - A collection of major publicly available GPTs
facebookresearch/cc_net
Tools to download and cleanup Common Crawl data
xxxily/hello-ai
It's not AI that takes away your job, but the people who master the use of AI tools. The most deadly attack is a dimension-reducing strike: destroying you has nothing to do with you - from "The Three-Body Problem".
InternLM/InternLM-techreport
alibaba/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
lafmdp/Awesome-Papers-Autonomous-Agent
A collection of recent papers on building autonomous agents. Two topics included: RL-based / LLM-based agents.
X-PLUG/CValues
Research on evaluating and aligning the values of Chinese large language models
bojone/papers.cool
Cool Papers - Immersive Paper Discovery
57ing/Sensitive-word
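A fairly comprehensive collection of sensitive words, subdivided into word lists for violence and terrorism, reactionary content, livelihood issues, pornography, corruption, and other categories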
FudanNLPLAB/CBook-150K
MD5 links for a Chinese book corpus
X-PLUG/Multi-LLM-Agent
X-PLUG/mPLUG-HalOwl
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating
sotopia-lab/awesome-social-agents
A collection of works that investigate social agents, simulations and their real-world impact in text, embodied, and robotics contexts.
X-PLUG/SocialBench
RoleInteract: Evaluating the Social Interaction of Role-Playing Agents