cslydia's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
meta-llama/llama3
The official Meta Llama 3 GitHub site
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
QwenLM/Qwen
The official repo of Qwen (通义千问), the chat and pretrained large language models developed by Alibaba Cloud.
opendatalab/MinerU
A one-stop, open-source, high-quality data-extraction tool; supports PDF, webpage, and e-book extraction.
THUDM/ChatGLM3
ChatGLM3 series: open bilingual chat LLMs
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
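For reference, the basic encode/decode round trip looks like this; a minimal sketch, where "cl100k_base" is one of the library's built-in encodings:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")           # encoding used by recent OpenAI chat models
tokens = enc.encode("tiktoken is a fast BPE tokeniser")
print(tokens)                                        # list of integer token ids
print(enc.decode(tokens))                            # round-trips back to the original string
```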
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
cleanlab/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
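A minimal sketch of cleanlab's label-issue detection; the toy labels and probabilities below are made up for illustration, and in practice pred_probs would come from a classifier's cross-validated predictions:

```python
import numpy as np
from cleanlab.filter import find_label_issues

labels = np.array([0, 1, 1, 0, 1])       # noisy observed labels
pred_probs = np.array([                  # out-of-sample class probabilities
    [0.9, 0.1],
    [0.2, 0.8],
    [0.8, 0.2],                          # model disagrees with label 1 here
    [0.7, 0.3],
    [0.1, 0.9],
])
issue_idx = find_label_issues(
    labels=labels,
    pred_probs=pred_probs,
    return_indices_ranked_by="self_confidence",
)
print(issue_idx)                         # indices of examples most likely mislabeled
```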
THUDM/GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
baichuan-inc/Baichuan-7B
A large-scale 7B pretrained language model developed by Baichuan Inc.
google/BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
FranxYao/chain-of-thought-hub
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
hkust-nlp/ceval
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
microsoft/CodeXGLUE
CodeXGLUE: a benchmark dataset for code understanding and generation
microsoft/mup
maximal update parametrization (µP)
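A hedged sketch of the workflow described in the mup README: replace the output layer with MuReadout, register base shapes against small proxy models, and use a µP-aware optimizer so learning rates tuned on a narrow model transfer to a wide one. The make_model helper and the widths here are illustrative, not from the repo:

```python
import torch.nn as nn
from mup import MuReadout, set_base_shapes, MuAdam

def make_model(width):
    return nn.Sequential(
        nn.Linear(64, width),
        nn.ReLU(),
        MuReadout(width, 10),            # µP-corrected output layer
    )

model = make_model(width=1024)           # the model you actually train
base = make_model(width=64)              # proxy model defining base shapes
delta = make_model(width=128)            # second proxy, to infer which dims scale
set_base_shapes(model, base, delta=delta)
opt = MuAdam(model.parameters(), lr=1e-2)  # lr tuned on the small proxy transfers
```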
bigscience-workshop/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
hendrycks/test
Measuring Massive Multitask Language Understanding | ICLR 2021
Duxiaoman-DI/XuanYuan
XuanYuan: Du Xiaoman's Chinese financial-dialogue large language model
ORDINAND/The-Art-of-Asking-ChatGPT-for-High-Quality-Answers-A-complete-Guide-to-Prompt-Engineering-Technique
ChatGPT prompting techniques
bigscience-workshop/bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
ZhuiyiTechnology/roformer
Rotary Transformer
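The core idea here, rotary position embedding (RoPE), rotates each pair of query/key feature dimensions by an angle that grows with the position index. A minimal NumPy sketch of that idea; the rope helper is illustrative, not the repo's code:

```python
import numpy as np

def rope(x, base=10000.0):
    """x: (seq_len, dim) with even dim; returns x with rotary embedding applied."""
    seq_len, dim = x.shape
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)   # theta_i, shape (dim/2,)
    angles = np.outer(np.arange(seq_len), inv_freq)    # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                    # interleaved dimension pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                 # 2-D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q = rope(np.random.randn(8, 16))                       # apply to queries (and keys)
```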
bigcode-project/bigcode-dataset
thu-coai/COLDataset
The official repository of the paper "COLD: A Benchmark for Chinese Offensive Language Detection"
thaumstrial/FinetuneGLMWithPeft
A simple implementation of using LoRA from the peft library to fine-tune ChatGLM-6B
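A hedged sketch of that pattern using the peft API; "query_key_value" is ChatGLM-6B's fused attention projection, but target_modules varies by architecture, and the hyperparameters below are illustrative rather than taken from this repo:

```python
from transformers import AutoModel
from peft import LoraConfig, get_peft_model

model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
config = LoraConfig(
    r=8,                                 # rank of the low-rank update
    lora_alpha=32,                       # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # ChatGLM-6B's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()       # only the adapter weights are trained
```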
OpenLMLab/ChatZoo
A lightweight local website for displaying the performance of different chat models.
dqxiu/KAssess
RyanBurnell/revealing-LLM-capabilities
Code and data for the paper "Revealing the structure of language model capabilities"