wwngh1233

I'm Qu Chen, a student researching computer science.

wwngh1233's Stars

THUDM/ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual conversational language model
Language: Python | 15.7k 132 6151.9k
HqWu-HITCS/Awesome-Chinese-LLM
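A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are inexpensive to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.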
15.4k 198 261.4k
openlm-research/open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
7.4k 121 91375
InternLM/InternLM
Official release of InternLM2.5 base and chat models. 1M context support
Language: Python | 6.3k 55 331443
baichuan-inc/Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Language: Python | 5.7k 67 129507
lyogavin/airllm
AirLLM 70B inference with single 4GB GPU
Language: Jupyter Notebook | 4.6k 118 166364
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Language: Python | 3.9k 24 529410
jeinlee1991/chinese-llm-benchmark
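Chinese LLM capability benchmark leaderboard: currently covers 115 large models, including commercial models such as chatgpt, gpt4o, Baidu ERNIE Bot (文心一言), Alibaba Tongyi Qianwen, iFlytek Spark, SenseTime senseChat, and minimax, as well as open-source models such as Baichuan, qwen2, glm4, yi, InternLM2, and llama3, evaluated across multiple capability dimensions. It provides not only a ranked leaderboard of capability scores but also the raw outputs of every model!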
2.6k 33 46122
tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Language: Jupyter Notebook | 1.5k 7 142234
Xwin-LM/Xwin-LM
Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment
Language: Python | 1k 37 2041
GAIR-NLP/factool
FacTool: Factuality Detection in Generative AI
Language: Python | 815 10 2861
haonan-li/CMMLU
CMMLU: Measuring massive multitask language understanding in Chinese
Language: Python | 683 11 3652
epfLLM/Megatron-LLM
distributed trainer for LLMs
Language: Python | 533 18 5976
OpenLMLab/GAOKAO-Bench
GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.
Language: Python | 527 4 2437
the-crypt-keeper/can-ai-code
Self-evaluating interview for AI coders
Language: Python | 525 11 22130
GAIR-NLP/abel
SOTA Math Opensource LLM
Language: Python | 306 10 1317
tianyi-lab/Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
Language: Python | 291 3 2420
FlagOpen/FlagEval
FlagEval is an evaluation toolkit for AI large foundation models.
Language: Python | 290 13 3128
kwai/KwaiYii
217 8 124
sufengniu/RefGPT
156 3 021
getcursor/eval
Language: Python | 109 6 016
OpenMOSS/HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
Language: Python | 109 5 04
360CVGroup/SEEChat
Multimodal chatbot with computer vision capabilities integrated
Language: Python | 98 3 69
MikeGu721/XiezhiBenchmark
Language: Python | 90 1 94
Felixgithub2017/MMCU
MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING
Language: Python | 87 2 1112
my-other-github-account/llm-humaneval-benchmarks
Language: Jupyter Notebook | 86 6 15
qcri/LLMeBench
Benchmarking Large Language Models
Language: Python | 79 13 3516
HillZhang1999/NaSGEC
Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings)
Language: Python | 75 1 206
hongwang600/FLAIR
Language: Python | 56 2 07
h2oai/h2o-LLM-eval
Large-language Model Evaluation framework with Elo Leaderboard and A-B testing
Language: Jupyter Notebook | 49 39 61
