haonan-li

NLP Postdoctoral Fellow at MBZUAI.

Abu Dhabi, UAE

haonan-li's Stars

chinese-poetry/chinese-poetry
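The most comprehensive database of Chinese poetry 🧶: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 Song-era lyricists.
Language: JavaScript 47.9k 1.2k 2039.6k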
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Language: Python 36.4k 349 1.8k 4.5k
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python 34.7k 342 2.7k 4k
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python 29.3k 339 2674k
tloen/alpaca-lora
Instruct-tune LLaMA on consumer hardware
Language: Jupyter Notebook 18.5k 152 4692.2k
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment
Language: Python 18.2k 183 7301.9k
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Language: Python 15.8k 104 1k 1.5k
TransformerOptimus/SuperAGI
<⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
Language: Python 15.3k 173 4051.8k
dair-ai/ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.
9.9k 839 3572
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Language: Python 6.4k 36 1k 1.7k
baichuan-inc/Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Language: Python 5.7k 66 129506
HazyResearch/flash-attention
Fast and memory-efficient exact attention
Language: Python 4.3k 71 268365
hiyouga/ChatGLM-Efficient-Tuning
Efficient fine-tuning of ChatGLM-6B with PEFT
Language: Python 3.7k 32 374471
yuchenlin/rebiber
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
Language: Python 2.6k 15 29156
openai/human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
Language: Python 2.3k 133 35330
gururise/AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
Language: Python 1.5k 27 25146
SJTU-LIT/ceval
Official github repo for C-Eval, a Chinese evaluation suite for foundation models
Language: Python 1k 15 5153
Cerebras/modelzoo
Language: Python 908 25 17128
primeqa/primeqa
The prime repository for state-of-the-art Multilingual Question Answering research and development.
Language: Python 724 28 32757
haonan-li/CMMLU
CMMLU: Measuring massive multitask language understanding in Chinese
Language: Python 668 11 3649
peci1/nvidia-htop
A tool for enriching the output of nvidia-smi.
Language: Python 531 10 1458
ryanzhumich/Contrastive-Learning-NLP-Papers
Paper List for Contrastive Learning for Natural Language Processing
530 13 259
google-research-datasets/tydiqa
TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and without the use of translation, and is designed for the training and evaluation of automatic question answering systems. This repository provides evaluation code and a baseline system for the dataset.
Language: Python 289 10 842
deepmind/xquad
165 13 431
apple/ml-mkqa
We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. For details, see our paper, "MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering".
Language: Python 162 12 121
zwhe99/MAPS-mt
[TACL 2024] MAPS enables LLMs🤖 to mimic the human😁 translation process.
Language: Python 132 8 75
mbzuai-nlp/bactrian-x
A Multilingual Replicable Instruction-Following Model
Language: Python 91 9 43
abrazinskas/SelSum
Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.
Language: Python 44 5 03
ChunhuaLiu596/WAX
The repository describing a novel dataset for word association explanations
Language: Python 10 3 02
haonan-li/QA-Datasets
A summary of QA datasets.
1 1 00
