Emanual20

3rd-year M.S. student at the Gaoling School of Artificial Intelligence, Renmin University of China @RUC-GSAI

Renmin University of China, Haidian, Beijing

Emanual20's Stars

315386775/DeepLearing-Interview-Awesome-2024
A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, also covering new ideas, new questions, new resources, and new projects from work and research
1.3k stars, 122 forks
mlfoundations/open_clip
An open source implementation of CLIP.
Language: Python, 9.3k stars, 924 forks
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python, 23.4k stars, 3.3k forks
mlfoundations/MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
971
NLP2CT/LLM-generated-Text-Detection
A survey and reflection on the latest research breakthroughs in LLM-generated Text detection, including data, detectors, metrics, current issues and future directions.
15011
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Language: Python, 3.9k stars, 282 forks
lyy1994/awesome-data-contamination
The Paper List on Data Contamination for Large Language Models Evaluation.
401
openai/gpt-2-output-dataset
Dataset of GPT-2 outputs for research in detection, biases, and more
Language: Python, 1.9k stars, 548 forks
RUCAIBox/LLMBox
A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.
Language:Python50766
liyucheng09/LatestEval
Latest Evaluation Toolkit (LatestEval). Assessing the language models with latest, uncontaminated materials.
Language:Python17
acl-org/acl-style-files
Official style files for papers submitted to venues of the Association for Computational Linguistics
Language:TeX637168
srush/Tensor-Puzzles
Solve puzzles. Improve your pytorch.
Language: Jupyter Notebook, 2.9k stars, 241 forks
DjangoPeng/LLM-quickstart
Quick Start for Large Language Models (Theoretical Learning and Practical Fine-tuning)
Language:Jupyter Notebook395308
liyucheng09/llm-compressive
Longitudinal Evaluation of LLMs via Data Compression
Language:Python24
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python, 23.5k stars, 2.5k forks
wangshusen/RecommenderSystem
2k stars, 304 forks
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Language: Python, 12.8k stars, 1k forks
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Language: Python, 11.2k stars, 758 forks
OpenBMB/MiniCPM
MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
Language: Python, 4.5k stars, 321 forks
ChenghaoMou/text-dedup
All-in-one text de-duplication
Language:Python55368
lixin4ever/Conference-Acceptance-Rate
Acceptance rates for the major AI conferences
Language: Jupyter Notebook, 4k stars, 288 forks
RUC-GSAI/Yulan-GARDEN
Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"
Language:Python405
DaoD/ResearchFigure
Some example codes for drawing figures in research paper
Language:Python255
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Language: Python, 14.4k stars, 2.6k forks
noanabeshima/wikipedia-downloader
Downloads 2020 English Wikipedia articles as plaintext
Language:Python194
openai/gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
Language: Python, 22k stars, 5.4k forks
ray-project/llm-numbers
Numbers every LLM developer should know
4k stars, 138 forks
modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Language: Python, 1.8k stars, 122 forks
liyucheng09/Contamination_Detector
Lightweight tool to identify Data Contamination in LLMs evaluation
Language:Python321
LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
2.8k stars, 656 forks
