tytemp's Stars
km1994/llms_paper
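This repository mainly collects reading notes on top-conference papers relevant to LLM algorithm engineers (multimodal, PEFT, few-shot QA, RAG, LMM interpretability, Agents, CoT)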
epfl-dlab/forc
Framework for Cost-Effective Language Model Choice
binzhouchn/deep_learning
A class of machine learning algorithms that simulate neuron function and network structure to perform cognitive tasks
Guang000/Awesome-Dataset-Distillation
A curated list of awesome papers on dataset distillation and related applications.
thunlp/THUOCL
THUOCL (THU Open Chinese Lexicon), a Chinese lexicon
henryhust/SuicideDetector
A multilingual text detector for suicidal tendencies
NiuTrans/Classical-Modern
A very comprehensive parallel corpus of Classical Chinese (ancient texts) and modern Chinese
rime-aca/corpus
Classical Chinese corpus
NLP2CT/norm-nmt
Norm-Based Curriculum Learning for Neural Machine Translation (ACL 2020)
kpu/kenlm
KenLM: Faster and Smaller Language Model Queries
jiali-ms/JLM
A fast LSTM Language Model for large-vocabulary languages such as Japanese and Chinese
yandex/faster-rnnlm
Faster Recurrent Neural Network Language Modeling Toolkit with Noise Contrastive Estimation and Hierarchical Softmax
FengZiYjun/CharLM
Character-aware Neural Language Model implemented in PyTorch
mattzheng/py-kenlm-model
Python | Efficient use of the statistical language model kenlm: new word discovery, word segmentation, intelligent error correction, etc.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
sangmichaelxie/doremi
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
OpenBMB/BMPrinciples
A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or laws in the future
shayne-longpre/a-pretrainers-guide
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
lonePatient/awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models