wuhenbai's Stars
chinese-poetry/chinese-poetry
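The most comprehensive database of Chinese poetry 🧶 Covers nearly 14,000 poets of the Tang and Song dynasties, with close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci poems by 1,564 ci poets of the Song period.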
Xirider/finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed
facebookresearch/MetaICL
An original implementation of "MetaICL: Learning to Learn In Context" by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi
karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
LAION-AI/laion-datasets
Descriptions of and pointers to LAION datasets
davidvrba/Stackoverflow-Data-Analysis
Analysis of the Stack Overflow dataset
Shark-NLP/DiffuSeq
[ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
blmoistawinde/HarvestText
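Text mining and preprocessing toolkit (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods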
bytedance/matxscript
A high-performance, extensible Python AOT compiler.
GaoPeng97/transformer-xl-chinese
An attempt at Chinese text generation with Transformer-XL (can write novels and classical poetry)
CLUEbenchmark/CLUEDatasetSearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
huggingface/trl
Train transformer language models with reinforcement learning.
sail-sg/poolformer
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
zjunlp/DeepKE
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
Morizeyao/GPT2-Chinese
Chinese version of GPT2 training code, using BERT tokenizer.
xiaolincoder/CS-Base
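Illustrated guides to computer networking, operating systems, computer organization, and databases: 1,000 diagrams plus 500,000 characters of text, demystifying obscure computer science fundamentals so that no rote interview topic ("八股文") is hard to understand! 🚀 Read online: https://xiaolincoding.com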
fuergaosi233/wechat-chatgpt
Use ChatGPT on WeChat via Wechaty
nghuyong/cscd-ns
Code and data for "CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers"
letiantian/Pinyin2Hanzi
Pinyin-to-Chinese-character conversion, a pinyin input method engine, pin yin -> 拼音
CrazyBoyM/dreambooth-for-diffusion
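Training toolbox for large text-to-image models (fully wraps the Stable Diffusion full fine-tuning workflow; trains your own custom styles and character concepts; works out of the box; includes automatic image captioning, weight conversion, training parameter configuration, etc.)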
zjunlp/OpenUE
[EMNLP 2020] OpenUE: An Open Toolkit of Universal Extraction from Text
BDBC-KG-NLP/QA-Survey-CN
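A summary of research and applications in intelligent question answering by the NLP research team at Beihang University's Big Data Advanced Innovation Center. It covers knowledge-graph-based QA (KBQA), text-based QA (TextQA), table-based QA (TableQA), visual QA (VisualQA), and machine reading comprehension (MRC), with separate summaries of academic and industry work for each task type.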
infinilabs/analysis-pinyin
🛵 This Pinyin Analysis plugin converts between Chinese characters and Pinyin.
thunlp/DeepTHULAC
A High-Performance Lexical Analyzer for Chinese
TingFree/NLPer-Arsenal
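A collection of NLP competition strategy implementations, baselines for various tasks, competition experience posts (current, past, and practice competitions), NLP conference dates, popular media accounts, GPU recommendations, and more; continuously updated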
ghrua/NgramRes
yxuansu/Contrastive_Search_Is_What_You_Need
[TMLR'23] Contrastive Search Is What You Need For Neural Text Generation
eyriewow/merge-models
Merges two latent diffusion models at a user-defined ratio
HillZhang1999/MuCGEC
Open-source release of the MuCGEC Chinese error correction dataset and SOTA text correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction"
labring/laf
Laf is a vibrant cloud development platform that provides essential tools like cloud functions, databases, and storage solutions. It enables developers to quickly unleash their creativity and bring innovative ideas to life with ease.