laohuangma's Stars
yet-another-account/openwebtext
An open clone of the GPT-2 WebText dataset by OpenAI. Still WIP.
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
openai/lm-human-preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
mandarjoshi90/SpanBERT
Code for using and evaluating SpanBERT.
niderhoff/big-data-datasets
Curated list of publicly available Big Data datasets. Uncompressed size in brackets. No Blockchains.
niderhoff/nlp-datasets
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
awesomedata/awesome-public-datasets
A topic-centric list of HQ open datasets.
topogram/weiboscope-data
Download, extract and index Weiboscope data
Lab41/sunny-side-up
Sentiment Analysis Challenge
jlshix/movielens-douban-dataset
Crawled 48233 records from Douban and intersected them with the movielens ml-latest dataset, yielding 15752 records in common
xiaopangxia/kuakua_corpus
Kuakua (praise) corpus, built from data from Douban's mutual-praise group
codemayq/chinese-chatbot-corpus
Public Chinese chat corpus
DeepPavlovAdmin/convai
huggingface/transfer-learning-conv-ai
🦄 State-of-the-Art Conversational AI with Transfer Learning
Accentax/gpt-2
Trainable GPT-2
ftarlaci/GPT2sQA
Fine-tuning GPT-2 Small for Question Answering
graykode/gpt-2-Pytorch
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
pytorch/hub
Submission to https://pytorch.org/hub/
Morizeyao/GPT2-Chinese
Chinese version of GPT2 training code, using BERT tokenizer.
ConnorJL/GPT2
An implementation of training for GPT2, supports TPUs
zihangdai/xlnet
XLNet: Generalized Autoregressive Pretraining for Language Understanding
ryankiros/neural-storyteller
A recurrent neural network for generating little stories about images
seraphinatarrant/plan-write-revise
Code for a web demo of Plan, Write, and Revise: a neural system for interactive open-domain story generation
readtedium/eleventy-boilerplate
openai/gpt-2-output-dataset
Dataset of GPT-2 outputs for research in detection, biases, and more
pytorch/cpuinfo
CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS)
hughbzhang/HUSE
Official Github repo for the paper "Unifying Human and Statistical Evaluation for Natural Language Generation"
minimaxir/gpt-2-simple
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
wireshark/wireshark
Read-only mirror of Wireshark's Git repository at https://gitlab.com/wireshark/wireshark. ⚠️ GitHub won't let us disable pull requests. ⚠️ THEY WILL BE IGNORED HERE ⚠️ Upload them at GitLab instead.
beetletrellen1/story_generation