niderhoff/nlp-datasets
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
Issues
- 0
Twitter links broken
#33 opened by Aatmaj-Zephyr
- 1
Corporate messaging
#19 opened by benob
- 1
Gutenberg dataset
#18 opened by gokceneraslan
- 1
Del.icio.us dataset returns 404
#20 opened by nan-wang
- 2
Political Social Media data set now points to paid service Appen instead of Crowdflower
#29 opened by dyerdave-cvs
- 0
entities: people, businesses, etc.
#22 opened by az0
- 2
dataset with people's dialogues
#14 opened by jasperDD
- 1
- 2
Multiple Historical News Headlines Datasets
#4 opened by therohk
- 1
You might want to include
#3 opened by mpkuse
- 2
Hate speech identification
#1 opened by t-davidson