Collection of links, reference papers and sample code for studying natural language processing (aka computational linguistics).
Proposal: Create a daemonized Linux process that scrapes Twitter and Quora data and stores it in a PostgreSQL database. Perform Bayesian sentiment analysis on the two data sources.
P(TS | QS) = P(QS | TS) P(TS) / P(QS), where TS = TwitterSentiment and QS = QuoraSentiment
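
As a minimal sketch of the Bayes' rule relation above, the Python snippet below computes P(TS | QS) from P(QS | TS), P(TS) and P(QS). The function name `posterior` and all probability values are hypothetical, chosen only for illustration. They are not measurements from Twitter or Quora. It also assumes sentiment is reduced to a binary event (e.g. "positive" vs. "not positive") so that P(QS) can be obtained from the law of total probability.

```python
def posterior(p_qs_given_ts, p_ts, p_qs):
    """Bayes' rule: P(TS | QS) = P(QS | TS) * P(TS) / P(QS)."""
    if not 0.0 < p_qs <= 1.0:
        raise ValueError("P(QS) must be in (0, 1]")
    return p_qs_given_ts * p_ts / p_qs


# Hypothetical inputs (assumptions, not real data):
p_ts = 0.4               # P(TS): Twitter sentiment on a topic is positive
p_qs_given_ts = 0.7      # P(QS | TS): Quora positive given Twitter positive
p_qs_given_not_ts = 0.2  # P(QS | not TS): Quora positive given Twitter not positive

# Law of total probability gives the evidence term P(QS).
p_qs = p_qs_given_ts * p_ts + p_qs_given_not_ts * (1.0 - p_ts)

p_ts_given_qs = posterior(p_qs_given_ts, p_ts, p_qs)
print(f"P(QS) = {p_qs:.3f}, P(TS | QS) = {p_ts_given_qs:.3f}")
```

In a real pipeline the three inputs would be estimated from labeled counts in the stored data rather than set by hand.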
- NLP Overview - Modern Deep Learning Techniques Applied to Natural Language Processing
- NLP Progress - current state-of-the-art for the most common NLP tasks
- Fuzzy C-means clustering in R
- awesome-nlp - good collection of NLP links
- Capstone_2016_us_elections - Emile Badran's final project for Thinkful Data Science.
- Apache Airflow - pipeline scheduler based on Directed Acyclic Graphs (DAGs)
- Speech and Language Processing by Dan Jurafsky and James H. Martin. Nice looking; covers n-grams, naive Bayes classifiers, sentiment, logistic regression, vector semantics, neural nets, part-of-speech tagging, sequence processing with recurrent networks, grammars, syntax, statistical parsing, information extraction, and hidden Markov models.
- CS224n: Natural Language Processing with Deep Learning - Stanford NLP course, Winter 2017
- https://en.wikipedia.org/wiki/Natural_language_processing
- https://en.wikipedia.org/wiki/Non-negative_matrix_factorization
- https://en.wikipedia.org/wiki/Graph_theory
- https://en.wikipedia.org/wiki/Bayesian_network
- https://en.wikipedia.org/wiki/Computational_linguistics
- https://en.wikipedia.org/wiki/Fuzzy_clustering