A series of natural language processing notebooks and model implementations covering tasks that range from low-level (stemming, lemmatization, POS tagging) to high-level (sentiment analysis, summarization). The models range from rule-based approaches through traditional ML models to RNNs.
- Stemming and lemmatization (in progress)
- POS tagging and Named Entity Recognition (in progress)
- Word Embeddings (in progress)
- Language models (in progress)
- Syntactic Parsing (Dependency and CFG Parsing) (in progress)
- Sentiment and Topic Modeling (in progress)
- Summarization (in progress)
- Averaged Perceptron POS tagger
- Latent Dirichlet Allocation via Collapsed Gibbs Sampler
- word2vec with negative sampling, subsampling and adjustable context windows
- LSTM + Linear Chain CRF for named entity recognition
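
As a rough illustration of the word2vec item above, here is a minimal sketch of its three named ingredients: frequent-word subsampling, the negative-sampling distribution, and an adjustable context window. It is not the code from this repository. It assumes the formulation from the original word2vec paper (keep probability `sqrt(t / f)` and unigram counts raised to the power 0.75), and the function names `keep_probability`, `negative_sampling_weights`, and `context_pairs` are hypothetical.

```python
import math
import random
from collections import Counter


def keep_probability(freq, t=1e-5):
    """Probability of keeping a token whose relative corpus frequency is `freq`.

    Assumes the subsampling rule from the word2vec paper: discard with
    probability 1 - sqrt(t / freq), so rare words are always kept.
    """
    return min(1.0, math.sqrt(t / freq))


def negative_sampling_weights(counts, power=0.75):
    """Raise unigram counts to `power` and normalize to a distribution.

    A power below 1 flattens the distribution, so frequent words are
    drawn as negatives less often than their raw frequency suggests.
    """
    scaled = {word: count ** power for word, count in counts.items()}
    total = sum(scaled.values())
    return {word: value / total for word, value in scaled.items()}


def context_pairs(tokens, window=2):
    """Yield (center, context) pairs within `window` positions on each side."""
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


# Toy corpus, for illustration only.
tokens = "the cat sat on the mat".split()
counts = Counter(tokens)
weights = negative_sampling_weights(counts)
pairs = list(context_pairs(tokens, window=1))

# Draw five negative samples according to the smoothed unigram distribution.
negatives = random.choices(list(weights), weights=list(weights.values()), k=5)
```

Changing `window` controls how many neighbours each center word is paired with. A full implementation would also train the embedding matrices on these pairs, which this sketch leaves out.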
- spaCy: for abstracting away low-level tasks in higher-level ones
- scikit-learn: for implementing feature extraction and machine learning models
- tensorflow: for implementing deep learning models
- keras: for higher-level deep learning model implementation
- matplotlib: for visualizations