A collection of notebooks and model implementations for various NLP tasks

Primary language: Python

Enelpy

A series of natural language processing notebooks and model implementations. The tasks covered range from low-level ones, such as stemming, lemmatization, and POS tagging, to high-level ones, such as sentiment analysis and summarization. The notebooks introduce a variety of models for these tasks, from rule-based systems to traditional ML models to RNNs.

Notebooks

  • Stemming and lemmatization (in progress)
  • POS tagging and Named Entity Recognition (in progress)
  • Word Embeddings (in progress)
  • Language models (in progress)
  • Syntactic Parsing (Dependency and CFG Parsing) (in progress)
  • Sentiment and Topic Modeling (in progress)
  • Summarization (in progress)

Models Implemented

  • Averaged Perceptron POS tagger
  • Latent Dirichlet Allocation via Collapsed Gibbs Sampler
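
As a rough illustration of the first item, the sketch below shows one common way to build an averaged perceptron tagger. It is not this repository's code: the class `AveragedPerceptron`, the helpers `extract_features`, `train`, and `tag`, the feature templates, and the toy sentences are all assumptions made for the example.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Multiclass perceptron over sparse binary features, with weight averaging."""

    def __init__(self, classes):
        self.classes = sorted(classes)
        self.weights = {}                  # feature -> {class: weight}
        self._totals = defaultdict(float)  # (feature, class) -> accumulated weight
        self._stamps = defaultdict(int)    # (feature, class) -> step of last change
        self._step = 0

    def predict(self, features):
        scores = {c: 0.0 for c in self.classes}
        for feat in features:
            for cls, weight in self.weights.get(feat, {}).items():
                scores[cls] += weight
        # Ties are broken deterministically by class name.
        return max(self.classes, key=lambda c: (scores[c], c))

    def update(self, truth, guess, features):
        self._step += 1
        if truth == guess:
            return
        for feat in features:
            self._bump(feat, truth, 1.0)
            self._bump(feat, guess, -1.0)

    def _bump(self, feat, cls, delta):
        key = (feat, cls)
        current = self.weights.setdefault(feat, {}).get(cls, 0.0)
        # Credit the old weight for every step it stayed unchanged.
        self._totals[key] += (self._step - self._stamps[key]) * current
        self._stamps[key] = self._step
        self.weights[feat][cls] = current + delta

    def average(self):
        """Replace each weight by its mean value over all training steps."""
        if self._step == 0:
            return
        for feat, by_class in self.weights.items():
            for cls, weight in by_class.items():
                key = (feat, cls)
                total = self._totals[key] + (self._step - self._stamps[key]) * weight
                by_class[cls] = total / self._step


def extract_features(words, i, prev_tag):
    """Hypothetical feature templates; a real tagger would use many more."""
    word = words[i].lower()
    return {"bias", "word=" + word, "suffix=" + word[-2:], "prev_tag=" + prev_tag}


def train(sentences, epochs=5):
    classes = {tag for sent in sentences for _, tag in sent}
    model = AveragedPerceptron(classes)
    for _ in range(epochs):
        for sent in sentences:
            words = [w for w, _ in sent]
            prev = "<s>"
            for i, (_, gold) in enumerate(sent):
                feats = extract_features(words, i, prev)
                guess = model.predict(feats)
                model.update(gold, guess, feats)
                prev = guess  # condition on the predicted tag, as at test time
    model.average()
    return model


def tag(model, words):
    prev, tags = "<s>", []
    for i in range(len(words)):
        prev = model.predict(extract_features(words, i, prev))
        tags.append(prev)
    return tags


# Toy data, invented for this example only.
TOY = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]

if __name__ == "__main__":
    print(tag(train(TOY), ["the", "cat", "barks"]))
```

Averaging is what distinguishes this from a plain perceptron: the final weights are the mean of the weights over every training step, which tends to make the tagger less sensitive to the last few examples it saw. The timestamp bookkeeping avoids touching every weight on every step.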

Neural Models

  • word2vec with negative sampling, subsampling and adjustable context windows
  • LSTM + Linear Chain CRF for named entity recognition
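
The three word2vec ingredients named above (negative sampling, subsampling, and an adjustable context window) can be sketched in plain Python. This is a minimal sketch of the data-preparation side only, not the repository's implementation: every function name here is hypothetical, the subsampling rule and the 0.75 exponent follow the original word2vec paper, and the default threshold is an assumption.

```python
import math
import random
from collections import Counter


def keep_probability(freq, threshold=1e-5):
    """Chance of keeping a word with relative frequency `freq`.

    The word2vec paper discards a word with probability 1 - sqrt(t / f),
    so frequent words are dropped more often.
    """
    return min(1.0, math.sqrt(threshold / freq))


def subsample(tokens, threshold=1e-5, rng=random):
    """Randomly drop frequent tokens before building training pairs."""
    counts = Counter(tokens)
    n = len(tokens)
    return [t for t in tokens
            if rng.random() < keep_probability(counts[t] / n, threshold)]


def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs within `window` positions of each token."""
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


def negative_sampling_weights(counts, power=0.75):
    """Unigram distribution raised to `power` and renormalised."""
    total = sum(c ** power for c in counts.values())
    return {w: (c ** power) / total for w, c in counts.items()}


def sample_negatives(weights, k, exclude=(), rng=random):
    """Draw k negative words, skipping anything listed in `exclude`."""
    words = [w for w in weights if w not in exclude]
    return rng.choices(words, weights=[weights[w] for w in words], k=k)


if __name__ == "__main__":
    corpus = "the quick brown fox jumps over the lazy dog".split()
    weights = negative_sampling_weights(Counter(corpus))
    for center, context in skipgram_pairs(corpus, window=1):
        negatives = sample_negatives(weights, k=2, exclude={center, context})
        print(center, context, negatives)
```

Raising counts to the 0.75 power flattens the unigram distribution, so rare words are drawn as negatives somewhat more often than their raw frequency would suggest. In a full model, each (center, context) pair and its negatives would feed a logistic loss over embedding dot products.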

Libraries and tools used

  • spaCy - for abstracting away low-level tasks within higher-level ones
  • scikit-learn - for implementing feature extraction and machine learning models
  • TensorFlow - for implementing deep learning models
  • Keras - for higher-level deep learning model implementations
  • matplotlib - for visualizations