This repository contains various resources for NLP projects.
This is a Jupyter notebook with an example of sentiment analysis. It also serves as an introductory tutorial on the Naive Bayes algorithm. The text in the notebook contains several typos; a corrected version of the text can be found here.
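As a rough illustration of the idea behind the tutorial, here is a minimal sketch of a multinomial Naive Bayes sentiment classifier with add-one (Laplace) smoothing, written in plain Python. It is not the notebook's code: the function names `train_naive_bayes` and `predict` and the toy reviews are assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior score for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, doc_count in label_counts.items():
        # Log prior: share of training documents carrying this label.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, label) pairs above zero.
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data, invented for illustration only.
toy_reviews = [
    ("great movie loved it", "positive"),
    ("wonderful acting and great story", "positive"),
    ("terrible plot hated it", "negative"),
    ("boring and terrible acting", "negative"),
]
model = train_naive_bayes(toy_reviews)
print(predict(model, "loved the great story"))  # positive
print(predict(model, "boring plot"))            # negative
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together. For real data, a library implementation is likely a better starting point than this sketch.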
This notebook sets up a basic pipeline for extracting information from images. Tesseract is used to extract the text from an image, while the NER pipelines are set up using the Transformers pipeline and spaCy. An extended version of this notebook is NER_FromImages2. A detailed version along with data can also be found here.
This notebook contains code to extract text data from XML files and then apply named entity recognition to find the date, company name, invoice number, etc. using a spaCy pipeline. A detailed version along with data can also be found here. The data used in the notebook can be found on Kaggle here.
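The first step, pulling plain text out of an XML file, can be sketched with the standard library alone. The invoice layout, the tag names, and the helper `extract_text` below are hypothetical and are not taken from the notebook or the Kaggle data. The entity recognition step is left out; the resulting string is what would then be passed to a spaCy pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice layout, invented for illustration only.
SAMPLE_XML = """<invoice>
  <header>
    <company>Acme Corp</company>
    <number>INV-001</number>
    <date>2021-03-15</date>
  </header>
  <line><description>Widget</description><amount>19.99</amount></line>
</invoice>"""


def extract_text(xml_string):
    """Join all non-empty text nodes of the document, in document order."""
    root = ET.fromstring(xml_string)
    pieces = (chunk.strip() for chunk in root.itertext())
    return " ".join(piece for piece in pieces if piece)


text = extract_text(SAMPLE_XML)
print(text)  # Acme Corp INV-001 2021-03-15 Widget 19.99
```

Flattening the tree this way discards the tag names, which suits the case where the entities have to be recovered from raw text. If the tags in a given dataset are reliable, reading the fields directly from the elements may be simpler.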