/ner_nltk_spacy

Primary language: Jupyter Notebook

NER Task:

This repository introduces spaCy and compares it with NLTK processing on sample text. The file spacy_check_eg_1 shows how to import spaCy and use its default models. Here a "model" is a pretrained spaCy pipeline that already bundles tokenization, POS tagging, chunking, named entity tagging, and so on, so it can be run on sample text without any training.
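A minimal sketch of what such a sample run might look like. The helper name `analyze` is hypothetical and not taken from the notebook; it takes an already loaded pipeline so that any spaCy model can be passed in.

```python
import spacy


def analyze(nlp, text):
    """Run a spaCy pipeline on text and collect tokens, POS tags and entities.

    `analyze` is an illustrative helper, not part of spaCy or of this repo.
    """
    doc = nlp(text)
    # tok.pos_ is the coarse part-of-speech tag; it is an empty string
    # when the pipeline has no tagger component.
    tokens = [(tok.text, tok.pos_) for tok in doc]
    # doc.ents holds the named entity spans found by the NER component.
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities
```

With a pretrained pipeline this would be called as `analyze(spacy.load("en_core_web_sm"), "Apple is opening an office in Paris.")`. The model name `en_core_web_sm` is an assumption here; it has to be installed first with `python -m spacy download en_core_web_sm`.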

NLTK is a basic NLP library and can still be used, but spaCy is newer and more dynamic, and it implements recent deep learning models.
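For comparison, the same kind of entity extraction with NLTK could be sketched as below. Both function names are hypothetical. The sketch assumes the NLTK data packages for the tokenizer, the POS tagger and the NE chunker have already been fetched with `nltk.download`; the exact package names depend on the NLTK version.

```python
import nltk
from nltk.tree import Tree


def entities_from_tree(tree):
    """Collect (entity text, label) pairs from an ne_chunk-style tree."""
    entities = []
    for node in tree:
        # Named entities are subtrees; plain tokens are (word, tag) tuples.
        if isinstance(node, Tree):
            words = " ".join(word for word, _tag in node.leaves())
            entities.append((words, node.label()))
    return entities


def nltk_entities(text):
    """Tokenize, POS-tag and NE-chunk text with NLTK's stock models."""
    # Assumption: the required NLTK data packages are already downloaded.
    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)
    return entities_from_tree(nltk.ne_chunk(tagged))
```

Note that NLTK chains three separate calls and returns a tree that has to be walked by hand, while spaCy exposes the entities directly on the processed document.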

The file spacy_news_ner extends this by using the bs4 (BeautifulSoup) library to extract text from crawled HTML content. It also uses displaCy, spaCy's built-in visualizer (imported as "displacy"), which renders the text inline in a Jupyter notebook with the entities highlighted and labeled. It can also draw the dependency graph for a sentence. displaCy visualizations are easy to use and intuitive.
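A rough sketch of that flow, assuming the HTML has already been downloaded into a string. The helper names `html_to_text`, `show_entities` and `show_dependencies` are hypothetical and not taken from the notebook.

```python
from bs4 import BeautifulSoup
from spacy import displacy


def html_to_text(html):
    """Strip markup from an HTML string and return whitespace-normalized text."""
    soup = BeautifulSoup(html, "html.parser")
    # Script and style blocks carry no readable text, so drop them first.
    for tag in soup(["script", "style"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())


def show_entities(nlp, text):
    """Highlight named entities inline in a Jupyter notebook."""
    # nlp is assumed to be a pipeline with an NER component.
    displacy.render(nlp(text), style="ent", jupyter=True)


def show_dependencies(nlp, sentence):
    """Draw the dependency graph of one sentence in a Jupyter notebook."""
    # nlp is assumed to be a pipeline with a dependency parser.
    displacy.render(nlp(sentence), style="dep", jupyter=True)
```

In a notebook, `show_entities(nlp, html_to_text(page_html))` would display the article text with colored entity labels, where `nlp` is a loaded pipeline and `page_html` is a placeholder for the crawled page.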