NLP_Basics: A Python repository from suneelkumarpentela

This repository contains the scripts needed to extract words from PDF profiles downloaded from LinkedIn and to filter the frequent and essential words out of them.

To use this repository, follow the steps below.

******************** EXECUTION ***********************

The profiles of 50 random people were manually downloaded and are stored in the "Linkedin_Profiles" folder.
Execution of "text_extraction.py" generates "output.txt" and "output1.csv"

output.txt : Text file containing the text extracted from one PDF profile.

output1.csv : CSV file containing the data of all 50 profiles under the "LinkedIn Profiles" label.
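
As a rough illustration of the "output1.csv" layout described above, the sketch below writes one profile per row under the "LinkedIn Profiles" label. It is not the code in "text_extraction.py": the function name `write_profiles_csv` is hypothetical, and it assumes the text of each PDF has already been extracted into a Python string by whatever PDF library the script uses.

```python
import csv
from pathlib import Path


def write_profiles_csv(texts, csv_path):
    """Write one extracted profile text per row under a single column.

    `texts` is assumed to be a list of strings, one per PDF profile,
    already extracted elsewhere (hypothetical input, not the repo's API).
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Column label as named in this README.
        writer.writerow(["LinkedIn Profiles"])
        for text in texts:
            # Collapse runs of whitespace so each profile stays on one line.
            writer.writerow([" ".join(text.split())])
    return Path(csv_path)
```

Calling `write_profiles_csv(profile_texts, "output1.csv")` with a list of 50 strings would produce a header row followed by 50 data rows.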

Execution of "frequent_words.py" generates "output2.csv"

output2.csv : CSV file with two columns: the profile data and its frequent words.
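
One simple way to obtain frequent words from a profile's text is to count word occurrences and keep the most common ones. The sketch below shows that idea only; the function name `frequent_words`, the `top_n` parameter, and the regex tokenizer are assumptions, and "frequent_words.py" may use a different method.

```python
import re
from collections import Counter


def frequent_words(text, top_n=10):
    """Return the top_n most common lowercase alphabetic words in text."""
    # Assumed tokenizer: lowercase the text and keep runs of letters a-z.
    words = re.findall(r"[a-z]+", text.lower())
    # most_common sorts by count; ties keep first-seen order.
    return [word for word, _ in Counter(words).most_common(top_n)]
```

For example, `frequent_words("python data python sql data python", top_n=2)` returns `["python", "data"]`.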

Execution of "essential_words.py" generates "output3.csv"

output3.csv : CSV file with three columns: the profile data, the frequent words, and the essential words.
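
This README does not define how essential words are chosen. The sketch below assumes, purely for illustration, that they are the frequent words left after dropping common stopwords and very short tokens. The names `STOPWORDS`, `essential_words`, and `min_length` are hypothetical, and the stopword set is a tiny sample rather than the list "essential_words.py" uses.

```python
# Illustrative stopword sample only; a real list would be much longer.
STOPWORDS = {"a", "an", "and", "at", "in", "of", "the", "to", "with"}


def essential_words(words, stopwords=STOPWORDS, min_length=3):
    """Filter a word list down to non-stopwords of at least min_length letters.

    Order is preserved and duplicates are dropped. This is an assumed
    definition of "essential", not the repository's actual rule.
    """
    seen = set()
    kept = []
    for word in words:
        lowered = word.lower()
        if lowered in stopwords or len(lowered) < min_length or lowered in seen:
            continue
        seen.add(lowered)
        kept.append(lowered)
    return kept
```

For example, `essential_words(["the", "python", "and", "python", "at", "sql"])` returns `["python", "sql"]`.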
