Term_Document_Matrix

I worked on this during my big data class. We were given a simple text file called "abs1" and asked to separate it into sub-documents using linear regression. This produced about 865 sub-files.

Next, we had to count the number of occurrences of terms such as "Customers", "Stake Holders", and "Investors".
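The counting step could look roughly like the sketch below. It is not the code from this repository: the function name `count_terms`, the case-insensitive matching, and the whole-phrase matching rule are all assumptions made for illustration, and it takes a sub-document as a plain string.

```python
import re

# Term list taken from the examples named in this README.
TERMS = ["Customers", "Stake Holders", "Investors"]


def count_terms(text, terms=TERMS):
    """Return {term: count} for one sub-document.

    Assumption: matching is case-insensitive and on whole phrases only,
    so "customers" counts toward "Customers" but "customer" does not.
    """
    counts = {}
    for term in terms:
        # Allow any run of whitespace between the words of a multi-word term.
        words = [re.escape(word) for word in term.split()]
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts
```

Calling `count_terms` once per sub-document gives one row of the term-document matrix.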

The program prints the number of occurrences of these terms for each sub-document separately. Running the code also creates an Excel file with the same results laid out in a readable format, which makes them easier to view and to reuse later, for example in graphs. I have also added the Excel file to the repository.
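A minimal sketch of the export step is shown below. It is an assumption-heavy illustration rather than the repository's code: it writes a CSV file (which Excel can open) using only the standard library instead of a native Excel workbook, and the function name `write_term_document_matrix`, the column layout, and the sample counts are all hypothetical.

```python
import csv


def write_term_document_matrix(rows, terms, path):
    """Write one row per sub-document and one column per term to a CSV file.

    rows is a list of (document_name, {term: count}) pairs; terms missing
    from a document's dictionary are written as 0.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["document"] + list(terms))
        for name, counts in rows:
            writer.writerow([name] + [counts.get(term, 0) for term in terms])


# Made-up counts for illustration only, not results from "abs1".
example_terms = ["Customers", "Stake Holders", "Investors"]
example_rows = [
    ("doc_001", {"Customers": 2, "Investors": 1}),
    ("doc_002", {"Stake Holders": 3}),
]
```

Calling `write_term_document_matrix(example_rows, example_terms, "term_document_matrix.csv")` would produce a three-line file: a header row followed by one row per sub-document.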

Language used: Python