Classification of articles from two different journals, IEEE Transactions on Pattern Analysis and Machine Intelligence (Pattern file) and IEEE Transactions on Systems, Man, and Cybernetics: Systems (Systems file), through machine learning using the bag-of-words method. Workflow:
- Get the files and identify the title, paragraph, and keywords of every article separately
- Identify stop words
- Tokenize
- Lemmatize
- Classification

Classification has been done using only the title, only the paragraph, or only the keywords, and using all of them together, with Sequential Forward Floating Selection used to identify the words that best define each journal.
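
The workflow above can be sketched end to end with the standard library only. This is a minimal illustration, not the project's actual code: the four article titles, the `STOP_WORDS` set, the toy `LEMMAS` lookup (a stand-in for a real lemmatizer), and the leave-one-out nearest-neighbour score are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list and lemma lookup, kept tiny for illustration.
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "in", "on", "with", "to"}
LEMMAS = {"networks": "network", "systems": "system", "images": "image"}

def preprocess(text):
    """Tokenize, drop stop words, and map each token to its lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]

def bag_of_words(docs):
    """Return the sorted vocabulary and one word-count row per document."""
    counts = [Counter(preprocess(d)) for d in docs]
    vocab = sorted({w for c in counts for w in c})
    return vocab, [[c[w] for w in vocab] for c in counts]

def loo_accuracy(matrix, labels, features):
    """Leave-one-out 1-nearest-neighbour accuracy on the chosen columns."""
    hits = 0
    for i, row in enumerate(matrix):
        def dist(j):
            return sum((row[f] - matrix[j][f]) ** 2 for f in features)
        nearest = min((j for j in range(len(matrix)) if j != i), key=dist)
        hits += labels[nearest] == labels[i]
    return hits / len(matrix)

def sffs(score, n_features, k):
    """Sequential Forward Floating Selection of k feature indices (k <= n_features)."""
    selected, best = [], {}  # best maps subset size -> (score, subset)
    while len(selected) < k:
        # Forward step: add the feature that gives the highest score.
        candidates = [f for f in range(n_features) if f not in selected]
        f_best = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [f_best]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            f_worst = max(selected, key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != f_worst]
            s_red = score(reduced)
            if s_red <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (s_red, list(reduced))
    return best[k][1], best[k][0]

# Made-up titles standing in for the Pattern and Systems files.
docs = [
    "Deep networks for image segmentation",
    "Image recognition with convolutional networks",
    "Adaptive control of multi-agent systems",
    "Fault-tolerant control for nonlinear systems",
]
labels = ["pattern", "pattern", "systems", "systems"]

vocab, matrix = bag_of_words(docs)
chosen, best_score = sffs(lambda fs: loo_accuracy(matrix, labels, fs), len(vocab), k=3)
chosen_words = [vocab[f] for f in chosen]
print(chosen_words, best_score)
```

The same functions can be run on titles, paragraphs, or keywords alone, or on their concatenation, to compare which part of an article separates the two journals best. For real data, an established lemmatizer and classifier would normally replace the toy lookup and the nearest-neighbour score used here.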