An ML model that estimates how closely a piece of text relates to the topic it was trained on. The code is modular, which makes it easier to understand and modify. Here, the model classifies text as crime-related or not.
pip install -r requirements.txt
>>> import nltk
>>> nltk.download('stopwords')
python TEST.py
You can change datasetPath in path.py to match your dataset's filename and location.
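As a rough illustration of the kind of change meant here, the sketch below builds a dataset path and warns if the file is missing. Only the name datasetPath comes from this README; the helper build_dataset_path, the "data" folder, and the "my_dataset.csv" filename are hypothetical placeholders, and the real path.py may be organized differently.

```python
import os


def build_dataset_path(folder, filename):
    """Join a folder and a file name into a single normalized path."""
    return os.path.normpath(os.path.join(folder, filename))


# Hypothetical values: replace them with your own folder and file name.
datasetPath = build_dataset_path("data", "my_dataset.csv")

# Warn early if the dataset is not where the path points.
if not os.path.isfile(datasetPath):
    print("Dataset not found at {}".format(datasetPath))
```

Checking the path before running TEST.py can surface a wrong filename sooner than a failure deep inside the processing code.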
You can also change the data processing flow in prepareDocuments(datasetPath,documentsPath,datasetTweetsPath) in classify.py.
Note: If you switch from Python 2 to Python 3 or vice versa, you need to delete all the .pickle files from pickled_classifiers and pickled_data, then run TEST.py again.
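The cleanup described in the note can be scripted. The following is a minimal sketch, not part of the project: the function name remove_pickles is made up for illustration, and it assumes the two directories sit in the folder you run it from and that the pickle files are stored directly inside them (not in subfolders). It uses only the standard library and should run on both Python 2 and Python 3.

```python
import glob
import os

# Directory names taken from the note above; adjust if your layout differs.
PICKLE_DIRS = ("pickled_classifiers", "pickled_data")


def remove_pickles(root=".", dirs=PICKLE_DIRS):
    """Delete *.pickle files directly inside each directory; return the removed paths."""
    removed = []
    for name in dirs:
        pattern = os.path.join(root, name, "*.pickle")
        # glob returns an empty list when the directory does not exist.
        for path in sorted(glob.glob(pattern)):
            os.remove(path)
            removed.append(path)
    return removed


if __name__ == "__main__":
    for path in remove_pickles():
        print("removed {}".format(path))
```

Files with other extensions are left untouched, so anything else kept in those directories is preserved.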