nlp-project

Primary language: Jupyter Notebook

Author Identification for Horror Literature

Stylometry: Contains the stylometry visualization techniques

Brown: Contains the early attempt to apply the Brown clustering algorithm

Data: Contains the train and test splits used by our models

LSTM: Contains the recurrent neural network (LSTM) with different embeddings

Naive Bayes: Contains the Naive Bayes models with LIME applied. Training is fast, so saved models are not included. Be sure to put the data sets into the same directory as the notebook before running all cells.

Running instructions

The notebooks already include a requirement.txt and are set to run in testing mode. To run the tests, simply run all cells in the desired notebook. To train instead, change the boolean do_train at the beginning of the notebook.
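
As an illustration of the do_train switch described above, here is a minimal sketch of how such a flag can gate training versus loading a saved model. Only the name do_train comes from the notebooks. The helpers train_model and get_model, the JSON model file, and the assumption that do_train = True means "train" are all hypothetical and are not the notebooks' actual code.

```python
import json
from pathlib import Path

# Flag named in the README. Assumption: False selects testing mode (the default).
do_train = False


def train_model(samples):
    """Hypothetical stand-in for training: count how often each label occurs."""
    counts = {}
    for _text, label in samples:
        counts[label] = counts.get(label, 0) + 1
    return counts


def get_model(do_train, samples, model_path):
    """Train and save the model when do_train is True, otherwise load it from disk."""
    model_path = Path(model_path)
    if do_train:
        model = train_model(samples)
        model_path.write_text(json.dumps(model))
        return model
    # Testing mode: reuse the previously saved model instead of retraining.
    return json.loads(model_path.read_text())
```

In a notebook, the flag would sit in the first cell so that "run all cells" picks up the chosen mode without further edits.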