Plagiarism Detection Application

Description

The app is built on Streamlit and is designed for smooth operation and a pleasant user experience.

Its purpose is to let researchers who are interested in the impostors’ method for authorship analysis, and plagiarism detection specifically, put their assumptions and suspicions to the test. With the system, researchers can analyze results and measure how well the method performs on the task. The GUI makes it easy to change parameters and visualize the results. Researchers can also use the system to obtain predictions and prove the innocence of their favorite authors.

Install guide

To run the application, follow these steps:

Confirm that the latest Anaconda is installed and that a clean Python 3.9 environment (configured with Anaconda) is available.
From the root folder of the project, using any CLI, activate the environment and run setup.py --install. This does most of the work.
If all the packages were installed successfully, you can run the application with streamlit run app.py from the same CLI.
The output shows the address where the application is available. By default it is http://localhost:8501, but the port may differ.
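
If the application fails to start, it can help to confirm that the active environment matches the steps above. The following is a minimal sketch, not part of the project: the helper name check_environment is hypothetical, and it only checks the Python version and whether Streamlit can be imported, assuming those are the most likely points of failure.

```python
import importlib.util
import sys


def check_environment(required=(3, 9), packages=("streamlit",)):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper: `required` and `packages` are illustrative defaults
    taken from the install guide, not values read from the project itself.
    """
    problems = []
    current = tuple(sys.version_info[:2])
    if current != tuple(required):
        problems.append(
            f"Python {required[0]}.{required[1]} expected, "
            f"found {current[0]}.{current[1]}"
        )
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"package '{name}' is not installed")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment looks fine." if not issues else "\n".join(issues))
```

Running the script inside the activated environment prints either a confirmation or one line per problem found.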

User instructions

Once you access the main page, you can experiment in the name of science. Follow these steps to operate the system:

Open the sidebar if it is not already open (arrow at the top left of the screen).
Point the system to the data by providing the full path to the data location. The data location is expected to be a folder containing sub-folders, one for each author. Each sub-folder should contain that author’s creations as files with the .txt extension.
Choose the impostors for the experiment by name. Multiple choices are allowed: for example, when you choose 2 authors for First Impostors and 2 authors for Second Impostors, they are paired according to the order in which you chose them.
Choose the “Author under test” and the “Creation under test” written by that author. Note that for the algorithm to function properly, the author in question should have more than one creation in their folder.
Before you run the experiment, you can tune the Neural Network hyperparameters at the bottom of the sidebar to obtain more accurate results.
Now run “Analyze Authorship” and wait for the results.
After a while, you will see the results as bar plots together with our approximate prediction, which make it easier to analyze the authorship of the “Creation under test”.
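
The data layout and selection rules described in the steps above can be checked before launching an experiment. The sketch below is illustrative only and is not taken from the application: the function names scan_corpus, eligible_test_authors and pair_impostors are hypothetical, and requiring both impostor lists to have the same length is an assumption made here so that pairing by order is well defined.

```python
from pathlib import Path


def scan_corpus(root):
    """Map each author sub-folder name to the sorted .txt file names inside it."""
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a folder")
    corpus = {}
    for author_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        corpus[author_dir.name] = sorted(
            f.name
            for f in author_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".txt"
        )
    return corpus


def eligible_test_authors(corpus):
    """Authors with more than one creation, as needed for the author under test."""
    return [name for name, texts in corpus.items() if len(texts) > 1]


def pair_impostors(first, second):
    """Pair impostors by selection order: first[0] with second[0], and so on.

    Assumption: both lists must be the same length; the application may
    handle unequal selections differently.
    """
    if len(first) != len(second):
        raise ValueError("both impostor lists need the same number of authors")
    return list(zip(first, second))
```

For example, pair_impostors(["A", "B"], ["C", "D"]) pairs A with C and B with D, which mirrors the order-based pairing described for First Impostors and Second Impostors.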

kmaltcev/NLP-BiLSTM-model
