Advanced Natural Language Processing Techniques to Profile Cybercriminals

❗ This is my thesis project book, written in Spanish.

Abstract

This thesis first identifies commonly used Data Science methodologies for analyzing structured and unstructured data. It then surveys the state of the art in cybercriminal profiling and data classification. Finally, it proposes three Natural Language Processing models to aid State security agents in their work against crime.

The whole compiled book is in the main.pdf file.

Prerequisites for compiling

To compile this project you can either clone it from the Overleaf page or build it locally. For a local build, the easiest way to get all the LaTeX dependencies is the TeX Live Full installation (a download of about 2.5 GB).

On a Debian-based distribution, run:

$ sudo apt-get install texlive-full

Compiling

I have written a script that runs all the steps needed to compile the LaTeX files, removing leftover build files before compiling:

$ sh make.sh

If you only want to remove the auxiliary files produced by LaTeX, run:

$ sh clean.sh
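
For readers curious about what this kind of cleanup involves, here is a minimal sketch. It is not the contents of clean.sh, which may differ. The function name clean_latex_aux and the list of extensions are assumptions, chosen as typical by-products of a LaTeX and BibTeX build.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the project's actual clean.sh.
# Deletes common LaTeX auxiliary files under a directory (default: current one),
# leaving sources (.tex, .bib) and the compiled PDF untouched.

clean_latex_aux() {
    dir="${1:-.}"
    # Assumed extension list: typical intermediate files of a LaTeX/BibTeX run.
    for ext in aux log out toc lof lot bbl blg synctex.gz; do
        # find recurses into subdirectories, so chapter folders are cleaned too.
        find "$dir" -type f -name "*.$ext" -exec rm -f {} +
    done
}
```

Calling clean_latex_aux with no argument from the project root cleans the whole tree. Passing a path limits the cleanup to that directory.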

Author

Alejandro Anzola, Computer Science student

Escuela Colombiana de Ingenieria Julio Garavito

Bogota, Colombia