Personality and Stylistic Scoring of Retrieved Utterances

This project personalizes the utterances of a conversational agent (SlugBot for UCSC).

Work done

Extracted utterances of characters from Friends and The Big Bang Theory.
Extracted LIWC, n-gram, POS features from utterances.
Trained and tested a classification model on the extracted features.

Prerequisites

For now, you only need NLTK to run the code. Please use Python 3.5+.

How to run the code (a detailed guide will be provided later)

Clone the repository to your machine.

The file clean.py cleans the original dataset provided. Its results are stored in one file per character, named Chandler_all.txt, Ross_all.txt, etc. You can run it with:

py clean.py

The other files available now for feature extraction are extract_pos_bigrams.py (for extracting POS bigrams)
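To illustrate what extracting POS bigrams means, here is a minimal sketch. It is not the code in extract_pos_bigrams.py; the function name pos_bigram_counts and the hand-tagged sample utterance are assumptions made for this example. The input is a list of (word, tag) pairs, the shape that NLTK's pos_tag returns, so the sketch itself runs without NLTK or its tagger data.

```python
from collections import Counter


def pos_bigram_counts(tagged_tokens):
    """Count adjacent POS-tag pairs in one tagged utterance.

    tagged_tokens: list of (word, tag) pairs, e.g. the output of nltk.pos_tag.
    Hypothetical helper for illustration, not part of the repository.
    """
    tags = [tag for _word, tag in tagged_tokens]
    # Pair every tag with the tag that follows it.
    return Counter(zip(tags, tags[1:]))


# Hand-tagged sample (assumed tags, Penn Treebank style).
tagged = [("Could", "MD"), ("I", "PRP"), ("be", "VB"),
          ("more", "RBR"), ("sorry", "JJ")]
counts = pos_bigram_counts(tagged)

for bigram, n in counts.items():
    print(bigram, n)
```

In practice the tagged pairs would likely come from tokenizing and tagging each line of a character file such as Chandler_all.txt, and the per-utterance counts could then serve as features for the classifier.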
