Interview-Chatbot

An interview chatbot powered by deep learning and trained on a dataset of interview questions from the computer science domain. It is built with TensorFlow v1.11.0, TensorLayer v1.11.1, and Python v3.6, and the model was trained on an NVIDIA DGX-1 V100.

You can download the dataset from here. Here is a sample chat transcript. The bot replies with "Out of Context Question" whenever the user asks a question from a different domain.


Pretty good response, right?

Usage

Step 1: Install required libraries
Step 2: Clone the project
Step 3: Train the model: python main.py --batch-size 32 --num-epochs 1000 -lr 0.001
Step 4: Run the model: python app.py
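
The training flags in Step 3 can be read with a small command-line parser. The sketch below is only an illustration using Python's standard argparse module; the parser layout, the long name --learning-rate, and the defaults are assumptions, and the real main.py may define its options differently.

```python
import argparse

def build_parser():
    # Hypothetical parser: option names mirror the command in Step 3,
    # but the actual main.py may be organised differently.
    parser = argparse.ArgumentParser(description="Train the interview chatbot")
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--num-epochs", type=int, default=1000)
    # "--learning-rate" is an assumed long form of the "-lr" flag.
    parser.add_argument("-lr", "--learning-rate", type=float, default=0.001)
    return parser

# Parse the same arguments that appear in the Step 3 command.
args = build_parser().parse_args(
    ["--batch-size", "32", "--num-epochs", "1000", "-lr", "0.001"]
)
print(args.batch_size, args.num_epochs, args.learning_rate)
```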

You can download the trained model from here.

Methodology

  1. Prepare the Dataset:
    First, we prepared a dataset of questions and answers for subjects such as Data Structures, Algorithms, and Operating Systems. The better the dataset, the more accurate and efficient the conversational results.

  2. Pre-Processing:
  • Lowercase all characters and remove unwanted characters such as -, #, or $.
  • Filter the dataset by maximum question length and maximum answer length. Here we use 20 for both qmax and amax.
  • Tokenization and vectorization
  • Add zero padding
  • Split into train, validation, and test data
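
The pre-processing steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the helper names, the special tokens <pad> and <unk>, the whitespace tokenizer, and the 70/15/15 split are all assumptions. Only the length limit of 20 comes from the text.

```python
import random
import re

QMAX = AMAX = 20                 # max question / answer length in tokens (from the text)
PAD, UNK = "<pad>", "<unk>"      # assumed special tokens; PAD takes index 0

def clean(text):
    # Lowercase, then drop everything except letters, digits and spaces.
    return re.sub(r"[^a-z0-9 ]", "", text.lower())

def filter_pairs(pairs):
    # Tokenize on whitespace and drop pairs that exceed the length limits.
    kept = []
    for question, answer in pairs:
        q_tok, a_tok = clean(question).split(), clean(answer).split()
        if 0 < len(q_tok) <= QMAX and 0 < len(a_tok) <= AMAX:
            kept.append((q_tok, a_tok))
    return kept

def build_vocab(token_pairs):
    # Index 0 is reserved for padding, index 1 for unknown words.
    words = sorted({w for q, a in token_pairs for w in q + a})
    idx2w = [PAD, UNK] + words
    return idx2w, {w: i for i, w in enumerate(idx2w)}

def vectorize(tokens, w2idx, maxlen):
    # Map tokens to ids and right-pad with zeros up to maxlen.
    ids = [w2idx.get(w, w2idx[UNK]) for w in tokens]
    return ids + [0] * (maxlen - len(ids))

def split(rows, train_pct=70, val_pct=15, seed=0):
    # Shuffle a copy, then cut into train / validation / test (assumed ratios).
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * train_pct // 100
    n_val = len(rows) * val_pct // 100
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

# Tiny made-up examples, only to exercise the functions.
pairs = [
    ("What is a Stack?", "A LIFO data structure."),
    ("Define OS", "Software that manages hardware #resources"),
]
tokens = filter_pairs(pairs)
idx2w, w2idx = build_vocab(tokens)
rows = [(vectorize(q, w2idx, QMAX), vectorize(a, w2idx, AMAX)) for q, a in tokens]
train, val, test = split(rows * 10)
```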

  3. Creation of LSTM, Encoder and Decoder Model:
    LSTMs are a special kind of RNN capable of learning long-term dependencies. The encoder-decoder model has two parts: the encoder takes the vector representation of the input sequence and maps it to an encoded representation, which the decoder then uses to generate the output.
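
To make the encoder-decoder idea concrete, here is a toy NumPy sketch of one LSTM step, an encoder that folds a token sequence into a final state, and a decoder that starts from that state. It is not the TensorLayer model used in this project: the weights are random and untrained, and the greedy decoding rule, the start/end token ids, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    # W has shape (input_dim + hidden_dim, 4 * hidden_dim); b has shape (4 * hidden_dim,).
    z = np.concatenate([x, h]) @ W + b
    i, f, o, g = np.split(z, 4)                       # input, forget, output gates and candidate
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update the cell state
    h_new = sigmoid(o) * np.tanh(c_new)               # expose part of it as the hidden state
    return h_new, c_new

def encode(token_ids, embedding, W, b):
    # Run the LSTM over the question; the final (h, c) is the encoded representation.
    hidden = W.shape[1] // 4
    h, c = np.zeros(hidden), np.zeros(hidden)
    for t in token_ids:
        h, c = lstm_step(embedding[t], h, c, W, b)
    return h, c

def decode_greedy(h, c, embedding, W, b, W_out, start_id, end_id, max_len=20):
    # Start from the encoder state and repeatedly pick the most likely next token.
    reply, token = [], start_id
    for _ in range(max_len):
        h, c = lstm_step(embedding[token], h, c, W, b)
        token = int(np.argmax(h @ W_out))
        if token == end_id:
            break
        reply.append(token)
    return reply

# Random, untrained weights: the reply is meaningless, only the data flow matters.
rng = np.random.default_rng(0)
vocab, emb_dim, hidden = 12, 8, 16
embedding = rng.normal(scale=0.1, size=(vocab, emb_dim))
W_enc = rng.normal(scale=0.1, size=(emb_dim + hidden, 4 * hidden))
W_dec = rng.normal(scale=0.1, size=(emb_dim + hidden, 4 * hidden))
b_enc, b_dec = np.zeros(4 * hidden), np.zeros(4 * hidden)
W_out = rng.normal(size=(hidden, vocab))

h, c = encode([3, 5, 7], embedding, W_enc, b_enc)
reply = decode_greedy(h, c, embedding, W_dec, b_dec, W_out, start_id=1, end_id=2)
```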

  4. Train and Save Model:
    We trained the model for 1000 epochs with a batch size of 32 and a learning rate of 0.001. The word embedding size was set to 1024, the loss function was categorical cross-entropy, and the optimizer was AdamOptimizer. We got the best results with these parameters. We trained and tested our model on an NVIDIA DGX-1 V100. Training accuracy was approximately 99% and validation accuracy was about 80%.
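
As a rough illustration of the loss mentioned above, the NumPy sketch below computes categorical cross-entropy over one answer sequence, along with a token-level accuracy. Skipping the zero-padded positions and measuring accuracy per token are assumptions made here; the project may compute both differently.

```python
import numpy as np

def masked_cross_entropy(logits, targets, pad_id=0):
    # logits: (seq_len, vocab) raw scores; targets: (seq_len,) integer token ids.
    shifted = logits - logits.max(axis=1, keepdims=True)      # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    token_loss = -log_probs[np.arange(len(targets)), targets]
    mask = (targets != pad_id).astype(float)                  # assumption: ignore padding
    return (token_loss * mask).sum() / max(mask.sum(), 1.0)

def masked_accuracy(logits, targets, pad_id=0):
    # Fraction of non-padding positions where the top-scoring token is correct.
    mask = targets != pad_id
    correct = (logits.argmax(axis=1) == targets) & mask
    return correct.sum() / max(mask.sum(), 1)

# Made-up example: a 4-token answer whose last two positions are padding.
targets = np.array([1, 2, 0, 0])
uniform = np.zeros((4, 4))                 # every token equally likely
confident = np.full((4, 4), -20.0)
confident[np.arange(4), targets] = 20.0    # nearly all mass on the right token

loss_uniform = masked_cross_entropy(uniform, targets)
loss_confident = masked_cross_entropy(confident, targets)
acc_confident = masked_accuracy(confident, targets)
```

With uniform scores over a vocabulary of size V the loss equals ln(V), which is a handy sanity check before training starts.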

  5. Testing:
    Finally, the user can type a question, or speak it by clicking the Speak now button, and the bot replies with the answer. The results obtained are satisfactory according to review analysis.

  Submitted By: Kunal Kumar, Syed Muhamed Nihall, Vvsmanideep, Pachipulusu Vamshi Krishna.

    The project was done under the guidance of Mr. Tejalal and Dr. Vipul Kumar Mishra.

Thanks to Suriyadeepan Ramamoorthy for his practical_seq2seq repo, which this repo is based on.