HybridQAS

Hybrid Deep Learning for Document Summarization & Non-Factoid Q&A

Non-factoid Question Answering with Hybrid Summarization

Objective

This research aims to develop a hybrid deep learning system for document summarization and non-factoid question answering. The goal is to provide users with concise and coherent answers to complex questions by integrating extractive, abstractive, and compressive techniques.

Implementation Steps

  1. Hybrid Summarization Architecture

    • Utilize a fine-tuned BERT model for extractive summarization, selecting the most salient sentences from the source documents.
    • Use a GPT-based decoder for abstractive summarization, rewriting the selected content into fluent prose.
    • Implement Longformer for compressive summarization so that important information in long documents, beyond standard encoder input limits, is preserved (a minimal pipeline sketch follows this list).
  2. Training of the Hybrid Summarization Model

    • Use the CNN/DailyMail dataset for training, a standard benchmark for document summarization.
    • Tokenize the text with the tokenizer matching each component of the architecture (WordPiece for BERT, byte-pair encoding for GPT).
    • Fine-tune the models with the Adam optimizer and tune hyperparameters such as learning rate, batch size, and number of epochs (a training sketch follows this list).
  3. Evaluation

    • Quantitative Evaluation: Measure ROUGE-1, ROUGE-2, and ROUGE-L precision, recall, and F1 scores (a ROUGE sketch follows this list).
    • Qualitative Evaluation: Human evaluators rate summaries based on coherence, relevance, and information completeness.
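
The block below is a minimal sketch of the extractive and abstractive stages from step 1, assuming generic Hugging Face checkpoints (`bert-base-uncased`, `gpt2`) as stand-ins for the fine-tuned weights. The Longformer compressive stage and the Q&A layer are omitted for brevity, and the helper names (`extractive_top_k`, `abstractive_rewrite`) are illustrative, not taken from the repository.

```python
# Minimal sketch of the extractive -> abstractive stages of the hybrid pipeline.
# Checkpoints below are generic Hugging Face models standing in for the fine-tuned weights.
import torch
from transformers import AutoModel, AutoTokenizer, GPT2LMHeadModel, GPT2Tokenizer

bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def extractive_top_k(sentences, k=5):
    """Score sentences by cosine similarity of their [CLS] embedding to the document centroid."""
    with torch.no_grad():
        enc = bert_tok(sentences, padding=True, truncation=True, return_tensors="pt")
        cls = bert(**enc).last_hidden_state[:, 0]                    # (num_sentences, hidden_dim)
    centroid = cls.mean(dim=0, keepdim=True)
    scores = torch.nn.functional.cosine_similarity(cls, centroid)
    top = scores.topk(min(k, len(sentences))).indices.sort().values  # keep document order
    return [sentences[i] for i in top]

gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")

def abstractive_rewrite(extract, max_new_tokens=80):
    """Let a (fine-tuned) GPT-2 decoder rewrite the extracted sentences as a fluent summary."""
    prompt = extract + "\nTL;DR:"
    ids = gpt2_tok(prompt, return_tensors="pt", truncation=True, max_length=900).input_ids
    out = gpt2.generate(ids, max_new_tokens=max_new_tokens, do_sample=False,
                        pad_token_id=gpt2_tok.eos_token_id)
    return gpt2_tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

# Usage: pick salient sentences first, then rewrite them abstractively.
sentences = ["The central bank raised rates by 0.5 points.",
             "Analysts had expected a smaller increase.",
             "Markets fell sharply after the announcement."]
summary = abstractive_rewrite(" ".join(extractive_top_k(sentences, k=2)))
```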
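
For step 2, the following training sketch fine-tunes only the GPT-2 abstractive component on a small slice of CNN/DailyMail with AdamW. The dataset slice, sequence lengths, and hyperparameters are illustrative placeholders, not values from the notebooks.

```python
# Sketch of fine-tuning the abstractive (GPT-2) component on CNN/DailyMail with AdamW.
# Slice size, sequence lengths, and hyperparameters are illustrative placeholders.
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import GPT2LMHeadModel, GPT2Tokenizer

dataset = load_dataset("cnn_dailymail", "3.0.0", split="train[:1%]")
tok = GPT2Tokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token               # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

def collate(batch):
    # Train the decoder to continue an article with its reference summary.
    texts = [ex["article"][:2000] + "\nTL;DR: " + ex["highlights"] for ex in batch]
    enc = tok(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    enc["labels"] = enc["input_ids"].masked_fill(enc["attention_mask"] == 0, -100)  # ignore padding in the loss
    return enc

loader = DataLoader(dataset, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = AdamW(model.parameters(), lr=5e-5)

model.train()
for step, batch in enumerate(loader):
    loss = model(**batch).loss              # causal LM loss over article + summary tokens
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if step >= 100:                         # truncated loop for this sketch
        break
```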
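
For the quantitative part of step 3, ROUGE-1, ROUGE-2, and ROUGE-L precision, recall, and F1 can be computed with the `rouge-score` package; the reference and candidate strings below are dummy examples.

```python
# Sketch of the quantitative evaluation: ROUGE-1/2/L precision, recall, and F1
# via the rouge-score package (pip install rouge-score).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "the economy grew faster than expected in the third quarter"
candidate = "economic growth beat expectations in the third quarter"

scores = scorer.score(reference, candidate)
for name, s in scores.items():
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```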

Expected Outcome

The implementation will result in an innovative deep learning-based system capable of summarizing multiple documents and answering complex non-factoid questions concisely and coherently. The performance of the system will be evaluated and compared to state-of-the-art architectures to assess its effectiveness.