Code for the IBM TechQA shared task

Cheap and Good? Simple and Effective Data Augmentation for Low Resource Machine Reading

This repository contains the code for the paper "Cheap and Good? Simple and Effective Data Augmentation for Low Resource Machine Reading", accepted as a short paper for presentation at SIGIR'21. It includes:

  1. Source code to augment the training data for TechQA, under the folder agumentation
  2. Source code to augment the training data for PolicyQA, under the folder agumentation
  3. A copy of the evaluation/training scripts from the TechQA task organizers, used for the TechQA results reported in our paper, under IBM_BERT
  4. A copy of the evaluation/training scripts from the Transformers GitHub repo (version 3.0.2), used for the PolicyQA results reported in our paper, under transformers-3.0.2

Requirements

Our augmentation code has no particular requirements. To run the training/evaluation scripts, please refer to the TechQA GitHub repo and the Transformers GitHub repo for their detailed requirements.

Installation

You can clone the whole repository into your desired location by running this command:

git clone https://github.com/vanh17/techqa.git

How to augment training data

TechQA

1. Baselines
2. Data Augmentation
3. Data Augmentation + Original Model

PolicyQA

1. Baselines
2. Data Augmentation