XWikiGen
This repository contains the code for the various experiments we performed on our dataset, XWikiRef.
Updated dataset link: XWikiRef
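To get oriented with the data before running anything, the sketch below shows one way to peek at a JSON Lines split. The file name, and the assumption that the split is JSONL, are illustrative only; inspect the keys of your own download rather than relying on specific field names.

```python
import json

# Hypothetical file name; point this at whichever XWikiRef split you downloaded.
path = "xwikiref_test.jsonl"

with open(path, encoding="utf-8") as f:
    for i, line in enumerate(f):
        record = json.loads(line)
        # Field names vary by release; print the keys to inspect the schema.
        print(record.keys())
        if i == 2:  # look at the first few records only
            break
```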
Overall, the repository contains three directories:
1. extractive:
- salience: Experiments for the salience-based extractive stage
- HipoRank: Experiments for the HipoRank-based extractive stage
2. abstractive:
- combined_model: Experiments for the combined model in the abstractive stage
- multidomain: Experiments for the multidomain model in the abstractive stage
- multilingual: Experiments for the multilingual model in the abstractive stage
3. evaluation (a conceptual sketch of the metric computation follows this list):
- evaluate_multidomain: Evaluation script for the multidomain experiment
- evaluate_multilingual: Evaluation script for the multilingual experiment
- evaluate_multilingual_multidomain: Evaluation script for the multilingual multidomain experiment
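A typical way to evaluate such generated sections is ROUGE-style overlap against the reference Wikipedia text (the HipoRank directory also ships ROUGE-1.5.5). The snippet below is only a conceptual sketch using the `rouge_score` package, not the repository's evaluation code; tokenization for non-English languages may need extra care.

```python
from rouge_score import rouge_scorer

# Illustrative strings; in practice these come from model outputs and
# the reference sections of XWikiRef.
prediction = "the generated section text"
reference = "the reference section text"

# rouge_score is an assumption here; the repository's scripts may rely on a
# different ROUGE implementation (e.g., the bundled ROUGE-1.5.5).
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=False)
scores = scorer.score(reference, prediction)
print(scores["rougeL"].fmeasure)
```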
The commands to run the above experiments are provided in the bash files present in each of the directories listed above. First, set up the environment:
conda create -n xwikigen python=3.8
conda activate xwikigen
cd XWikiGen/
pip install -r requirements.txt
To run the salience extractive stage:
cd extractive/salience/
bash run_extractive.sh
To run the HipoRank extractive stage:
cd extractive/HipoRank/
bash run_extractive.sh
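Both extractive stages serve the same purpose: selecting the most salient reference sentences for each target section before abstractive generation. The sketch below only illustrates that idea with a simple embedding-similarity ranker (sentence-transformers, similarity to the section title); it is an assumed stand-in, not the repository's salience or HipoRank implementation.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative inputs; in the pipeline these come from XWikiRef reference text.
section_title = "Early life"
sentences = [
    "He was born in 1920 in a small coastal town.",
    "The club won the league title twice in the 1990s.",
    "She studied physics before turning to journalism.",
]

# Multilingual model chosen for illustration only.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
title_emb = model.encode(section_title, convert_to_tensor=True)
sent_embs = model.encode(sentences, convert_to_tensor=True)

# Keep the top-k sentences most similar to the section title.
scores = util.cos_sim(title_emb, sent_embs)[0]
top_k = scores.topk(k=2).indices.tolist()
selected = [sentences[i] for i in top_k]
print(selected)
```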
To run the salience-based abstractive stage:
cd abstractive/
# Go to the desired experiment directory
bash saliency_run.sh
To run the HipoRank-based abstractive stage:
cd abstractive/
# Go to the desired experiment directory
bash hiporank_run.sh
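Each abstractive experiment trains a sequence-to-sequence model (train.py) on the extracted sentences and then generates section text (testing/testing.py). The snippet below is only a minimal generation sketch assuming an mT5 checkpoint from Hugging Face transformers; the actual checkpoint, inputs, and hyperparameters are configured in the respective run scripts.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "google/mt5-small" is a placeholder; the experiments configure their own
# multilingual seq2seq checkpoint in the run scripts.
name = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

# Illustrative input: extracted reference sentences for one target section.
source = "Early life: He was born in 1920 in a small coastal town."
inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)

summary_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```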
Note: Make sure you update the file paths in the scripts to match your machine.
Below is the directory structure of this repo.
```
├── extractive
│   ├── HipoRank
│   │   ├── modified_codes
│   │   ├── exp8_run.py
│   │   ├── exp5_run.py
│   │   ├── exp10_run.py
│   │   ├── human_eval_data.jsonl
│   │   ├── ROUGE-1.5.5
│   │   ├── exp2_run.py
│   │   ├── run_extractive.sh
│   │   ├── readme.txt
│   │   ├── exp3_run.py
│   │   ├── .gitignore
│   │   ├── exp6_run.py
│   │   ├── exp4_run.py
│   │   ├── exp9_run.py
│   │   ├── human_eval_sample.ipynb
│   │   ├── plot_ablation.ipynb
│   │   ├── exp11_run.py
│   │   ├── convert_to_pubmed_like.py
│   │   ├── LICENSE
│   │   ├── human_eval_samples.jsonl
│   │   ├── plot_sentence_positions.ipynb
│   │   ├── hipo_rank
│   │   ├── human_eval_results.ipynb
│   │   ├── test.txt
│   │   ├── exp7_run.py
│   │   ├── op_indiv2.txt
│   │   ├── exp_ours.py
│   │   └── dataset_format_sentence_tokenization_individual_sectionwise.py
│   └── salience
│       ├── run_extractive.sh
│       └── extractive.py
├── evaluation
│   ├── evaludate_multidomain.py
│   ├── evaluate_multilingual.py
│   └── evaluate_multilingual_multidomain.py
├── requirements.txt
├── readme.md
└── abstractive
    ├── multilingual
    │   ├── hiporank_run.sh
    │   ├── readme.txt
    │   ├── model
    │   │   ├── dataloader.py
    │   │   └── model.py
    │   ├── saliency_run.sh
    │   ├── testing
    │   │   ├── testing.py
    │   │   └── test.sh
    │   └── train.py
    ├── multidomain
    │   ├── hiporank_run.sh
    │   ├── readme.txt
    │   ├── model
    │   │   ├── dataloader.py
    │   │   └── model.py
    │   ├── saliency_run.sh
    │   ├── testing
    │   │   ├── testing.py
    │   │   └── test.sh
    │   └── train.py
    └── combined_model
        ├── hiporank_run.sh
        ├── model
        │   ├── dataloader.py
        │   └── model.py
        ├── saliency_run.sh
        ├── testing
        │   ├── testing.py
        │   └── test.sh
        └── train.py
```