- Python 3.8.5
- Ubuntu 22.04
To set up the environment for this repository, please follow the steps below:
Step 1: Create a Python environment (optional)
To use a dedicated Python environment, create one with conda:
conda create -n pyt1.11 python=3.8.5
Step 2: Install PyTorch with CUDA (optional)
To use PyTorch with CUDA support, install it with:
conda install pytorch==1.11 torchvision torchaudio cudatoolkit=11.3 -c pytorch
Step 3: Install Python dependencies
Install the required Python dependencies with:
pip install -r requirements.txt
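After the three steps above, a quick sanity check of the environment can be useful. The following is a minimal sketch, not part of the repository: the function name `describe_environment` is hypothetical, and the expected version is taken from the requirements listed at the top of this README. It only reports what it finds and does not install anything.

```python
import importlib.util
import sys

# Assumption: the target interpreter is the Python 3.8.x listed in this README.
EXPECTED_PYTHON = (3, 8)


def describe_environment():
    """Return a small report about the active Python environment."""
    report = {
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "python_matches": sys.version_info[:2] == EXPECTED_PYTHON,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": None,  # stays None when PyTorch is not installed
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also works without PyTorch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` after Step 2, the CUDA toolkit version may not match the installed driver.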
- Unzip all the zip files located in the `data` folder, including its subfolders.
- Place the following folders, extracted from their respective zip files, under the `data` folder: `kg`, `ct`, and `gold_subset`.
- Locate the `local_context_dataset` folder unzipped from `data/idea-sentence/local_context_dataset.zip`. Move it to `idea-sentence/models/T5`.
- Find the `local_dataset` folder unzipped from `data/idea-node/local_dataset.zip`. Place it in `idea-node/models/Dual_Encoder`.
- Copy the file `e2t.json` and paste it into the following folders: `idea-node/models/GPT3.5*/`, `idea-node/preprocess/`, `idea-sentence/models/GPT3.5*/`, and `idea-sentence/preprocess/`.
- Navigate to `idea-node/preprocess` and run `bash preprocess.sh`.
- Navigate to `idea-sentence/preprocess` and run `bash preprocess.sh`.
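The unzip, move, and copy steps above could be scripted roughly as follows. This is a sketch under stated assumptions rather than the repository's own tooling: the helper names `unzip_all`, `move_folder`, and `copy_e2t` are hypothetical, each archive is assumed to extract next to itself, and the location of `e2t.json` is not stated here, so it is passed in as an argument.

```python
import shutil
import zipfile
from pathlib import Path


def unzip_all(data_dir):
    """Extract every .zip under data_dir (recursively) next to its archive."""
    extracted = []
    for archive in sorted(Path(data_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Assumption: archive contents belong in the archive's own folder.
            zf.extractall(archive.parent)
        extracted.append(archive)
    return extracted


def move_folder(src, dest_parent):
    """Move the folder src into dest_parent, creating dest_parent if needed."""
    dest_parent = Path(dest_parent)
    dest_parent.mkdir(parents=True, exist_ok=True)
    target = dest_parent / Path(src).name
    return Path(shutil.move(str(src), str(target)))


def copy_e2t(e2t_path, repo_root):
    """Copy e2t.json into every existing folder matching the patterns above."""
    patterns = [
        "idea-node/models/GPT3.5*",
        "idea-node/preprocess",
        "idea-sentence/models/GPT3.5*",
        "idea-sentence/preprocess",
    ]
    copied = []
    for pattern in patterns:
        for folder in sorted(Path(repo_root).glob(pattern)):
            if folder.is_dir():
                copied.append(Path(shutil.copy(str(e2t_path), str(folder))))
    return copied
```

Run from the repository root, a typical sequence would be `unzip_all("data")`, then `move_folder("data/idea-sentence/local_context_dataset", "idea-sentence/models/T5")` and `move_folder("data/idea-node/local_dataset", "idea-node/models/Dual_Encoder")`, then `copy_e2t(<path to e2t.json>, ".")`. The two `bash preprocess.sh` steps still need to be run from their own folders afterwards.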
The project data includes the following components:
- `data/local_context_dataset`: This folder contains the training, validation, and testing files for idea sentence generation.
- `data/local_dataset`: This folder contains the training, validation, and testing files for idea node prediction.
- `data/kg/*.json`: The `data/kg` directory contains files that store the original Information Extraction (IE) results for all paper abstracts.
- `data/ct/*.csv`: The `data/ct` directory contains files that represent the citation network for all papers.
- `data/gold_subset`: This directory contains our gold annotation subsets.
- `idea-node/evaluation` and `idea-sentence/evaluation` contain sample evaluation code.
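To confirm that the data components listed above are in place, a small check along these lines may help. It is only a sketch: the function name `summarize_data` is hypothetical, and it counts files by extension without making any assumption about their contents.

```python
from pathlib import Path


def summarize_data(data_dir):
    """Count the files found for the kg, ct, and gold_subset components."""
    data_dir = Path(data_dir)
    return {
        # IE results: data/kg/*.json
        "kg_json_files": len(list((data_dir / "kg").glob("*.json"))),
        # Citation network: data/ct/*.csv
        "ct_csv_files": len(list((data_dir / "ct").glob("*.csv"))),
        # Gold annotation subsets: data/gold_subset
        "gold_subset_present": (data_dir / "gold_subset").is_dir(),
    }
```

A count of zero for `kg_json_files` or `ct_csv_files` suggests that the corresponding zip file has not been extracted into `data` yet.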
To train the model under `*/models/*`, run the following command:
bash finetune_*.sh
To test the model under `*/models/*`, run the following command:
bash eval_*.sh
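Because the script names are given as patterns (`finetune_*.sh`, `eval_*.sh`), it can help to list what a given model folder actually provides before running anything. The sketch below assumes the scripts sit directly inside the model folder; `find_scripts` and `run_script` are hypothetical helper names, not part of the repository.

```python
import subprocess
from pathlib import Path


def find_scripts(model_dir, prefix):
    """Return the shell scripts in model_dir named <prefix>_*.sh, sorted."""
    return sorted(Path(model_dir).glob(f"{prefix}_*.sh"))


def run_script(script):
    """Run one script with bash from its own folder; raise on a non-zero exit."""
    script = Path(script)
    # Assumption: the scripts expect to be launched from the folder they live in.
    subprocess.run(["bash", script.name], cwd=str(script.parent), check=True)
```

For example, `find_scripts("idea-sentence/models/T5", "finetune")` lists the training scripts in that folder, and each result can be passed to `run_script`.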