The code has been moved to https://github.com/INK-USC/procedural-extraction. Sorry for the inconvenience.
Code for the paper *Eliciting Knowledge from Experts: Automatic Transcript Parsing for Cognitive Task Analysis*, in Proceedings of ACL 2019.
This code provides a framework for extracting procedural information from documents. Please refer to our ACL paper (arXiv) for further details.
- Run the install script to download word embeddings and install pre-trained models:

```bash
bash install.sh
```

- Start a StanfordCoreNLP server (please refer to here if you are not familiar with StanfordCoreNLP). The server is required for tokenization in preprocessing and pattern-based extraction:

```bash
java -mx20g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer --port 9000
```

  Customize the StanfordCoreNLP server URL in `pattern_extraction/corenlp.py` by modifying `nlp_server = StanfordCoreNLP('http://YOUR_SERVER:YOUR_PORT')`.
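Once the server is up, you can sanity-check the connection before running preprocessing. A minimal sketch, assuming the `pycorenlp` client package and a localhost server (the repo's own wrapper lives in `pattern_extraction/corenlp.py` and may use a different client):

```python
# Connectivity check against a running StanfordCoreNLP server.
# `pycorenlp` and the localhost URL are assumptions for this sketch.
from pycorenlp import StanfordCoreNLP

nlp_server = StanfordCoreNLP('http://localhost:9000')  # YOUR_SERVER:YOUR_PORT
result = nlp_server.annotate(
    'Check that the server tokenizes this sentence.',
    properties={'annotators': 'tokenize,ssplit', 'outputFormat': 'json'},
)
print([token['word'] for token in result['sentences'][0]['tokens']])
```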
Step-by-step commands for running the Relation Classification experiment with the following setting:
- Model = Mask_max
- Fuzzy-matching Method = Glove 300d
- Context-level K = 2
- Sampling Portion = 4:2:1
Please refer to the ArgumentParser in the code for more details.
The preprocessed Relation Classification dataset is available in `preprocessed/relation.tar.gz`. Drop the extracted `relation` folder into the `dataset` folder (see the sketch below); then you can run the Relation Classification model.
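A minimal sketch of that step, assuming the archive unpacks to a top-level `relation/` folder:

```python
# Unpack preprocessed/relation.tar.gz so the relation/ folder ends up
# under dataset/ (assumes a top-level relation/ entry in the archive).
import tarfile
from pathlib import Path

Path('dataset').mkdir(exist_ok=True)
with tarfile.open('preprocessed/relation.tar.gz', 'r:gz') as tar:
    tar.extractall('dataset')
```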
The preprocessed dataset is available in the Preprocessed Dataset section above, or you can do the preprocessing on your own by running the following commands:
- Run the preprocessing script. The example script parses data from the protocol files and does fuzzy matching with GloVe 300d embeddings:

```bash
bash script/run_matching.sh
```
- Create the Relation Classification dataset from the extracted data:

```bash
python create_dataset.py relation --path dataset/relation/embavg-2 --dir_extracted extracted/embavg
```
- Build the manual-matching testset from the manual matching annotation `fuzzy_matching/answer.json` on document 01:

```bash
# Manual matching
python extract_samples.py 1 manual --dir_extracted extracted/manual
# Create Relation Classification dataset for context level K=2
python create_dataset.py relation --path dataset/relation/manual-2 --k_neighbour 1 --dir_extracted extracted/manual --dataset 1
# Copy into the fuzzy-matching dataset as its manual-matching testset
cp dataset/relation/manual-2/nonsplit.pkl dataset/relation/embavg-2/manual.pkl
```
- Train the classification model 5 times:

```bash
bash run_relation.sh
```
- Get the averaged result:

```bash
python script/reduce.py maskmax-k2
cat logs/maskmax-k2-avg/metrics.json
```
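For intuition, a hedged sketch of what the averaging step computes: mean ± standard deviation of each metric over the 5 runs. The per-run log layout (`logs/maskmax-k2-<run>/metrics.json` with flat numeric fields) is an assumption for illustration; `script/reduce.py` is the real entry point:

```python
# Average metrics over the training runs (log layout is assumed).
import json
import statistics
from pathlib import Path

paths = [p for p in Path('logs').glob('maskmax-k2-*/metrics.json')
         if p.parent.name != 'maskmax-k2-avg']  # skip the averaged output
runs = [json.loads(p.read_text()) for p in paths]
for key, value in runs[0].items():
    if isinstance(value, (int, float)):
        values = [run[key] for run in runs]
        print(f'{key}: {statistics.mean(values):.1f} ± {statistics.stdev(values):.1f}')
```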
You can also create a sequence labeling dataset from the fuzzy-matching result:

```bash
mkdir -p dataset/seqlab/embavg
python create_dataset.py seqlabel --dir_extracted extracted/embavg --path dataset/seqlab/embavg
```

The SeqLab dataset in IOBES format is created in the folder `dataset/seqlab/embavg`. Please refer to Stanford-NER and LM-LSTM-CRF for models that train on sequence labeling datasets.
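IOBES extends the usual IOB tagging scheme with `E-` (end of a multi-token span) and `S-` (single-token span) tags. A minimal, repo-independent sketch of the IOB → IOBES mapping (the `ACT` label is made up for the example):

```python
# Convert IOB tags to IOBES: B-/I- tags not followed by a continuation
# of the same label become S- (singleton) / E- (end) respectively.
def iob_to_iobes(tags):
    iobes = []
    for i, tag in enumerate(tags):
        if tag == 'O':
            iobes.append(tag)
            continue
        prefix, label = tag.split('-', 1)
        next_is_inside = i + 1 < len(tags) and tags[i + 1] == f'I-{label}'
        if prefix == 'B':
            iobes.append(f'B-{label}' if next_is_inside else f'S-{label}')
        else:  # prefix == 'I'
            iobes.append(f'I-{label}' if next_is_inside else f'E-{label}')
    return iobes

print(iob_to_iobes(['B-ACT', 'I-ACT', 'I-ACT', 'O', 'B-ACT']))
# ['B-ACT', 'I-ACT', 'E-ACT', 'O', 'S-ACT']
```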
| Setting | Argument | Gen. Acc. | Gen. F1 | Manual Acc. | Manual F1 |
|---|---|---|---|---|---|
| BERT | none | 81.6 ± 1.0 | 70.1 ± 1.7 | 77.2 ± 2.7 | 62.2 ± 6.1 |
| Context Pos. as Attention | `posattn` | 82.5 ± 1.5 | 72.2 ± 2.6 | 81.2 ± 4.7 | 72.7 ± 7.5 |
| Context Pos. as Input Emb. | `segemb` | 82.8 ± 1.4 | 72.7 ± 1.9 | 78.8 ± 8.5 | 67.4 ± 8.1 |
| Hidden States Masking (Avg) | `mask`\* | 80.5 ± 2.7 | 69.0 ± 5.7 | 80.4 ± 7.1 | 73.4 ± 7.9 |
| Hidden States Masking (Max) | `mask` | 82.3 ± 1.4 | 72.6 ± 3.0 | 87.6 ± 1.5 | 81.4 ± 2.4 |
\*Enable Hidden States Masking with average pooling via the commented-out code in `models/bert_modeling_mask.py` (see the pooling sketch after the settings list below).
All results above are obtained with the setting:

- Fuzzy-matching Method = GloVe 300d
- Context-level K = 2
- Sampling Portion = 4:2:1
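For intuition on the masking variants, a minimal sketch (not the repo's exact `models/bert_modeling_mask.py` code) of pooling BERT hidden states restricted to the masked target-span tokens; the shapes and span positions are made up:

```python
# Max/average pooling over the hidden states of span tokens only.
import torch

def masked_pool(hidden, span_mask, mode='max'):
    # hidden: (batch, seq_len, dim); span_mask: (batch, seq_len), 1 on span tokens
    mask = span_mask.unsqueeze(-1).bool()
    if mode == 'max':
        return hidden.masked_fill(~mask, float('-inf')).max(dim=1).values
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)

hidden = torch.randn(2, 8, 768)                     # e.g. BERT outputs
span_mask = torch.zeros(2, 8)
span_mask[:, 2:5] = 1                               # hypothetical target span
print(masked_pool(hidden, span_mask, 'max').shape)  # torch.Size([2, 768])
```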
There are 6 documents in the `data` folder for creating the dataset (see the loading sketch after this list):

- `*.src.txt`: copies of the original source (transcript) files (separated by `\n`)
- `*.src.ref.txt`: extracted line by line from the original source files by `utils/copyLineByLine.vbs`, with line breaks as shown in Microsoft Word
- `*.tgt.txt`: the original target (protocol) files (separated by `\n`)
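A minimal sketch for loading one document pair, assuming the files are newline-separated as described and named by document id (the `01` naming is an example):

```python
# Load the parallel transcript/protocol files for one document.
from pathlib import Path

doc_id = '01'  # hypothetical naming: data/01.src.txt, data/01.tgt.txt
src_lines = Path(f'data/{doc_id}.src.txt').read_text().splitlines()
tgt_lines = Path(f'data/{doc_id}.tgt.txt').read_text().splitlines()
print(len(src_lines), 'transcript lines;', len(tgt_lines), 'protocol lines')
```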
Document 05 is not used in our experiments since the line numbers in its protocol are not accurate.
The manual-matching testset is created on document 01, so a small portion of the samples in the manual-matching testset overlaps with samples in the fuzzy-matching trainset (though they are extracted by different methods and are resampled). This is a flaw of the manual testset setting.
If you find our work useful in your research, please consider citing:
```bibtex
@inproceedings{du2019acl,
  title={Eliciting Knowledge from Experts: Automatic Transcript Parsing for Cognitive Task Analysis},
  author={Du, Junyi and Jiang, He and Shen, Jiaming and Ren, Xiang},
  booktitle={Proceedings of ACL 2019},
  year={2019}
}
```