This is a codebase to match reviewers for the ACL Conferences. It is based on paper matching between abstracts of submitted papers and a database of papers from Semantic Scholar.
For ACL 2021, Part 1 and Steps 1-5 of Part 2 were used. Part 3 was not used.
Clone this github repository to a directory on your local machine. Every step after this will be carried out in the directory where you saved the repository.
pip install -r reviewer-paper-matching/requirements.txt
There are two options. Choose Option B if your disk space is limited. Both options create the scratch folder where we will save the data and output files for the system, as well as scratch/acl-anthology.json, which contains the information for all the relevant ACL papers in the dump.
(Option A): Download Entire Dump
Download the Semantic Scholar dump following the instructions. Note that the dump takes up about 150 GB of disk space, so Option B is recommended if your available disk space is any more constrained than this.
Wherever you downloaded the Semantic Scholar dump, make a symbolic link to it as the s2 directory here:
ln -s /path/to/semantic/scholar/dump s2
Make a scratch directory where we'll put working files
mkdir -p scratch/
For efficiency purposes, we'll filter the Semantic Scholar dump down to only the papers that link to aclweb.org, a rough approximation of the papers in the ACL Anthology:
zcat s2/s2-corpus*.gz | grep aclweb.org > scratch/acl-anthology.json
(Option B): Iteratively Download and Filter Dump
Find the numeric date associated with the Semantic Scholar release you want (e.g. 2021-01-01), and use the download_s2.sh script to iteratively download a chunk of the release and filter the ACL entries into acl-anthology.json. Once all ACL entries have been added to the anthology file, the s2 chunk will be deleted from disk. This script will automatically create the scratch folder as well as scratch/acl-anthology.json. The necessary arguments to the script are the release date and the number of chunks you want the release to be broken into (based on your disk capacity):
./reviewer-paper-matching/download_s2.sh 2021-01-01 10
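Once the script finishes, you can sanity-check the resulting file. A minimal sketch (it assumes only that each line of scratch/acl-anthology.json is one Semantic Scholar JSON record, as produced by the filtering above):

import json

# Peek at the first filtered record to confirm the file is line-delimited JSON.
with open("scratch/acl-anthology.json", encoding="utf-8") as f:
    first = json.loads(next(f))
print(sorted(first.keys()))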
This step requires a GPU. Alternatively, you may contact the authors to get a model distributed to you.
(4a) Download supplemental data for training the similarity model:
bash reviewer-paper-matching/download_sts17.sh
(4b) Word-tokenize the ACL anthology abstracts:
python reviewer-paper-matching/tokenize_abstracts.py \
--infile scratch/acl-anthology.json \
--outfile scratch/abstracts.txt
(4c) Train a sentencepiece model on the abstracts and use it to further tokenize them:
python reviewer-paper-matching/sentencepiece_abstracts.py \
--infile scratch/abstracts.txt \
--vocab-size 20000 \
--model-name scratch/abstracts.sp.20k \
--outfile scratch/abstracts.sp.20k.txt
(4d) Train the text similarity model on the abstracts:
python -u reviewer-paper-matching/train_similarity.py \
--data-file scratch/abstracts.sp.20k.txt \
--model avg \
--dim 1024 \
--epochs 20 \
--ngrams 0 \
--share-vocab 1 \
--dropout 0.3 \
--batchsize 64 \
--megabatch-size 1 \
--megabatch-anneal 10 \
--seg-length 1 \
--outfile scratch/similarity-model.pt \
--sp-model scratch/abstracts.sp.20k.model 2>&1 | tee scratch/training.log
(1a) Create scratch/Profile_Information.csv from softconf:
In START, go to "Spreadsheet Maker", and download the spreadsheet for "Author/Reviewer Profiles" as a csv. The necessary fields to download for this spreadsheet are Username, Email, First Name, Last Name, Semantic Scholar ID, Roles, PCRole, and emergencyReviewer. Note: the ACL COI-detection scripts also use the author/reviewer profiles as input, and require the additional fields Affiliation, Affiliation Type, Previous Affiliations (sometimes called Previous Affiliations (by domain)), COIs Entered on Global Profile, Year of Graduation, and Backup Email.
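If you want to double-check the export before moving on, a minimal sketch along these lines (not part of the official pipeline) can confirm the expected columns are present; the column names are taken from the list above and may be labeled slightly differently in your START export:

import csv

# Hypothetical sanity check: compare the csv header against the required field names.
REQUIRED = ["Username", "Email", "First Name", "Last Name",
            "Semantic Scholar ID", "Roles", "PCRole", "emergencyReviewer"]

with open("scratch/Profile_Information.csv", newline="", encoding="utf-8") as f:
    header = next(csv.reader(f))
print("missing columns:", [c for c in REQUIRED if c not in header] or "none")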
(1b, optional) Create scratch/quotas.csv from softconf:
In the same page of the Spreadsheet Maker, download a csv with the fields Username and Quota for Review. Save this as scratch/quotas.csv.
(1c) Create scratch/Submission_Information.csv from softconf:
In the Spreadsheet Maker, download the spreadsheet for "Submissions" as a csv. The necessary fields to download for this spreadsheet are Submission ID, Title, Track, Submission Type, Abstract, Authors, Author Information, All Author Emails, and Acceptance Status.
The file bids.csv indicates COI relationships (as a matrix) between reviewers (columns) and submissions (rows).
To generate this file, we can use softconf and/or an external COI-detection system, resulting in three options:
- run step 2a only
- run step 2b only
- run step 2a first, and then step 2b (which takes the output of 2a as input)
We recommend the third option.
(2a) Get scratch/start_bids.csv from softconf:
In the Spreadsheet Maker, download the spreadsheet for "Bids" as a csv as
scratch/start_bids.csv
.
(2b) Run the external COI-detection system to create scratch/bids.csv:
Contact ACL for the github site where the COI-detection system and its instructions can be found. The code will use scratch/start_bids.csv from Step 2a and add more COIs. The output of this system is scratch/bids.csv. If Step 2a is not run first, this system will give a warning.
python -u reviewer-paper-matching/softconf_extract.py \
--profile_in=scratch/Profile_Information.csv \
--submission_in=scratch/Submission_Information.csv \
--bid_in=scratch/bids.csv \
--reviewer_out=scratch/reviewers.jsonl \
--submission_out=scratch/submissions.jsonl \
--bid_out=scratch/cois.npy \
[--non_reviewers_out=scratch/non-reviewers.jsonl \]
[--multi_track_reviewers_out=scratch/multi-track.jsonl \]
[--reject_out=scratch/rejects.jsonl]
The first three arguments are input files (created in Steps 1 and 2), and the next three arguments are the base output files. This step processes the input files (e.g., using regular expressions to clean some fields) and saves the results to jsonl and npy files.
The bid_in argument (if provided) comes from Step 2. If this argument is not provided, no COIs will be initialized.
non_reviewers_out, multi_track_reviewers_out, and reject_out are all optional. non_reviewers_out is a file to store person profiles that are not reviewers. multi_track_reviewers_out will store reviewers who are assigned to more than one track. reject_out will store papers that have been desk-rejected or withdrawn. All of these are for debugging purposes, and are not necessary for the paper assignment to run.
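To get a quick feel for what was produced, here is a minimal inspection sketch. It assumes only what is stated above: the jsonl files contain one JSON record per line, and cois.npy is a numpy matrix relating submissions and reviewers.

import json
import numpy as np

# One JSON record per line in the jsonl outputs.
with open("scratch/reviewers.jsonl", encoding="utf-8") as f:
    reviewers = [json.loads(line) for line in f]
with open("scratch/submissions.jsonl", encoding="utf-8") as f:
    submissions = [json.loads(line) for line in f]

cois = np.load("scratch/cois.npy")
print(len(reviewers), "reviewers,", len(submissions), "submissions")
print("COI matrix shape:", cois.shape)  # should line up with the counts above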
python -u reviewer-paper-matching/solution_viability_check.py \
--reviewer_file=scratch/reviewers.jsonl \
--submission_file=scratch/submissions.jsonl \
--default_paper_quota=6 \
--reviewers_per_paper=3 \
--quota_file=scratch/quotas.csv > viability_check.txt
This script outputs a diagnostic checking whether there are enough reviewers and ACs in each track, given quotas and default max paper assignments.
reviewer_file and submission_file come from Step 3. quota_file is generated in Step 1b. The script writes to stdout, and shows information for each track. The Warning lines can be searched to make sure there is a possible solution.
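For example, a quick way to pull out just the warnings from the diagnostic (a small sketch; it assumes the output was redirected to viability_check.txt as in the command above):

# Print only the Warning lines from the viability check output.
with open("viability_check.txt", encoding="utf-8") as f:
    for line in f:
        if "Warning" in line:
            print(line.rstrip())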
python -u reviewer-paper-matching/suggest_reviewers.py \
--db_file=scratch/acl-anthology.json \
--model_file=scratch/similarity-model.pt \
--submission_file=scratch/submissions.jsonl \
--reviewer_file=scratch/reviewers.jsonl \
--bid_file=scratch/cois.npy \
--<save|load>_paper_matrix=scratch/paper_matrix.npy \
--<save|load>_aggregate_matrix=scratch/aggregate_matrix.npy \
--max_papers_per_reviewer=6 \
--min_papers_per_reviewer=0 \
--reviews_per_paper=3 \
--num_similar_to_list=6 \
[--quota_file=scratch/quotas.csv \]
[--area_chairs \]
[--track \]
--suggestion_file=scratch/assignments.jsonl \
[--assignment_spreadsheet=scratch/assignments.csv]
The first two arguments come from the model training stage (see Part 1). The next three arguments come from Step 3.
You can modify reviews_per_paper, min_papers_per_reviewer, and max_papers_per_reviewer to change the number of reviews assigned to each paper and the maximum number of reviews per reviewer. num_similar_to_list determines how many similar reviewers per submission will be listed outside of the actual assignment.
The <save|load> arguments are useful to cache progress between separate runs of the script, for instance if you change any of the parameters. For example, when you first run the script, a similarity matrix will be calculated between the submissions and the reviewers' papers (paper_matrix), as will an aggregated matrix between submissions and reviewers (aggregate_matrix). In order to not have to re-calculate these matrices later, you would include the following in your first run:
--save_paper_matrix=scratch/paper_matrix.npy \
--save_aggregate_matrix=scratch/aggregate_matrix.npy \
And then to re-use those matrices in subsequent runs, include:
--load_paper_matrix=scratch/paper_matrix.npy \
--load_aggregate_matrix=scratch/aggregate_matrix.npy \
The aggregation step takes a long time, so caching this matrix is recommended.
quota_file, --track, and --area_chairs are all optional arguments. quota_file is a csv file downloaded from START containing the Username and Quota (created in Step 1b). If no quota is provided, the max will be max_papers_per_reviewer for every reviewer. If the --track flag is included, papers will only be assigned to reviewers in the same track. If the --area_chairs flag is included, papers will only be assigned to area chairs (for running AC assignment).
At this point, you will have assignments written to scratch/assignments.jsonl.
assignment_spreadsheet is an additional optional parameter designating a base file to which to write assignments in csv format. This will generate a global assignment spreadsheet, as well as more detailed ones for each track in the same directory. For instance, if you include
--assignment_spreadsheet=scratch/assignments.csv
the global file will be scratch/assignments.csv, and the track-wise files will follow the format scratch/assignments_<track_name>.csv.
This option will also produce .txt files that can be uploaded directly to softconf. Therefore, Steps 6 and 7 are optional if this parameter is specified.
After you've output the suggestions, you can also print them (and save them to a file scratch/assignments.txt) in an easier-to-read format by running:
python suggest_to_text.py < scratch/assignments.jsonl | tee scratch/assignments.txt
If you want to turn these into START format to re-enter them into START, you can run the following command:
python softconf_package.py \
--suggestion_file scratch/assignments.jsonl \
--softconf_file scratch/start-assignments.csv
Then import scratch/start-assignments.csv into START using the data import interface.
NOTE: This step also requires an additional dependency, which you can install via pip install slugify.
In order to evaluate the system, you will need either (a) a conference with gold-standard bids, or (b) some other way to create information about bids automatically.
(1a): If you have gold-standard bids, in START, go to the "Spreadsheet Maker" and download CSV spreadsheets for "Submissions," "Author/Reviewer Profiles," and "Bids", saving them to scratch/Submission_Information.csv, scratch/Profile_Information.csv, and scratch/Bid_Information.csv. Then run the following:
python softconf_extract.py \
--profile_in=scratch/Profile_Information.csv \
--submission_in=scratch/Submission_Information.csv \
--bid_in=scratch/Bid_Information.csv \
--reviewer_out=scratch/reviewers.jsonl \
--submission_out=scratch/submissions.jsonl \
--bid_out=scratch/bids.npy
(1b): If you do not have this information from START, do the same as above but without --bid_in. Then you will have to create a numpy array bids.npy where rows correspond to submissions, and columns correspond to the bids of potential reviewers (0=COI, 1=no, 2=maybe, 3=yes).
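As a minimal sketch of what such an array could look like (the sizes and bid values here are invented for illustration; only the row/column convention and the 0-3 coding come from the description above):

import numpy as np

num_submissions, num_reviewers = 4, 3  # hypothetical sizes
# rows = submissions, columns = reviewers; 0=COI, 1=no, 2=maybe, 3=yes
bids = np.full((num_submissions, num_reviewers), 1, dtype=int)  # default everything to "no"
bids[0, 2] = 3   # reviewer 2 bid "yes" on submission 0
bids[1, 0] = 0   # reviewer 0 has a COI with submission 1
np.save("scratch/bids.npy", bids)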
Follow the steps in the previous section on reviewer assignment to generate assignments.jsonl.
python evaluate_suggestions.py \
--suggestion_file=scratch/assignments.jsonl \
--reviewer_file=scratch/reviewers.jsonl \
--bid_file=scratch/bids.npy
This will tell you the ratio of assignments that were made to each bid, where in general a higher ratio of "3"s is better.
The method for suggesting reviewers works in three steps. First, it uses a model to calculate the perceived similarity between paper abstracts for the submitted papers and papers that the reviewers have previously written. Then, it aggregates the similarity scores for each paper into an overall similarity score between the reviewer and the paper. Finally, it performs a reviewer assignment, assigning reviewers to each paper given constraints on the maximum number of papers per reviewer.
The method for measuring similarity of paper abstracts is based on the subword or character averaging method described in the following paper:
@inproceedings{wieting19simple,
title={Simple and Effective Paraphrastic Similarity from Parallel Translations},
author={Wieting, John and Gimpel, Kevin and Neubig, Graham and Berg-Kirkpatrick, Taylor},
booktitle={Proceedings of the Association for Computational Linguistics},
url={https://arxiv.org/abs/1909.13872},
year={2019}
}
We trained a model on abstracts from the ACL Anthology to make the similarity model as topically appropriate as possible. Running this model gives us a large matrix of similarity scores between submitted papers and papers where at least one of the reviewers in the reviewer list is an author.
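As a rough illustration of that matrix (this is not the suggest_reviewers.py code; it assumes each abstract has already been reduced to a single fixed-size vector by the trained model and that pairs are scored by cosine similarity, in the spirit of the averaging models cited above):

import numpy as np

def cosine_similarity_matrix(submission_vecs, reviewer_paper_vecs):
    # Rows: submitted papers; columns: reviewers' previous papers.
    a = submission_vecs / np.linalg.norm(submission_vecs, axis=1, keepdims=True)
    b = reviewer_paper_vecs / np.linalg.norm(reviewer_paper_vecs, axis=1, keepdims=True)
    return a @ b.T

# Hypothetical sizes: 700 submissions, 20000 reviewer papers, 1024-dim vectors (--dim 1024 above).
sims = cosine_similarity_matrix(np.random.randn(700, 1024), np.random.randn(20000, 1024))
print(sims.shape)  # (700, 20000)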
Next, we aggregate the paper-paper matching scores into paper-reviewer matching scores. Assume that S_r is the set of all papers written by a reviewer r, and p is a submitted paper. The paper-reviewer matching score of this paper can be calculated in a number of ways, but currently this code uses the following by default:
match(p,r) = max(match(S_r,p)) + second(match(S_r,p)) / 2 + third(match(S_r,p)) / 3
Where max, second, and third indicate the reviewer's best, second-best, and third-best matching paper's scores, respectively. The motivation behind this metric is based on the intuition that we would both like a reviewer to have written at least one similar paper (hence the max) and also have experience in the field moving beyond a single paper, as the one paper they have written might be an outlier (hence the second and third).
The code for this function, along with several other alternatives, can be found in suggest_reviewers.py's calc_aggregate_reviewer_score function.
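A minimal sketch of the default aggregation above (illustrative only, not the calc_aggregate_reviewer_score implementation itself; padding for reviewers with fewer than three papers is an assumption here):

def aggregate_reviewer_score(paper_scores):
    """paper_scores: similarities of each of reviewer r's papers (S_r) to submission p."""
    top = sorted(paper_scores, reverse=True)[:3]
    top += [0.0] * (3 - len(top))  # assumed padding when the reviewer has fewer than 3 papers
    best, second, third = top
    return best + second / 2 + third / 3

print(aggregate_reviewer_score([0.9, 0.6, 0.4, 0.1]))  # 0.9 + 0.6/2 + 0.4/3 ≈ 1.33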
Finally, the code suggests an assignment of reviewers. This is done by trying to maximize the overall similarity score for all paper-reviewer matches, under the constraint that each paper should have exactly reviews_per_paper reviewers and each reviewer should be assigned a maximum of max_papers_per_reviewer reviews. This is done by solving a linear program that attempts to maximize the sum of the reviewer-paper similarity scores subject to these constraints.
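As an illustration of the kind of optimization involved, here is a toy sketch using PuLP with binary assignment variables; it is not necessarily the solver or exact formulation used by suggest_reviewers.py:

import pulp

# Toy data: aggregate similarity scores, rows = papers, columns = reviewers.
scores = [[0.9, 0.2, 0.4],
          [0.1, 0.8, 0.3]]
reviews_per_paper, max_papers_per_reviewer = 2, 2

prob = pulp.LpProblem("reviewer_assignment", pulp.LpMaximize)
x = [[pulp.LpVariable(f"x_{p}_{r}", cat="Binary") for r in range(3)] for p in range(2)]

# Objective: maximize total similarity over all assigned paper-reviewer pairs.
prob += pulp.lpSum(scores[p][r] * x[p][r] for p in range(2) for r in range(3))
# Each paper gets exactly reviews_per_paper reviewers.
for p in range(2):
    prob += pulp.lpSum(x[p][r] for r in range(3)) == reviews_per_paper
# Each reviewer reviews at most max_papers_per_reviewer papers.
for r in range(3):
    prob += pulp.lpSum(x[p][r] for p in range(2)) <= max_papers_per_reviewer

prob.solve()
print([[int(x[p][r].value()) for r in range(3)] for p in range(2)])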
This code was written/designed by Graham Neubig, John Wieting, Arya McCarthy, Amanda Stent, Natalie Schluter, and Trevor Cohn.