achen353/TransformerSum

DANCER (Part 1)

achen353 opened this issue · 3 comments

DANCER (PART 1): Implement the ROUGE score calculation between "each sentence in the ground truth abstractive summary" and "each section". Map each sentence in the former to a section.
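A minimal sketch of the mapping step, for discussion only. It assumes a simplified unigram-overlap F1 as a stand-in for ROUGE (no stemming, whitespace tokenization); the ROUGE variant we actually use is still open. The function names `rouge1_f1` and `map_sentences_to_sections` and the sample texts are hypothetical, not taken from the repo.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1: a simplified stand-in for ROUGE-1 (assumption)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def map_sentences_to_sections(summary_sentences, sections):
    """For each summary sentence, return the index of the best-scoring section."""
    mapping = []
    for sentence in summary_sentences:
        scores = [rouge1_f1(sentence, section) for section in sections]
        # Ties resolve to the earliest section.
        mapping.append(max(range(len(sections)), key=scores.__getitem__))
    return mapping


# Hypothetical toy data, only to show the shape of the inputs and output.
sections = [
    "this act may be cited as the clean water act",
    "the secretary shall award grants to eligible states",
]
summary = [
    "the bill is cited as the clean water act",
    "it directs the secretary to award grants to states",
]
mapping = map_sentences_to_sections(summary, sections)
print(mapping)
```

Each entry of `mapping` is the section index assigned to the summary sentence at the same position, so the sentences can then be grouped per section to form section-level targets.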

DUE: 11/20 Saturday 11:59pm

This is not done yet. We still need to tune the parameters (e.g. the min/max number of sentences/tokens for the text to be summarized, and whether to use Combination or Greedy labeling).

Our data preprocessing has several parameters that affect the preprocessing result:

@andywang268 could you quickly try these for me (use branch issue-17-test-params and put the results in #23):

  1. Default: oracle_mode="greedy", no_preprocess=False/None
python convert_to_extractive.py ../datasets/billsum_extractive --split_names test --add_target_to test
  2. oracle_mode="greedy", no_preprocess=True
python convert_to_extractive.py ../datasets/billsum_extractive --split_names test --add_target_to test --no_preprocess
  3. oracle_mode="combination", no_preprocess=False/None
python convert_to_extractive.py ../datasets/billsum_extractive --split_names test --add_target_to test --oracle_mode combination
  4. oracle_mode="combination", no_preprocess=True
python convert_to_extractive.py ../datasets/billsum_extractive --split_names test --add_target_to test --oracle_mode combination --no_preprocess
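
If it helps, the four runs above could be generated instead of typed by hand. This is only a sketch: it reuses the script path, dataset path, and flags exactly as written in the commands above, while the helper name `build_commands` is hypothetical. It prints the commands rather than running them.

```python
from itertools import product

# Common prefix shared by all four commands listed above.
BASE = [
    "python", "convert_to_extractive.py", "../datasets/billsum_extractive",
    "--split_names", "test", "--add_target_to", "test",
]


def build_commands():
    """Build one command per (oracle_mode, no_preprocess) combination."""
    commands = []
    for oracle_mode, no_preprocess in product(["greedy", "combination"], [False, True]):
        cmd = list(BASE)
        # "greedy" is the default per the list above, so the flag is omitted for it.
        if oracle_mode != "greedy":
            cmd += ["--oracle_mode", oracle_mode]
        if no_preprocess:
            cmd.append("--no_preprocess")
        commands.append(cmd)
    return commands


for cmd in build_commands():
    print(" ".join(cmd))
    # To actually execute a run: subprocess.run(cmd, check=True)
```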

Solved with #22 and #25