Maximum Sentence Number in the Output
StevenLau6 opened this issue · 0 comments
StevenLau6 commented
Thank you for sharing the code.
I tested the extractive setting on a different summarization dataset and found that at most 3 sentences are output for each sample. That may meet the requirements of the CNN/DM dataset, but it may not suit other datasets, where the target summaries can be longer than 3 sentences.
So I suggest modifying the code at trainer_ext.py#L275 to use the hyper-parameter self.args.max_tgt_len to control the length of the output sequence, instead of a fixed limit of 3 sentences.
PreSumm/src/models/trainer_ext.py
Line 275 in ce8dc01
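
To illustrate the suggestion, here is a minimal, self-contained sketch of selecting sentences with a configurable cap instead of a fixed 3. It is not the actual PreSumm code: the function name select_top_sentences and the parameter max_sentences are hypothetical, and max_sentences stands in for whichever hyper-parameter (for example self.args.max_tgt_len, as proposed above) ends up controlling the limit.

```python
from typing import List, Sequence


def select_top_sentences(
    sentences: Sequence[str],
    scores: Sequence[float],
    max_sentences: int = 3,  # hypothetical cap; the issue proposes taking it from args
) -> List[str]:
    """Return up to max_sentences sentences, highest score first."""
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")

    # Rank sentence indices by model score, best first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    selected: List[str] = []
    for idx in ranked:
        candidate = sentences[idx].strip()
        if not candidate:
            continue  # skip empty sentences
        selected.append(candidate)
        # Stop at the configurable cap rather than a hard-coded 3.
        if len(selected) == max_sentences:
            break
    return selected
```

With the default cap this keeps the current behaviour of at most 3 sentences, while a dataset with longer reference summaries could pass a larger value, e.g. select_top_sentences(sents, scores, max_sentences=6).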