abisee/cnn-dailymail

Code to obtain the CNN / Daily Mail dataset (non-anonymized) for summarization

Python · MIT

Issues

Making a Telugu dataset
#42 opened 8 months ago by Charanchekuri
0
Note about chunking data
#3 opened 7 years ago by abisee
2
Error: Could not find or load main class edu.stanford.nlp.process.PTBTokenizer
#28 opened 6 years ago by TianlinZhang668
8
Issue about the input text
#37 opened 4 years ago by senchfu
0
run_summarization.py shows TypeError: unsupported operand type(s) for *: 'int' and 'Flag'
#36 opened 4 years ago by yamonc
0
Making Dataset for Bengali Language
#31 opened 5 years ago by PrithwirajRizu
2
URL_list
#35 opened 5 years ago by lipanpanpanpan
0
How to read .story files in python
#23 opened 6 years ago by VikasNS
2
make_datafiles.py issue
#34 opened 5 years ago by zixiliuUSC
0
Could you please provide the processed data for cnn/dailymail?
#9 opened 7 years ago by taineleau-zz
11
error while running make_datafile.py
#16 opened 7 years ago by 97yogitha
11
The size of train.bin in FINISHED_FILES
#33 opened 5 years ago by liang8qi
0
how to get src/trg part of data
#32 opened 5 years ago
0
byte error while reproducing the processed data
#30 opened 6 years ago by Wisharyco
1
New Test
#29 opened 6 years ago by quanghuynguyen1902
2
Errors while preparing the dataset
#7 opened 7 years ago by LeenaShekhar
3
About the txt version
#11 opened 7 years ago by WangLilian
8
License?
#26 opened 6 years ago by hyandell
2
Titles of articles
#21 opened 7 years ago by aburkov
2
is it really Abstractive approach?
#27 opened 6 years ago by the-black-knight-01
0
License?
#22 opened 6 years ago by Shujian2015
1
What are these characters in the bin file?
#20 opened 7 years ago by JunjieCheng
1
Stanford CoreNLP 3.8 not compatible
#5 opened 8 years ago by bwang482
4
How to generate dataset for our own article?
#18 opened 7 years ago by Sharathnasa
1
How to generate the anonymized version?
#19 opened 7 years ago by Oscar860601
4
Incorrectly formatted line in vocabulary file
#15 opened 7 years ago by bwang482
1
Could you make a branch for python3?
#13 opened 7 years ago by becxer
7
Naming convention in tokenized dir
#12 opened 7 years ago by ibarrien
2
The file does not work with the Python 3.5 even after conversion.
#10 opened 7 years ago by JafferWilson
0
Error in Tokenizing the CNN and DailyMotion stories
#8 opened 7 years ago by JafferWilson
8
Important fix (untokenized data written to .bin files)
#2 opened 7 years ago by abisee
3
Generated .bin files cannot be used for textsum?
#6 opened 8 years ago by bwang482
0
error writing .bin files after tokenizing
#1 opened 8 years ago by prokopevaleksey
4