abisee/cnn-dailymail
Code to obtain the CNN / Daily Mail dataset (non-anonymized) for summarization
Python · MIT
Issues
- 0
Making a Telugu dataset
#42 opened by Charanchekuri - 2
Note about chunking data
#3 opened by abisee - 8
Error: Could not find or load main class edu.stanford.nlp.process.PTBTokenizer
#28 opened by TianlinZhang668 - 0
Issue about the input text
#37 opened by senchfu - 0
run_summarization.py shows TypeError: unsupported operand type(s) for *: 'int' and 'Flag'
#36 opened by yamonc - 2
Making Dataset for Bengali Language
#31 opened by PrithwirajRizu - 0
URL_list
#35 opened by lipanpanpanpan - 2
How to read .story files in python
#23 opened by VikasNS - 0
make_datafiles.py issue
#34 opened by zixiliuUSC - 11
error while running make_datafile.py
#16 opened by 97yogitha - 0
The size of train.bin in FINISHED_FILES
#33 opened by liang8qi - 0
how to get src/trg part of data
#32 opened - 1
byte error while reproducing the processed data
#30 opened by Wisharyco - 2
New Test
#29 opened by quanghuynguyen1902 - 3
Errors while preparing the dataset
#7 opened by LeenaShekhar - 8
About the txt version
#11 opened by WangLilian - 2
Titles of articles
#21 opened by aburkov - 0
is it really Abstractive approach?
#27 opened by the-black-knight-01 - 1
License?
#22 opened by Shujian2015 - 1
What are these characters in the bin file?
#20 opened by JunjieCheng - 4
Stanford CoreNLP 3.8 not compatible
#5 opened by bwang482 - 1
How to generate dataset for our own article?
#18 opened by Sharathnasa - 4
How to generate the anonymized version?
#19 opened by Oscar860601 - 1
Incorrectly formatted line in vocabulary file
#15 opened by bwang482 - 7
Could you make a branch for python3?
#13 opened by becxer - 2
Naming convention in tokenized dir
#12 opened by ibarrien - 0