pgcorpus/gutenberg

Missing newline at the end of counts files

Closed · 1 comment

Apparently all counts files are missing the final newline character.

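This causes problems, e.g. when concatenating counts files for several books: the last line of one file runs into the first line of the next, so the final count of one book is fused with the first word of the following one (a count of 1 followed by "the" becomes "1the").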

fontclos@fontclos-Dell-PT-3620:~/work/ongoing/gutenberg$ cat data/counts/* | grep 1the | head
disagree	1the	272
menander	1the	3650
trailed	1the	4052
adaptability	1the	3302
accomplishment	1the	4765
pestalozzian	1the	635
fruitless	1the	1411
stones	1the	4642
faintly	1the	6775
refer	1the	678
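Until the files themselves are fixed, one possible workaround is to concatenate them in a way that adds the missing newline where needed. The following is only a minimal sketch: the function name `concat_counts` is hypothetical and not part of the repository, and it assumes the counts files are UTF-8 text.

```python
from pathlib import Path


def concat_counts(paths):
    """Concatenate counts files, adding a final newline where one is missing.

    Hypothetical helper, not part of the pgcorpus/gutenberg code base.
    """
    chunks = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        # Only non-empty files that lack a trailing newline need patching.
        if text and not text.endswith("\n"):
            text += "\n"
        chunks.append(text)
    return "".join(chunks)
```

It could be called as, for example, `concat_counts(sorted(Path("data/counts").iterdir()))`; with this, the last line of one book no longer merges with the first line of the next.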

Solved by #8

(gutenberg) science@science-fontclos:~/gutenberg$ cat data/counts/* | grep 1the
(gutenberg) science@science-fontclos:~/gutenberg$
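The `grep 1the` check above relies on one particular fused token showing up. A more direct check is to look at the last byte of each file. This is a sketch under the assumption that the counts files sit directly in a single directory; `files_missing_final_newline` is a hypothetical name, not a function from the repository.

```python
from pathlib import Path


def files_missing_final_newline(directory):
    """Return the names of non-empty files that do not end with a newline."""
    missing = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        # Empty files are skipped: they have no last line to terminate.
        if data and not data.endswith(b"\n"):
            missing.append(path.name)
    return missing
```

An empty list from `files_missing_final_newline("data/counts")` would indicate that every counts file is properly terminated.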