All my code for the IR class CMP646.
Running parse.py on the medium set gives token count => 193981880. Using cat / | wc -w gives the same token count => 193981880.
To run it, copy parse.py into the top-most directory, so that all the files to be counted sit in its sub-directories.
for /rahul_extra/books-medium/FCE3DE743CEC29EC/annalsofmusicina00laheuoft_ocrml.xml token count => 193677928
for /rahul_extra/books-medium/F40282EF2829178C/biologyintroduct00connrich_ocrml.xml token count => 193816296
for /rahul_extra/books-medium/66E683B4EBBF3C70/broadbroadoceans00jonerich_ocrml.xml token count => 193981880
the final token count is 193981880
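
The output above shows a running total that grows file by file. The contents of parse.py are not reproduced here, so the following is only a minimal sketch of how such a count could be computed, not the actual script. The function names count_tokens and count_tree and the ".xml" suffix filter are assumptions made for illustration. A token is taken to be any whitespace-separated run of bytes, which is the same definition wc -w uses.

```python
import os


def count_tokens(path):
    """Count whitespace-separated tokens in one file, like `wc -w`."""
    total = 0
    # Read bytes so that odd encodings in the OCR output cannot raise errors.
    with open(path, "rb") as f:
        for line in f:
            # bytes.split() with no argument splits on runs of ASCII whitespace.
            total += len(line.split())
    return total


def count_tree(root, suffix=".xml"):
    """Walk every sub-directory of root and print a running token total.

    The suffix filter is an assumption: it keeps the script itself and
    other non-corpus files out of the count.
    """
    running = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit sub-directories in a stable order
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            running += count_tokens(path)
            print(f"for {path} token count => {running}")
    print(f"the final token count is {running}")
    return running
```

Calling count_tree(".") from the top-most directory would print one line per file followed by the final total. If the directory holds files that do not end in ".xml", the result can differ from a plain wc -w over everything, so the filter should be adjusted to match what is being compared.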