m-haisham/novelsave

Missing paragraph

Opened this issue · 3 comments

Hi! There seems to be a paragraph missing from the epub I made from the https://chrysanthemumgarden.com/ site.

from the site

20210505_165611.jpg

from the epub I made

20210505_165553.jpg

Could you post the link to the chapter?

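The bug was caused by the blacklisted pattern ^[\W\D]*(volume|chapter)[\W\D]+\d+[\W\D]*$ matching the paragraph. The class [\W\D] matches every character except a digit, so the pattern accepts any text whose only digits form a single run somewhere after "volume" or "chapter".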

>>> import re
>>> re.match(r'^[\W\D]*(volume|chapter)[\W\D]+\d+[\W\D]*$', 'He clicked to sort the listings by the highest sale volume, and all of them were cheap goods under 30 yuan. The store had countless poor reviews—after all, you get what you paid for—and these were all just bought for the purpose of video chatting with family members, and etc. No matter how poor the quality was, as long as the person could be seen, it was fine.')

<re.Match object; span=(0, 362), match='He clicked to sort the listings by the highest sa>
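To make the cause easier to see, here is a minimal sketch comparing the pattern from this thread with a stricter variant. The names `STRICTER` and `is_blacklisted` are hypothetical, and the stricter pattern is only an assumption about what a fix could look like, not necessarily the change made in novelsave. The prose sample is shortened to the first sentence of the paragraph above.

```python
import re

# Pattern quoted in this thread. Every digit is also a word character,
# so [\W\D] excludes only digits and behaves like \D.
ORIGINAL = re.compile(r'^[\W\D]*(volume|chapter)[\W\D]+\d+[\W\D]*$')

# Hypothetical stricter variant (an assumption, not the project's fix):
# only non-word characters may surround the keyword and the number.
STRICTER = re.compile(r'^\W*(volume|chapter)\W+\d+\W*$')


def is_blacklisted(pattern, text):
    """Return True if the pattern would drop this paragraph."""
    return pattern.match(text) is not None


heading = 'chapter 12'
prose = ('He clicked to sort the listings by the highest sale volume, '
         'and all of them were cheap goods under 30 yuan.')

# [\W\D] accepts letters, spaces and punctuation, but not digits.
accepted = [ch for ch in 'a ,7' if re.fullmatch(r'[\W\D]', ch)]

print(accepted)
print(is_blacklisted(ORIGINAL, heading), is_blacklisted(ORIGINAL, prose))
print(is_blacklisted(STRICTER, heading), is_blacklisted(STRICTER, prose))
```

With the original pattern both the heading and the prose sentence are treated as blacklisted; with the stricter variant only the heading is, because ordinary words are no longer allowed around the keyword.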

To say it was unexpected would be an understatement. Good job catching it.