significant portion of content missed by readability
robh71 opened this issue · 1 comment
robh71 commented
We are processing the text from https://www.fiolinjurylaw.com/ using readability, and much of the content is missing.
I've attached the readability output as generated by:
$ python -m readability.readability -u https://www.fiolinjurylaw.com/
and the html output downloaded with curl:
$ curl https://www.fiolinjurylaw.com/
fiolinjurylaw.curl.html.txt
fiolinjurylaw.readability.html.txt
Thanks
SkyloveQiu commented
I am not sure if it's too late, but you can try positive words.
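
A minimal sketch of what that might look like, assuming "positive words" refers to readability's positive keywords option (`-p` / `--positive-keywords` on the command line, which takes a comma-separated list matched against element `class` and `id` values). The helper name `readability_command` and the keywords `content` and `main` are hypothetical and used only for illustration; the real keywords would have to come from the class or id names in the page's own markup.

```python
import sys


def readability_command(url, positive_words):
    """Build the readability CLI invocation with positive keywords.

    positive_words is a list of class/id fragments that should raise the
    score of matching elements; the CLI expects them comma-separated.
    """
    return [
        sys.executable,
        "-m",
        "readability.readability",
        "-u",
        url,
        "-p",
        ",".join(positive_words),
    ]


# Hypothetical keywords: replace with names taken from the page's HTML.
cmd = readability_command("https://www.fiolinjurylaw.com/", ["content", "main"])
print(" ".join(cmd[1:]))
```

Passing `cmd` to `subprocess.run` should run the same extraction as the original command, with the extra keywords applied. Whether this recovers the missing content depends on how the page is marked up, so it may take a few attempts with different keywords.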