ageitgey/node-unfluff

Grabbing sidebar content

adamrabie opened this issue · 0 comments

I noticed while parsing the URL below that sidebar content sometimes gets pulled into the article content.

http://news.forexlive.com/!/anz-on-gbp-mkts-need-something-fresh-to-trade-off-if-gbp-is-to-go-lower-in-near-term-20170117

I'll be looking into the code here, but if anyone more familiar with it can beat me to it, that would be much appreciated.

If someone is interested in optimizing the content extractor for a set of URLs I commonly parse, I'd be interested in paying a freelance rate. I'm not looking for overfitting, but hoping to improve this repo's general capacity to handle varying content schemas.