tasos-py/Search-Engines-Scraper

Improve Yahoo title parsing

soxoj opened this issue · 1 comments

soxoj commented

I noticed that the title of a Yahoo result is extracted incorrectly:

URL: https://gist.github.com/soxoj/9d65c2f4d3bec5dd25949197ea73cf3a
Title: gist.github.com › soxoj › 9d65c2f4d3bec5dd25949197eamaigret.ipynb · GitHub

The title should be: maigret.ipynb · GitHub

I made some fixes for my other project here

Nice catch! I fixed it with bs4's .decompose(), but I'll keep this open in case there is more work to be done.
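
For anyone reading along, here is a rough sketch of how a .decompose()-based fix could work. It is not the actual commit: the sample markup, the "breadcrumb" class name, and the extract_title helper are all assumptions made up for illustration, and Yahoo's real result HTML may be structured differently. The idea is that the breadcrumb sits in a nested span inside the link, so the link's text comes out as the breadcrumb glued to the title unless that span is removed first.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real Yahoo result HTML may differ.
SAMPLE_HTML = """
<div class="result">
  <a href="https://gist.github.com/soxoj/9d65c2f4d3bec5dd25949197ea73cf3a">
    <span class="breadcrumb">gist.github.com › soxoj › 9d65c2f4d3bec5dd25949197ea</span>maigret.ipynb · GitHub
  </a>
</div>
"""


def extract_title(link_tag):
    """Hypothetical helper: return the link text without nested <span> content."""
    for span in link_tag.find_all("span"):
        # decompose() removes the tag and everything inside it from the tree.
        span.decompose()
    return link_tag.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
link = soup.find("a")

# Read the text before cleaning, because extract_title() mutates the tree.
raw_title = link.get_text(strip=True)
clean_title = extract_title(link)

print(raw_title)
print(clean_title)
```

Note that decompose() changes the parsed tree in place, so anything else that needs the breadcrumb text has to read it before the title is extracted.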