This is the second part of the detection project. Some files could not be pushed because GitHub does not recognize them.
Here I show the collaboration with my partner Rodrigo Piñon, in which we find ads on different kinds of websites and in different languages (Spanish, English, French, Italian and Portuguese).
We use the following Python libraries: Selenium, Beautiful Soup (BS4), pandas, OpenCV and Tesseract OCR.
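
As a rough illustration of the multilingual side of the task, the sketch below flags text that contains an ad label in any of the five languages. It is not the project's actual code: the keyword lists and the names `AD_LABELS`, `normalize` and `matched_languages` are hypothetical, and it uses only the standard library. It assumes the input text has already been extracted, for example from a page element with Beautiful Soup or from a screenshot with Tesseract.

```python
import unicodedata

# Hypothetical ad-label keywords per language (illustrative, not from the project).
# Keywords are stored lowercase and without accents.
AD_LABELS = {
    "spanish": ["publicidad", "anuncio", "patrocinado"],
    "english": ["advertisement", "sponsored"],
    "french": ["publicite", "sponsorise"],
    "italian": ["pubblicita", "sponsorizzato"],
    "portuguese": ["publicidade", "patrocinado"],
}


def normalize(text):
    """Lowercase the text and strip accents, so 'Publicité' becomes 'publicite'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matched_languages(text):
    """Return the sorted list of languages whose ad keywords appear in the text."""
    clean = normalize(text)
    return sorted(
        lang
        for lang, words in AD_LABELS.items()
        if any(word in clean for word in words)
    )


if __name__ == "__main__":
    for sample in ["Publicité", "Contenido patrocinado", "Latest news"]:
        print(sample, "->", matched_languages(sample))
```

Simple substring matching like this can report more than one language when a keyword is shared (for example "patrocinado" in Spanish and Portuguese), so it is better suited to flagging candidate ads than to identifying the page language.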