Does not parse the page vk.com
Vponed opened this issue · 1 comment
Vponed commented
import requests

raw_html = requests.get('https://vk.com/neurosciencenews').text
results = Extractor().extract(raw_html)
It returns almost nothing. Why could that be? It works great with other sites.
Also, I would like to know more about working with the extractor. I am curious whether it is possible to get from it not only the data, but also the way in which it extracted them.
theblackcat102 commented
My guess is that this page is generated client-side: the content is loaded by JavaScript after the initial page load. Using requests only returns a nearly empty page (the content is not loaded yet). You might need to render the page first and try again.
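To check this guess before reaching for a browser, here is a rough sketch that measures how much visible text the raw HTML actually contains. The names `visible_text`, `looks_client_rendered`, `fetch_rendered_html` and the 200-character threshold are my own assumptions for illustration, not part of this library. The rendering helper assumes Selenium and a Chrome driver are installed; any other headless browser should work just as well.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text that is not inside <script>, <style>, or <noscript>."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._depth = 0   # number of skipped tags we are currently inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text a reader would see without running any JavaScript."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_client_rendered(html, min_chars=200):
    """Heuristic only: very little visible text suggests JS fills in the content.

    min_chars is an arbitrary threshold chosen for this sketch.
    """
    return len(visible_text(html)) < min_chars


def fetch_rendered_html(url):
    """Load the page in a real browser and return the DOM after scripts ran.

    Assumes the third-party selenium package and a Chrome driver are available.
    """
    from selenium import webdriver  # imported lazily so the checks above work without it

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()
```

If `looks_client_rendered(raw_html)` is true for the vk.com response, passing the output of `fetch_rendered_html(...)` to the extractor instead of `requests.get(...).text` may help. Some pages keep loading content after `driver.get` returns, so an explicit wait might still be needed.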
You can view these two files to understand how the extraction works