duzun/hQuery.php

Running with JS enabled

Honest-Objections opened this issue · 1 comments

Guessing I know the answer to this, but the site I was scraping data from recently required JS to be enabled to view the main contents.

Do you know of any way around this?

duzun commented

TL;DR: This library doesn't run JavaScript.

It depends...
Usually, sites that require JS to render their content fetch that content from the back end as JSON, XML, or even fragments of HTML via XHR requests.
It is very likely that you can request the same endpoints directly from PHP and parse the response.
I know, not the answer you expected, but still might help.
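
To make that concrete, here is a rough sketch of calling such an endpoint directly, assuming it returns JSON. The URL, the `items` and `title` keys, and the function names `fetchXhr()` and `extractTitles()` are all made up for illustration; the real endpoint and payload shape have to be looked up in the browser's Network tab (XHR/Fetch filter). Some back ends also check headers or cookies, so this may need adjusting.

```php
<?php
// Hypothetical endpoint: replace with the request the page actually makes.
const ENDPOINT = 'https://example.com/api/items?page=1';

/** Fetch the raw response body, sending headers similar to the page's own request. */
function fetchXhr(string $url): string
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'header'  => "Accept: application/json\r\nX-Requested-With: XMLHttpRequest\r\n",
            'timeout' => 10,
        ],
    ]);
    $body = file_get_contents($url, false, $context);
    if ($body === false) {
        throw new RuntimeException("Request failed: $url");
    }
    return $body;
}

/** Decode a JSON payload and return the titles; the "items"/"title" keys are assumed. */
function extractTitles(string $json): array
{
    // JSON_THROW_ON_ERROR needs PHP 7.3 or newer.
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    return array_column($data['items'] ?? [], 'title');
}
```

Usage would be along the lines of `$titles = extractTitles(fetchXhr(ENDPOINT));`. If the endpoint returns an HTML fragment instead of JSON, the body can be parsed as HTML rather than decoded.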

If you need the JS executed anyway, I think there is no PHP library for the task, because you need a browser (HTML engine + CSS engine + JS engine).
Try PhantomJS or Chrome in headless mode with some JS API, and maybe send the result to PHP over HTTP.
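
As a simpler variation on the headless-browser idea, PHP can also shell out to Chrome and read the rendered DOM from its standard output, without a separate JS script. This is only a sketch under several assumptions: the binary name `google-chrome` varies by system, the `2>/dev/null` redirect is Unix-only, `shell_exec()` must be enabled, and the helper names are hypothetical. `--dump-dom` prints the DOM once the page has loaded, so content that arrives much later may still be missing.

```php
<?php
/** Build the command line; kept separate so it can be inspected without running Chrome. */
function buildDumpDomCommand(string $url, string $binary = 'google-chrome'): string
{
    // Both the binary path and the URL are quoted for the shell.
    return escapeshellarg($binary)
        . ' --headless --disable-gpu --dump-dom '
        . escapeshellarg($url);
}

/** Run headless Chrome and return the rendered HTML as a string. */
function renderWithChrome(string $url, string $binary = 'google-chrome'): string
{
    // Discard Chrome's log output on stderr (Unix-only redirect).
    $html = shell_exec(buildDumpDomCommand($url, $binary) . ' 2>/dev/null');
    if (!is_string($html) || $html === '') {
        throw new RuntimeException("Chrome produced no output for $url");
    }
    return $html;
}
```

The string returned by `renderWithChrome()` is plain HTML, so it can be handed to an HTML parser the same way as any downloaded page.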

Maybe someone will come up with a solution to this issue by building a (virtual) browser for PHP. We already have an HTML engine (kind of) and a JS engine, which in theory should be enough for scraping.