/act-probe-page-resources

Apify act to load web pages and analyze HTTP resources they request

Primary LanguageJavaScriptApache License 2.0Apache-2.0

act-probe-page-resources

Apify act that sequentially loads a list of URLs in headless Chrome and analyzes HTTP resources requested by each page.

The act does not store its state, i.e. if it crashes it restarts fetching all URLs. Therefore you should only use it for short lists of URLs.

The implementation uses Chrome Debug Protocol rather than Selenium. See the following resources for more details:

INPUT

Input is a JSON object with the following properties:

{
    // Array of URLs to open by the browser.
    urls: [String],

    // Indicates how to long should the browser wait for pending resources after page has been loaded.
    // By default it is 0.
    waitSecs: Number,

    // If set to true, Chrome prints verbose log.
    // By default false.
    verboseLog: Boolean,

    // Optional HTTP headers to use for all requests
    headers: Object,
}

OUTPUT

Output is a JSON object such as:

[
  {
    "url": "https://www.apifier.com",
    "requests": {
      "72.1": {
        "url": "https://www.apifier.com/",
        "method": "GET",
        "requestedAt": "2017-08-10T13:40:39.184Z",
        "status": 200,
        "mimeType": "text/html",
        "type": "Document"
      },
      "72.2": {
        "url": "https://cdn.apifier.com/964599760cfc4cc23be6a889c9d825f885cf6af6.css?meteor_css_resource=true",
        "method": "GET",
        "requestedAt": "2017-08-10T13:40:39.233Z",
        "status": 200,
        "mimeType": "text/css",
        "type": "Stylesheet"
      },
      "72.27": {
        "url": "https://widget.intercom.io/widget/kod1r788",
        "method": "GET",
        "requestedAt": "2017-08-10T13:40:40.135Z",
        "redirects": [
          {
            "url": "https://widget.intercom.io/widget/kod1r788",
            "status": 302,
            "location": "https://js.intercomcdn.com/shim.585a31ca.js"
          }
        ],
        "status": 200,
        "mimeType": "application/javascript",
        "type": "Script"
      }
    }
  }
]