A simple tool that:
- Attaches to a running Chrome instance
- Navigates to a given URL
- Injects Readability.js
- Waits until a website loads
- Runs Readability and returns an extracted article
- Copies the extracted article to clipboard
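The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the tool's actual source. It assumes the `chrome-remote-interface` npm package as the protocol client, a copy of Readability.js passed in as a string, and a Readability version whose constructor accepts just a document (older versions also took a URI argument). The helper names `buildExtractExpression` and `extractArticle` are hypothetical. For brevity, injection and extraction happen in a single evaluation after the page has loaded, and the clipboard step is left out.

```javascript
// Build the expression evaluated inside the page: define Readability,
// then parse a clone of the document so the live DOM is left untouched.
function buildExtractExpression(readabilitySource) {
  return '(function () {\n' +
    readabilitySource + '\n' +
    'return new Readability(document.cloneNode(true)).parse();\n' +
    '})()';
}

// Hypothetical driver: attach, navigate, wait for load, run Readability.
async function extractArticle(url, readabilitySource, port) {
  // Required lazily so the helper above works without the package installed.
  const CDP = require('chrome-remote-interface'); // assumed client library
  const client = await CDP({ port: port || 9222 });
  try {
    const { Page, Runtime } = client;
    await Page.enable();
    const loaded = Page.loadEventFired(); // subscribe before navigating
    await Page.navigate({ url: url });
    await loaded;
    const { result } = await Runtime.evaluate({
      expression: buildExtractExpression(readabilitySource),
      returnByValue: true
    });
    return result.value; // article object, or null if nothing was extracted
  } finally {
    await client.close();
  }
}
```

Passing `returnByValue: true` asks Chrome to serialize the parsed article back to Node instead of returning a remote object handle.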
Headless mode makes it much simpler to integrate Chrome in a server environment and brings Chrome's speed and reliability to web automation tools.
Headless Chrome is still under development, so this guide may change.
This tool uses the Chrome Debugging Protocol to control a Chrome instance.
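Once a Chrome instance is running with a remote debugging port, the protocol's HTTP side can be used to check that it is reachable. The following is a small sketch, assuming the port 9222 used elsewhere in this guide. The helper names `debuggerUrl` and `listTargets` are hypothetical. `/json/list` is one of Chrome's HTTP debugging endpoints and returns the open targets as JSON.

```javascript
const http = require('http');

// Build the URL of one of Chrome's HTTP debugging endpoints.
// 127.0.0.1 is used instead of localhost to avoid IPv6 resolution surprises.
function debuggerUrl(port, path) {
  return 'http://127.0.0.1:' + port + path;
}

// Fetch and parse the JSON list of debuggable targets (tabs).
function listTargets(port, callback) {
  http.get(debuggerUrl(port, '/json/list'), function (res) {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      let targets;
      try {
        targets = JSON.parse(body);
      } catch (err) {
        return callback(err);
      }
      callback(null, targets);
    });
  }).on('error', callback);
}

// Usage: print the URL of every open target.
// listTargets(9222, function (err, targets) {
//   if (err) { return console.error('Chrome is not reachable:', err.message); }
//   targets.forEach(function (t) { console.log(t.url); });
// });
```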
Install Chrome from the development channel (until headless mode is available in a stable Chrome version):
https://www.google.com/chrome/browser/?platform=linux&extra=devchannel
This tool does not start headless Chrome automatically, because it is more reliable to run and manage the Chrome process separately. Start it with:
google-chrome-unstable --headless --user-data-dir=/tmp/chrome-new-data --remote-debugging-port=9222
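If you prefer to manage the Chrome process from a separate Node script rather than a shell, a launcher could look like the sketch below. It reuses the flags from the command above. The function names `buildChromeArgs` and `launchChrome` are hypothetical, and the binary name `google-chrome-unstable` is assumed to be on your `PATH`.

```javascript
const { spawn } = require('child_process');

// Same flags as the shell command, with port and profile directory as parameters.
function buildChromeArgs(port, userDataDir) {
  return [
    '--headless',
    '--user-data-dir=' + userDataDir,
    '--remote-debugging-port=' + port
  ];
}

// Start Chrome as a child process and report when it fails or exits.
function launchChrome(port, userDataDir) {
  const args = buildChromeArgs(port, userDataDir);
  const chrome = spawn('google-chrome-unstable', args, { stdio: 'inherit' });
  chrome.on('error', function (err) {
    console.error('Could not start Chrome:', err.message);
  });
  chrome.on('exit', function (code) {
    console.log('Chrome exited with code', code);
  });
  return chrome;
}

// Usage:
// const chrome = launchChrome(9222, '/tmp/chrome-new-data');
// ...later: chrome.kill();
```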
git clone https://github.com/chiraag-kakar/read_from_chrome
var chromeReadability = require('./chrome-readability');
var url = 'https://hackernoon.com/13-major-challenges-faced-while-building-a-community-experts-roundup-e3a59aa28fbf';
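chromeReadability.extract(url, function(err, article) {
  // Handle failures (for example, Chrome not running) before using the article.
  if (err) {
    console.error(err);
    return;
  }
  console.log(article.content);
});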
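Because `extract` takes an error-first callback, it can also be wrapped in a promise with Node's `util.promisify`. This is a sketch that assumes `extract` calls back exactly once with `(err, article)`, as in the example above. The wrapper name `promisifyExtract` is hypothetical.

```javascript
const util = require('util');

// Wrap the error-first extract() of a module object in a promise-returning
// function. bind() keeps `this` intact in case extract() relies on it.
function promisifyExtract(chromeReadability) {
  return util.promisify(chromeReadability.extract.bind(chromeReadability));
}

// Usage (assumes ./chrome-readability from this repository):
// const extractAsync = promisifyExtract(require('./chrome-readability'));
// extractAsync(url)
//   .then(function (article) { console.log(article.content); })
//   .catch(function (err) { console.error(err); });
```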
Note: It will only work for DOM objects.
navigator.clipboard.writeText(textData.editor)
To make it work for all objects, we can refer to this module.
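`navigator.clipboard.writeText` expects a string, so one possible way to handle other values is to convert them to text first. The sketch below is only an illustration of that idea and is not taken from the module referenced above. The helper names `toClipboardText` and `copyToClipboard` are hypothetical.

```javascript
// Convert an arbitrary value to text suitable for writeText().
function toClipboardText(value) {
  if (typeof value === 'string') {
    return value;
  }
  if (value === undefined || value === null) {
    return '';
  }
  if (typeof value === 'object') {
    // Plain objects and arrays; this throws on circular structures.
    return JSON.stringify(value, null, 2);
  }
  return String(value); // numbers, booleans, and so on
}

// Write the text if the asynchronous Clipboard API is available
// (a browser page in a secure context).
function copyToClipboard(value) {
  if (typeof navigator === 'undefined' || !navigator.clipboard) {
    return Promise.reject(new Error('Clipboard API is not available here'));
  }
  return navigator.clipboard.writeText(toClipboardText(value));
}
```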
- Unlike normal Chrome, headless mode currently supports only one tab per instance. To run this script in parallel, start additional instances on different ports.
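One way to plan such a parallel setup is to pick a port per instance and give each instance its own queue of URLs. The sketch below only computes that plan. The helper names `instancePorts` and `assignUrls` are hypothetical, and how a port is passed to `chrome-readability` is not covered by this guide.

```javascript
// List n consecutive debugging ports starting at basePort.
function instancePorts(basePort, n) {
  const ports = [];
  for (let i = 0; i < n; i++) {
    ports.push(basePort + i);
  }
  return ports;
}

// Assign URLs to ports round-robin, so that each instance works through
// its own queue one page at a time.
function assignUrls(urls, ports) {
  const queues = {};
  ports.forEach(function (port) { queues[port] = []; });
  urls.forEach(function (url, i) {
    queues[ports[i % ports.length]].push(url);
  });
  return queues;
}

// Usage:
// const ports = instancePorts(9222, 3); // one Chrome instance per port
// const plan = assignUrls(['https://a.example', 'https://b.example'], ports);
```

Each instance would be started with its own `--remote-debugging-port`, and most likely its own `--user-data-dir` as well, so the instances do not share a profile.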