Intelligently scrape articles, images, and videos from a given link.

Primary language: JavaScript

article-grabber

A Node.js module to retrieve article content and metadata from a URL.

Extracting data

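var extractor = require('article-grabber');

// Node-style callback: err is set on failure, data holds the extracted article.
extractor.extractData('http://somesite.com/apage.html', function (err, data) {
  if (err) {
    console.error(err);
    return;
  }
  console.log(data);
});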
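
If you prefer Promises over callbacks, the call above can be wrapped with Node's built-in util.promisify. This is only a sketch: it assumes extractData follows the usual Node convention of calling back with (err, data), as the example above suggests. The names extractDataAsync and fakeExtractor are hypothetical and not part of article-grabber; the stand-in object is there so the snippet runs without network access.

```javascript
var util = require('util');

// Wrap a Node-style extractData(url, callback) function in a Promise.
// `extractor` is any object exposing extractData, e.g. require('article-grabber').
// Assumption: the callback signature is (err, data).
function extractDataAsync(extractor, url) {
  var promisified = util.promisify(extractor.extractData.bind(extractor));
  return promisified(url);
}

// Hypothetical stand-in for the real module so this sketch runs offline.
var fakeExtractor = {
  extractData: function (url, callback) {
    if (!url) {
      callback(new Error('missing URL'));
      return;
    }
    callback(null, { domain: 'somesite.com', title: 'An article title' });
  }
};

extractDataAsync(fakeExtractor, 'http://somesite.com/apage.html')
  .then(function (data) { console.log(data.title); })
  .catch(function (err) { console.error(err.message); });
```

With the real module, pass require('article-grabber') in place of fakeExtractor.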

Extract result

The result looks like this:

{
    "domain": "somesite.com",
    "author": "John Doe",
    "title": "An article title",
    "summary": "Lorem ipsum dolor sit amet, consectetur adipisicing elit. Qui, maxime?",
    "content": "<h1>Lorem ipsum dolor sit.</h1><p>Lorem ipsum dolor sit amet, consectetur adipisicing elit. Tempore, possimus.</p><h2>Lorem ipsum dolor sit amet.</h2><p>Lorem ipsum dolor sit amet, consectetur adipisicing elit. <img src=\"image.png\" alt=\"\"> Pariatur dolor deleniti esse repellendus accusamus ducimus aut molestias optio obcaecati similique.</p>"
}
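
Because content is returned as an HTML string, you will often want to post-process it. The sketch below shows one rough way to list image sources and derive a plain-text version from a result shaped like the one above. The helpers listImageSources and toPlainText are hypothetical and not provided by article-grabber, and the regular expressions are a simplification: for untrusted or complex markup, a real HTML parser is the safer choice.

```javascript
// Collect the src attribute of every <img> tag in an HTML string.
// Regex-based and deliberately simple; it only handles quoted src values.
function listImageSources(html) {
  var pattern = /<img\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
  var sources = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    sources.push(match[1]);
  }
  return sources;
}

// Strip tags and collapse whitespace to get a rough plain-text version.
function toPlainText(html) {
  return html.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
}

// Sample data in the same shape as the result shown above.
var data = {
  title: 'An article title',
  content: '<h1>Lorem ipsum</h1><p>Dolor sit <img src="image.png" alt=""> amet.</p>'
};

console.log(listImageSources(data.content)); // ['image.png']
console.log(toPlainText(data.content));      // Lorem ipsum Dolor sit amet.
```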