readability-scrape is a command-line tool for downloading simplified versions of webpages. Unlike similar tools, it imports Readability.js, the library used in Firefox's reader view, directly from Mozilla's repository, so it stays current with upstream changes.
readability-scrape is available via npm:
npm install -g readability-scrape
Pass in the URL of a webpage to retrieve it as plain text:
readability-scrape https://example.com/path
Use the --html option to get simplified HTML output:
readability-scrape --html https://example.com/path
Use --json to get Readability's full output as JSON:
readability-scrape --json https://example.com/path
The JSON output will contain at least these properties:
uri: original uri object that was passed to the constructor
title: article title
content: HTML string of processed article content
textContent: processed article content as plain text
length: length of article, in characters
excerpt: article description, or short excerpt from content
byline: author metadata
dir: content direction
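The JSON properties listed above can be consumed from a script. The following is a minimal Node.js sketch, not part of readability-scrape itself: summarizeArticle and scrapeSummary are hypothetical helper names, the sample values are made up, and the sketch assumes the tool is on the PATH and writes its JSON to standard output.

```javascript
const { execFileSync } = require("child_process");

// Properties the JSON output is documented to contain.
const DOCUMENTED_KEYS = [
  "uri", "title", "content", "textContent",
  "length", "excerpt", "byline", "dir",
];

// Hypothetical helper: parse the JSON text and return a short summary,
// plus the names of any documented properties that are absent.
function summarizeArticle(jsonText) {
  const article = JSON.parse(jsonText);
  const missing = DOCUMENTED_KEYS.filter((key) => !(key in article));
  return {
    title: article.title,
    byline: article.byline,
    length: article.length,
    excerpt: article.excerpt,
    missing,
  };
}

// Hypothetical helper: run the CLI with --json and summarize the result.
// Assumption: readability-scrape is installed globally and prints JSON to stdout.
function scrapeSummary(url) {
  const stdout = execFileSync("readability-scrape", ["--json", url], {
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024, // allow large articles
  });
  return summarizeArticle(stdout);
}

// Offline sample with made-up values, so the sketch runs without network access.
const sample = JSON.stringify({
  uri: { spec: "https://example.com/path" },
  title: "Example Domain",
  content: "<div><p>Hello world.</p></div>",
  textContent: "Hello world.",
  length: 12,
  excerpt: "Hello world.",
  byline: null,
  dir: null,
});

const sampleSummary = summarizeArticle(sample);
console.log(sampleSummary.title, sampleSummary.length, sampleSummary.missing);
```

To try it against a real page, call scrapeSummary("https://example.com/path") after installing the tool. Checking for missing properties is a defensive choice here, since the exact set of fields depends on the Readability.js version in use.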