Alpha version 0.8.0 does not follow found links recursively
ralphbolliger opened this issue · 3 comments
Describe the bug
This morning I played around with release version 0.7.8 (yarnpkg). Works fine so far. This afternoon I was curious how alpha version 0.8.0 (GitHub) works. Unfortunately I can't get it to scan through a website recursively. It only logs the URL I defined as the starting point to the console (Node stdout).
This is my index.js:
const {SiteChecker} = require('broken-link-checker');

let options = {
        acceptedSchemes: ['http', 'https'],
        honorRobotExclusions: false,
        cacheResponses: false
    },
    customData = null,
    siteUrl = new URL('https://www.example.com');

const siteChecker = new SiteChecker(options)
    .on('error', (error) => {
    })
    .on('robots', (robots, customData) => {
    })
    .on('html', (tree, robots, response, pageURL, customData) => {
        console.log(pageURL.href)
    })
    .on('queue', () => {
    })
    .on('junk', (result, customData) => {
    })
    .on('link', (result, customData) => {
    })
    .on('page', (error, pageURL, customData) => {
    })
    .on('site', (error, siteURL, customData) => {
        console.log(siteURL.href)
    })
    .on('end', () => {
        console.log('Done!')
    });

siteChecker.enqueue(siteUrl, customData);
To Reproduce
- Add broken-link-checker from GitHub via yarn add
- Build it via yarn build in node_modules/broken-link-checker
- Create an index.js in the project root and copy and paste my example mentioned above
- Run node index.js on the command line
Expected behavior
A list of URLs based on the given URL as starting point like this:
https://www.example.com
https://www.example.com/2017/12/08/kalte-winterdaemmerung-am-rheinfall/
https://www.example.com/author/johndoe/
https://www.example.com/2017/11/12/konzert-kammgarn/
https://www.example.com/2017/10/18/portrait-shooting/
https://www.example.com/2017/10/15/wochenendtrip/
https://www.example.com/2017/10/01/zu-besuch/
https://www.example.com/2017/06/29/gewitterfront/
https://www.example.com/2017/06/15/la-belle-paris/
https://www.example.com/2017/03/13/alvaro-soler/
...
Environment:
- macOS 10.15.3 (19D76) Catalina
- Node.js version: v12.16.1
- broken-link-checker version: 0.8.0 (read from package.json)
Change
acceptedSchemes: ['http', 'https'],
to
acceptedSchemes: ['http:', 'https:'],
Perhaps this should be handled in the options parser to simplify the API.
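As a rough sketch of what such handling could look like: the WHATWG URL API reports a protocol with its trailing colon (for example 'https:'), which is presumably why a scheme list without colons never matches. The helper below, normalizeSchemes, is hypothetical and not part of broken-link-checker; it only illustrates accepting both spellings before the options reach the checker.

```javascript
// Hypothetical helper, NOT part of broken-link-checker: accepts schemes
// written with or without the trailing colon and returns the colon form.
function normalizeSchemes(schemes) {
  return schemes.map((scheme) => {
    const lower = String(scheme).trim().toLowerCase();
    return lower.endsWith(':') ? lower : `${lower}:`;
  });
}

// The WHATWG URL API reports the protocol including its trailing colon,
// so 'https' (no colon) would never equal it.
const protocol = new URL('https://www.example.com').protocol; // 'https:'

// Same options as in the report, with the scheme list normalized first.
const options = {
  acceptedSchemes: normalizeSchemes(['http', 'https:']),
  honorRobotExclusions: false,
  cacheResponses: false
};

console.log(options.acceptedSchemes.includes(protocol)); // true
```

Until something like this exists in the options parser, the same helper could be applied in user code before calling new SiteChecker(options).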
One should read the manual carefully… 🤦🏼♂️
Thanks for the hint, now it works as expected.