URL is not accepted.
skiwheelr commented
When I run the crawl command on a seed file that contains a public Google Drive URL, I get the following error:
Phonecian:PaperChaser skiwheelr$ node paperchaser.js crawl seed-file.txt
node:internal/process/promises:246
triggerUncaughtException(err, true /* fromPromise */);
^
TypeError [ERR_INVALID_URL]: Invalid URL
at new NodeError (node:internal/errors:363:5)
at onParseError (node:internal/url:536:9)
at new URL (node:internal/url:612:5)
at /Users/skiwheelr/PaperChaser/libs/parsing.js:135:28
at Array.map (<anonymous>)
at Object.get_ids_from_urls (/Users/skiwheelr/PaperChaser/libs/parsing.js:124:37)
at Command.<anonymous> (/Users/skiwheelr/PaperChaser/paperchaser.js:52:34)
at Command.listener [as _actionHandler] (/Users/skiwheelr/PaperChaser/node_modules/commander/lib/command.js:473:17)
at /Users/skiwheelr/PaperChaser/node_modules/commander/lib/command.js:1173:65
at Command._chainOrCall (/Users/skiwheelr/PaperChaser/node_modules/commander/lib/command.js:1091:12) {
input: '',
code: 'ERR_INVALID_URL'
}
If I replace the URL with arbitrary text (e.g. boogieman), the error does report that string as its input:
input: 'boogieman',
code: 'ERR_INVALID_URL'
}
The input is reported as an empty string only when the seed is a Google Drive link.
Is there a structure the seed file should follow, or is it just raw URLs separated by newlines?
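In case it helps narrow this down, here is a minimal sketch of how the seed lines could be checked before they reach the parser. It assumes the seed file is plain text with one URL per line, which is exactly the part I am unsure about. `checkSeedLines` is a hypothetical helper and not part of PaperChaser; I have not looked at what libs/parsing.js does with each line before calling `new URL`. The sketch only shows that an empty string, like text without a scheme, makes `new URL` throw, which matches the `input: ''` in the trace above.

```javascript
// Hypothetical helper (not part of PaperChaser): classify each line of a seed file.
// Assumption: the seed file is plain text with one URL per line.
function checkSeedLines(text) {
  return text.split(/\r?\n/).map((raw, index) => {
    const value = raw.trim();
    if (value === '') {
      // new URL('') throws, so flag blank lines explicitly.
      return { line: index + 1, value, ok: false, reason: 'empty line' };
    }
    try {
      new URL(value); // WHATWG URL parser; throws a TypeError on invalid input
      return { line: index + 1, value, ok: true, reason: null };
    } catch (err) {
      return { line: index + 1, value, ok: false, reason: err.code || err.message };
    }
  });
}

// Example input: one absolute URL, one blank line, one string without a scheme.
const sample = 'https://example.com/some/file\n\nboogieman';
console.log(checkSeedLines(sample));
```

If every line of the seed file passes this check but the crawl still fails with an empty input, the string is probably being emptied somewhere between reading the file and the `new URL` call, rather than in the seed file itself.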