ffalt/pdf.js-extract

normalizeWhitespace parameter for getTextContent() option

alexschwantes opened this issue · 2 comments

Can we expand the options so that getTextContent() can run with the normalizeWhitespace parameter?

return page.getTextContent().then(function (content) {

so it would look like:

const normalizeWhitespaceParam = options && options.normalizeWhitespace === true ? true : false;
return page.getTextContent({ normalizeWhitespace: normalizeWhitespaceParam }).then(function (content) {

source: https://github.com/mozilla/pdf.js/blob/b2e7d0c89b76e228e49c7cee759873322a442f62/src/display/api.js#L779
thanks

ffalt commented

Done in v0.1.0.
I had no idea such options existed, thanks for bringing them to my attention.
Also included disableCombineTextItems (both are now documented in the README).
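
For readers following along, here is a minimal sketch of how the two options might be forwarded to getTextContent(). It is not the actual pdf.js-extract code: the helper names buildTextContentParams and extractPageText are hypothetical, and the sketch assumes a pdf.js version whose getTextContent() still accepts normalizeWhitespace and disableCombineTextItems.

```javascript
// Hypothetical helper (not part of pdf.js or pdf.js-extract):
// maps user-supplied options to getTextContent() parameters.
// Only a strict `true` enables a flag; anything else yields false.
function buildTextContentParams(options) {
  return {
    normalizeWhitespace: Boolean(options && options.normalizeWhitespace === true),
    disableCombineTextItems: Boolean(options && options.disableCombineTextItems === true),
  };
}

// Hypothetical wrapper: `page` is assumed to be a pdf.js page object
// exposing getTextContent(params), which resolves to { items: [{ str }, ...] }.
function extractPageText(page, options) {
  return page.getTextContent(buildTextContentParams(options)).then(function (content) {
    // Collect the raw string of every text item on the page.
    return content.items.map(function (item) {
      return item.str;
    });
  });
}
```

A caller would then opt in per extraction, for example extractPageText(page, { normalizeWhitespace: true }), and omitting the options object leaves both flags off.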

awesome, thanks!