nisaacson/pdf-text-extract

problem with crop

Closed this issue · 3 comments

Hey guys, I'm trying your module and I need to crop just part of a page.

I used:

const options = {
  crop: {
    x: 0,
    y: 100,
    w: 335,
    h: 465,
  },
}

extract(filePath, options, function (err, pages) {
  if (err) {
    console.dir(err)
    return
  }
  console.dir(pages)
})

but the crop doesn't work; I receive the text of the entire page.

Am I using crop wrong? Please guide me.

I found the error. It is the following:

index.js exports a function:

function pdfTextExtract (filePath, options, pdfToTextCommand, cb) {
  /* codes here */
  if (typeof (pdfToTextCommand) === 'function') {
    cb = pdfToTextCommand
    pdfToTextCommand = 'pdftotext'
    options = {} // line 19
  }
}

Line 19 overwrites options whenever the callback is passed as the third argument, so I just commented out this line and the crop works.
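To illustrate the fix described above, here is a minimal sketch of argument handling that keeps the caller's options when the callback arrives in the third position. The helper name `normalizeArgs` is hypothetical and this is not the module's actual code; it only assumes the `(filePath, options, pdfToTextCommand, cb)` signature quoted above.

```javascript
// Hypothetical helper (not part of pdf-text-extract): sorts out the
// optional arguments without discarding the options object.
function normalizeArgs (filePath, options, pdfToTextCommand, cb) {
  if (typeof options === 'function') {
    // Called as (filePath, cb): no options were given at all
    cb = options
    options = {}
    pdfToTextCommand = 'pdftotext'
  } else if (typeof pdfToTextCommand === 'function') {
    // Called as (filePath, options, cb): keep the caller's options
    cb = pdfToTextCommand
    pdfToTextCommand = 'pdftotext'
  }
  return {
    filePath: filePath,
    options: options || {},
    pdfToTextCommand: pdfToTextCommand || 'pdftotext',
    cb: cb
  }
}

// Usage: the crop settings survive when the callback is the third argument
const args = normalizeArgs('file.pdf', { crop: { x: 0, y: 100, w: 335, h: 465 } }, function () {})
console.log(args.options.crop.y) // 100
```

The only difference from the excerpt above is that the `options = {}` reset happens only when no options object was passed.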

However, I forked the project to open a PR, but when I clone it I get version 1.3.1 (at least according to package.json).

It's strange: the index.js there is different from release 1.4.1 (the version downloaded by npm).

How can I open the PR correctly?

@darlanmendonca the problem of 1.4.1 missing from master is now fixed. See issue #19. If you are still looking at this, a PR should work as intended.

Hi, I too am trying this crop function which would be very useful.

I cannot get darlanmendonca's solution to work.

The last post by TechplexEngineer seems to indicate that the fix is now in the code?

What units are being used for the co-ordinates - pixels?

If I try the idea of commenting out line 19, I often get an error reported by the basic code at console.dir(err).

var path = require('path')
var filePath = path.join(__dirname, './Eng_Report.pdf')

var extract = require('pdf-text-extract')

const options = {
  crop: {x: 1, y: 1, w: 200, h: 200} // not working
};

extract(filePath, options, function (err, pages) {
  if (err) {
    console.dir(err)
    return
  }
  console.log(pages[0])
})

What is supposed to happen if no text is found within the crop area?

Thanks
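On the units question, a rough sketch may help, under the assumption that the module forwards the crop object to the `pdftotext` command-line flags `-x`, `-y`, `-W` and `-H`. Those flags exist in `pdftotext`, but the mapping shown here and the helper name `cropToArgs` are illustrative, not taken from the module's source. If that assumption holds, the values should be pixels at the resolution set by `-r`, which defaults to 72 DPI, so by default one unit is about one PDF point.

```javascript
// Hypothetical helper: turns a crop object into pdftotext arguments.
// Assumes crop keys x, y, w, h map to the flags -x, -y, -W, -H.
function cropToArgs (crop) {
  const args = []
  if (!crop) return args
  const flags = { x: '-x', y: '-y', w: '-W', h: '-H' }
  Object.keys(flags).forEach(function (key) {
    // Only forward values that were actually supplied as numbers
    if (typeof crop[key] === 'number') {
      args.push(flags[key], String(crop[key]))
    }
  })
  return args
}

console.log(cropToArgs({ x: 0, y: 100, w: 335, h: 465 }).join(' '))
// -x 0 -y 100 -W 335 -H 465
```

Running `pdftotext` by hand with the printed flags on the same PDF is one way to check whether the crop area itself contains any text.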