Unable to extract all images from PDF
Davidwhw opened this issue · 0 comments
Sorry to raise the same issue as pdf_parser/issues/1#issue-2307687422 again, but I have not received a reply and urgently need a solution.
When I use the pdffigures2 backend to extract images from a PDF, some images are often missed. For example, pdf_parser extracts only 3 images from a PDF file that contains 5 images. (In my experience, pdffigures2 is the best of the three image extraction backends; cermine cuts a complete image into pieces.)
I guess the pdffigures2 backend may be using default parameters such as "image size" or "resolution" to filter the images?
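To help narrow this down, here is a rough sketch for counting the embedded raster images in a PDF without going through any backend. It is only a heuristic and not part of pdf_parser or pdffigures2: the function name count_image_markers is made up for illustration, and it simply counts "/Subtype /Image" entries in the raw file bytes. The count can be too high (soft masks and unused objects are also image objects) and it does not see inline images or vector drawings, so a figure drawn with vector graphics would not show up here at all.

```python
import re

# Assumption: image XObjects are identified by a "/Subtype /Image" entry
# in their dictionary, which is stored as plain text in an unencrypted PDF.
IMAGE_MARKER = re.compile(rb"/Subtype\s*/Image\b")


def count_image_markers(pdf_bytes: bytes) -> int:
    """Return the number of '/Subtype /Image' entries found in the raw PDF bytes.

    This is a crude upper-bound style estimate, not an exact image count.
    """
    return len(IMAGE_MARKER.findall(pdf_bytes))
```

To try it, read the file in binary mode, for example with open("paper.pdf", "rb"), and pass the bytes to count_image_markers ("paper.pdf" is a placeholder path). If the number is lower than the number of figures visible in the PDF, some of the figures are probably vector graphics rather than embedded images, which may be relevant to why a backend skips them.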
Can you give me some advice or clues?
Thank you for your assistance.