Unable to extract all images from PDF
Davidwhw commented
When I use the `pdffigures2` backend to extract images from a PDF, some images are often missed. For example, `pdf_parser` extracts only 3 images from a PDF file that contains 5. (In my experience, `pdffigures2` is the best of the three image extraction backends; `cermine` cuts a complete image into pieces.)
My guess is that the `pdffigures2` backend filters images using default parameters such as "image size" or "resolution". Is that the case?
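To make the expected count easier to verify independently of any backend, here is a rough sketch that counts embedded raster image objects directly in the PDF bytes. It is only a heuristic and rests on assumptions: it looks for the `/Subtype /Image` marker of image XObjects, so it will not see inline images or figures drawn as vector graphics, and it may overcount if the file keeps old objects from incremental saves. The function names and the `paper.pdf` path are hypothetical, and this says nothing about how `pdffigures2` itself decides what to extract.

```python
import re

# Image XObjects are stream objects whose dictionary contains
# "/Subtype /Image"; the dictionary is normally readable in the raw file.
# Assumption: matching that marker approximates the number of raster images.
IMAGE_XOBJECT = re.compile(rb"/Subtype\s*/Image\b")


def count_image_xobjects(pdf_bytes: bytes) -> int:
    """Return how many image XObject markers appear in the raw PDF bytes."""
    return len(IMAGE_XOBJECT.findall(pdf_bytes))


def count_in_file(path: str) -> int:
    """Read a PDF from disk and count its image XObject markers."""
    with open(path, "rb") as f:
        return count_image_xobjects(f.read())


# Hypothetical usage:
# print(count_in_file("paper.pdf"))
```

If this count is lower than the number of figures visible in the document, some figures are probably vector drawings rather than embedded images; if it is higher, a single figure may be composed of several image objects. Either case would help narrow down why the extracted count differs.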
Can you give me some advice or clues?
Thank you for your assistance.