Acemap/pdf_parser

Unable to extract all images from PDF


When I use the pdffigures2 backend to extract images from a PDF, some images are often missed. For example, pdf_parser extracts only 3 images from a PDF file that contains 5. (In my experience, pdffigures2 is still the best of the three image extraction backends; cermine, for instance, cuts a single complete image into pieces.)
My guess is that the pdffigures2 backend filters images using default parameters such as "image size" or "resolution". Is that the case?
Can you give me some advice or clues?
Thank you for your assistance.