ralfstuckert/pdfbox-layout

Not optimal ImageElement


A few points about images

  1. ImageElement is based on BufferedImage, which stores the decoded (uncompressed) bitmap. A single image can occupy more than 50 MB of memory, regardless of whether the source is JPEG or PNG.
  2. PDImageXObject is created with LosslessFactory. The resulting compression is not optimal, at least for JPEG images, because the already-compressed JPEG data is re-encoded losslessly instead of being embedded as is.
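To put the memory figure in perspective, here is a rough back-of-the-envelope sketch. The class name `DecodedSizeEstimate` and the 4608 x 3456 photo size are assumptions chosen for illustration; the actual bytes per pixel depend on the BufferedImage type (typically 3 or 4).

```java
// Hypothetical helper, not part of pdfbox-layout: estimates the raw pixel
// storage of a decoded bitmap as width * height * bytes per pixel.
final class DecodedSizeEstimate {

    static long decodedBytes(int width, int height, int bytesPerPixel) {
        // Cast first so the multiplication cannot overflow int.
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Assumed example: a 16-megapixel photo, 4608 x 3456 pixels.
        long rgb = decodedBytes(4608, 3456, 3);   // e.g. TYPE_3BYTE_BGR
        long argb = decodedBytes(4608, 3456, 4);  // e.g. TYPE_INT_ARGB
        System.out.println("3 bytes/pixel: " + rgb + " bytes");
        System.out.println("4 bytes/pixel: " + argb + " bytes");
    }
}
```

The compressed JPEG file for such a photo is usually only a few megabytes, so the decoded bitmap is roughly an order of magnitude larger.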

Proposal:

  1. Update ImageElement to hold the image source as an input stream or byte array (still compressed).
  2. To detect width and height, use ImageIO.getImageReaders(is), then reader.getWidth(0) and reader.getHeight(0).
  3. On draw, call PDImageXObject.createFromByteArray(), a new method in PDFBox 2.0.8.
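A minimal sketch of steps 1 and 2, using only the JDK. The class `CompressedImageInfo` is a hypothetical name, not the actual ImageElement API. One detail the proposal glosses over: ImageIO.getImageReaders() expects an ImageInputStream rather than a plain InputStream, so the bytes are wrapped first.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Iterator;

// Hypothetical holder: keeps the compressed bytes plus the dimensions
// read from the image header, without decoding the pixel data.
final class CompressedImageInfo {
    final byte[] data;
    final int width;
    final int height;

    private CompressedImageInfo(byte[] data, int width, int height) {
        this.data = data;
        this.width = width;
        this.height = height;
    }

    static CompressedImageInfo read(byte[] data) throws IOException {
        try (ImageInputStream iis =
                 ImageIO.createImageInputStream(new ByteArrayInputStream(data))) {
            if (iis == null) {
                throw new IOException("Cannot create image input stream");
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                throw new IOException("Unsupported image format");
            }
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly = true, ignoreMetadata = true
                reader.setInput(iis, true, true);
                return new CompressedImageInfo(
                    data, reader.getWidth(0), reader.getHeight(0));
            } finally {
                reader.dispose();
            }
        }
    }
}
```

For step 3, the retained `data` array would then be passed to PDImageXObject.createFromByteArray(document, data, name) at draw time, so the decoded bitmap never has to be held by the element.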

Note:
createFromByteArray() is also not optimal, because it reads the whole stream (once) instead of parsing only the header to detect the color space. Still, the stream / byte array stays compressed, so it is better than a BufferedImage.