[BUG] Huge memory consumption when writing images to PDF
zenyui opened this issue · 8 comments
Description
I am trying to create a PDF from a slice of Go image.Image objects. The images total about 30MB, and when I write them to the PDF, the Docker container's memory usage spikes to 1.4GB.
In production, this is causing my container to OOM and exit.
See implementation below.
Expected Behavior
I would expect the memory usage to be close to the size of the images (or 2x to 3x at most), not 1.4GB! I also don't see a way to build and finalize the PDF incrementally, so I don't see how to reduce the memory usage.
Actual Behavior
Memory usage is 1.4GB, and I don't see an avenue to accomplish what I'm hoping to do.
Attachments
import (
	"bufio"
	"bytes"
	"context"
	"image"
	"io"

	"github.com/unidoc/unipdf/v3/creator"
)

// pdfFromGoImages creates a PDF from a slice of images, each on a separate page.
func pdfFromGoImages(ctx context.Context, images ...image.Image) (io.ReadSeeker, error) {
	c := creator.New()
	margins := float64(10)
	for _, img := range images {
		pImg, err := c.NewImageFromGoImage(img)
		if err != nil {
			return nil, err
		}
		_ = c.NewPage()
		// Scale to page width.
		pImg.ScaleToWidth(c.Width() - margins*2)
		pImg.SetPos(margins, margins)
		// If the image is still taller than the page, scale to page height instead.
		if pImg.Height() >= c.Height() {
			pImg.ScaleToHeight(c.Height() - margins*2)
			pImg.SetPos(margins, margins)
		}
		b := creator.NewBlock(1, 1)
		if err := b.Draw(pImg); err != nil {
			return nil, err
		}
		if err := c.Draw(b); err != nil {
			return nil, err
		}
	}
	var outBytes bytes.Buffer
	writer := bufio.NewWriter(&outBytes)
	if err := c.Write(writer); err != nil {
		return nil, err
	}
	// Flush so that buffered bytes reach outBytes; otherwise the PDF may be truncated.
	if err := writer.Flush(); err != nil {
		return nil, err
	}
	return bytes.NewReader(outBytes.Bytes()), nil
}
Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/
FYI, I am a licensed enterprise customer
Hi @zenyui,
Could you share the images that you load into the Go image.Image objects, so we can reproduce the issue on our end?
Here is a Google Drive folder with a few pprof dumps and the source PDF.
The larger algorithm is:
- extract the images from the source pdf
- convert to golang image.Image and compress it to 75% quality (attempt to make it smaller)
- pass into above function to write images to a new PDF
Thanks for the information, we will investigate this issue.
Still waiting on a solution.
@zenyui
We have already partly improved PDF creation from images and introduced a lazy mode that reduces memory consumption.
You can check it here:
https://github.com/unidoc/unipdf-examples/blob/master/image/pdf_images_to_pdf_lazy.go
As for image extraction, we are actively working on it and will keep you updated on our progress.