mileszs/wicked_pdf

Random "Error: PDF could not be generated!" after upgrade from 2.1 to 2.6.3

adriancb opened this issue · 11 comments

First of all, thank you for your work on maintaining this gem for the community! 🙇

Issue description

After an upgrade from 2.1 to 2.6.3 we're experiencing random PDF generation failures.

Calling:

WickedPdf.new.pdf_from_string(
  html,
  print_media_type: true,
  dpi: 300,
  footer: {
    font_name: '"Cerebri Sans", sans-serif',
    font_size: '8',
    left: footer_content.left,
    right: footer_content.right
  }
)

randomly results in the following:

["/app/vendor/bundle/ruby/3.2.0/bin/wkhtmltopdf", "--print-media-type", "--dpi", "300", "--footer-font-name", "\"Cerebri Sans\", sans-serif", "--footer-left", "4/6/2023 08:46:23 — REDACTED", "--footer-right", "Page [page] / [topage]", "--footer-font-size", "8", "file:////tmp/wicked_pdf20230603-12-8u9ivi.html", "/tmp/wicked_pdf_generated_file20230603-12-pzdsgj.pdf"]
Error: PDF could not be generated!
 Command Error: 

Retrying the same, with the same content, can result in success, hence the random nature of the issue.
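Since a retry with the same content can succeed, a retry wrapper is one possible stopgap while the root cause is investigated. The following is only a minimal sketch: `with_pdf_retries` is a hypothetical helper (not part of wicked_pdf), and it assumes the failure surfaces as a `StandardError` raised by the block.

```ruby
# Hypothetical helper: runs the block up to `attempts` times.
# It works around intermittent failures; it does not fix their cause.
def with_pdf_retries(attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError
    # Give up and re-raise the last error once the attempts are used up.
    raise if tries >= attempts

    # Exponential backoff: base_delay, then 2x, then 4x, ...
    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Possible usage (commented out because it needs the gem and real HTML):
# pdf = with_pdf_retries { WickedPdf.new.pdf_from_string(html) }
```

If the failures turn out to be caused by resource limits, retrying immediately may make things worse, which is why the sketch backs off between attempts.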

Our configuration is:

WickedPdf.config = {
  enable_local_file_access: true
}

I'd be happy to share more information, but I need to figure out where to start and what's helpful.

Expected or desired behavior

Consistent PDF generation.

System specifications

wicked_pdf gem version (output of cat Gemfile.lock | grep wicked_pdf): wicked_pdf (2.6.3)

wkhtmltopdf version (output of wkhtmltopdf --version): 0.12.6.1

wkhtmltopdf provider gem and version if one is used: https://github.com/zakird/wkhtmltopdf_binary_gem

platform/distribution and version (e.g. Windows 10 / Ubuntu 16.04 / Heroku cedar): heroku-22

I'm seeing the same error here, without the enable_local_file_access option and with no local file access, which results in a blocked file access error.
I currently suspect we are exceeding the S3 bucket request rate.

Does reverting solve your problem? Did you also upgrade wkhtmltopdf at the same time? If you did, I would bet that's the real cause.

Something I've always tried to do is make sure my assets are always local files before PDF generation, and nothing is being fetched over the network in real-time.

Any update on this? Would an update to 2.7.0 or 2.8.0 fix this?

@staffler-xyz Possibly, but probably not. Have you tried yet?

Please provide more information if you can. What is the actual error? Do you get a stacktrace? What version are you running? Is it also random (like it works most of the time)?

I can't think of any good reason this happens randomly, unless your system is getting rid of tempfiles that are still in-use, or you have system limits that prevent too many wkhtmltopdf instances from running simultaneously.
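One way to test the "too many simultaneous instances" hypothesis is to cap how many renders run at once and see whether the failures stop. This is a rough sketch under assumptions: `RenderLimiter` is a hypothetical class, and it only limits concurrency inside a single Ruby process, so it cannot coordinate across separate processes or dynos.

```ruby
# Hypothetical limiter: allows at most `limit` blocks to run at the same time
# within one Ruby process, using a SizedQueue as a counting semaphore.
class RenderLimiter
  def initialize(limit)
    @slots = SizedQueue.new(limit)
  end

  def run
    @slots.push(true) # blocks while `limit` renders are already in flight
    begin
      yield
    ensure
      @slots.pop # free the slot even if the render raised
    end
  end
end

# Possible usage (commented out because it needs the gem and real HTML):
# PDF_LIMITER = RenderLimiter.new(2)
# pdf = PDF_LIMITER.run { WickedPdf.new.pdf_from_string(html) }
```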

hi, I get this error:

["/app/vendor/bundle/ruby/3.1.0/gems/wkhtmltopdf-heroku-2.12.6.0/bin/wkhtmltopdf-linux-amd64", "--disable-smart-shrinking", "--margin-top", "0", "--margin-bottom", "0", "--margin-left", "0", "--margin-right", "0", "--footer-spacing", "-30", "--footer-html", "file:////tmp/wicked_footer_pdf20240315-2-pag8lr.html", "file:////tmp/wicked_pdf20240315-2-gicnwn.html", "/tmp/wicked_pdf_generated_file20240315-2-1mghdn.pdf"]
Error: PDF could not be generated!
Command Error: Loading pages (1/6)
[>                                                           ] 0%
[======>                                                     ] 10%
[=======================>                                    ] 39%
[=========================>                                  ] 43%
[===========================>                                ] 45%
[============================>                               ] 47%
[=============================>                              ] 49%
[==================================>                         ] 57%
[=====================================>                      ] 63%
[=========================================>                  ] 69%
[=============================================>              ] 75%
[================================================>           ] 80%
[==================================================>         ] 84%
[============================================================] 100%
Counting pages (2/6)                                              
[============================================================] Object 1 of 1
Resolving links (4/6)                                                      
[============================================================] Object 1 of 1
Loading headers and footers (5/6)                                          
[>                                                           ] 1%
Warning: Blocked access to file    
[=&g…

(unfortunately the output is truncated at the end)

wicked_pdf gem version (output of cat Gemfile.lock | grep wicked_pdf): wicked_pdf (2.6.3)
wkhtmltopdf version (output of wkhtmltopdf --version): 0.12.6.0
wkhtmltopdf [provider gem]: wkhtmltopdf-heroku (2.12.6.0)
platform/distribution and version: heroku-20

@staffler-xyz Does it always die out like this? Is it random, or every time for this specific PDF generation?

This looks like wkhtmltopdf is dying early on while setting up the headers and footers (which you have), and is also complaining that it can't load a file (which is default behavior in newish versions of wkhtmltopdf, but can be "fixed" by setting enable_local_file_access to true in the WickedPdf config).

Try either enabling that option, or disabling custom headers and footers, or some combination of both. Also make sure your footer is a complete and valid HTML document, including doctype and <html><head></head><body>.
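To illustrate the "complete and valid HTML document" point, here is a small sketch that wraps a footer fragment in a full document before it is handed to the PDF renderer. `complete_html_document` is a hypothetical helper, not something wicked_pdf provides, and the doctype check is deliberately simplistic.

```ruby
# Hypothetical helper: wraps an HTML fragment in a complete document.
# Input that already starts with a doctype is returned unchanged.
def complete_html_document(fragment)
  return fragment if fragment.lstrip.match?(/\A<!DOCTYPE/i)

  <<~HTML
    <!DOCTYPE html>
    <html>
    <head><meta charset="UTF-8"></head>
    <body>
    #{fragment}
    </body>
    </html>
  HTML
end

footer_html = complete_html_document("<p>Page footer</p>")
```

Rendering the footer through a layout that already contains the doctype, head, and body achieves the same thing; the helper is only meant to show what "complete" means here.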

It happens randomly; it usually works after reloading the page. But it also fails in background jobs with ActiveJob and WickedPdf.new.pdf_from_string.

We don't use local files, only images from a CDN.

This is our layout (simplified):

<!DOCTYPE html>
<html>
<head>
  <meta charset='UTF-8' />
  <%= stylesheet_link_tag("pdf", :media => "all") %>
  <style type="text/css">
    <%= yield(:styles) %>
  </style>
  <%= javascript_tag do %>
    function number_pages() {
      <javascript code>
    }
  <% end %>
</head>
<body onload="number_pages()">
  <%= yield %>
</body>
</html>

I noticed that this might be the problem:

<%= stylesheet_link_tag("pdf", :media => "all") %>

Should it be replaced with:

<%= wicked_pdf_stylesheet_link_tag("pdf", :media => "all") %>

Did anyone solve this successfully?

Is it possible that generating the PDF is exceeding Heroku's memory limit, and thus getting killed off before it finishes?

I don't think so. The machine has 16GB of RAM and the application is running at about 1GB (no spikes when running wkhtmltopdf commands).

If possible, try downloading remote resources from the CDN into local files or tempfiles or base64, and loading that instead.
This is what I've always done and advocated for, though I'm sure many others do not and it works fine for their use-cases.
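As a rough sketch of the base64 approach, the snippet below rewrites remote `<img>` sources into data URIs before the HTML is passed to the renderer. Both `data_uri` and `inline_remote_images` are hypothetical helpers, the regex is a simplification that only handles quoted `src` attributes, and fetching (plus caching) of the bytes is left to the caller's block.

```ruby
# Hypothetical helper: builds a data URI from raw bytes.
# pack("m0") is Ruby's built-in base64 encoding without line breaks.
def data_uri(bytes, content_type)
  "data:#{content_type};base64,#{[bytes].pack('m0')}"
end

# Hypothetical helper: replaces each <img src="http(s)://..."> with a data URI.
# The block receives the URL and must return [bytes, content_type].
# A regex is a simplification; an HTML parser would be more robust.
def inline_remote_images(html)
  html.gsub(/(<img\b[^>]*?\bsrc=)(["'])(https?:\/\/[^"']+)\2/i) do
    prefix, quote, url = $1, $2, $3
    bytes, content_type = yield(url)
    "#{prefix}#{quote}#{data_uri(bytes, content_type)}#{quote}"
  end
end

# Possible usage, with `fetch_asset` standing in for your own download code:
# html = inline_remote_images(html) { |url| fetch_asset(url) }
```

With the assets embedded this way, PDF generation no longer depends on the network being reachable at render time.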

I don't have any solid reasons for this, but experience and casually googling "wkhtmltopdf dies sometimes" turn up numerous issues: one where they needed to increase the JS delay, one where the suggested cause was opening too many files or running out of swap space, and one where the network stack dies while downloading assets.

ok, thanks for your hints. I will check again and post the solution here, if found.