[Bug]: Puppeteer PDF generation via Page.pdf() method
Closed this issue · 5 comments
Bug description
I was building a simple HTML-to-PDF converter when I ran into an issue that affects every current version of the fork.
In the current implementation of the Page.pdf method in the @cloudflare/puppeteer fork, the Readable stream returned by Page.createPDFStream is converted to a Buffer by the getReadableAsBuffer function in puppeteer's src/common/util.ts, which is where the problem lies. getReadableAsBuffer tries to iterate over an object that is not async iterable (node:stream/Readable), which causes a TypeError: readable is not async iterable exception.
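To illustrate the failure mode described above, here is a minimal sketch. It assumes the collector uses a `for await` loop and that the stream object exposes events but no `Symbol.asyncIterator`. The names `eventOnlyStream`, `collectWithForAwait`, and `demonstrateFailure` are hypothetical and are not taken from the fork's source.

```typescript
// Hypothetical stand-in for a stream that exposes an event API but has no
// Symbol.asyncIterator. This is NOT the fork's actual Readable.
const eventOnlyStream = {
  on(_event: string, _listener: (payload?: unknown) => void): void {
    // Intentionally empty: only the shape of the object matters here.
  },
};

// Assumed shape of a for-await based collector (illustrative only).
async function collectWithForAwait(
  readable: AsyncIterable<Uint8Array>
): Promise<Uint8Array[]> {
  const chunks: Uint8Array[] = [];
  // for await needs Symbol.asyncIterator (or Symbol.iterator) on `readable`;
  // without either, the loop throws a TypeError before reading any chunk.
  for await (const chunk of readable) {
    chunks.push(chunk);
  }
  return chunks;
}

// Reports which kind of error, if any, the collector throws.
async function demonstrateFailure(): Promise<string> {
  try {
    // The cast mirrors the mismatch: the types claim async iterability,
    // but the runtime object does not provide it.
    await collectWithForAwait(
      eventOnlyStream as unknown as AsyncIterable<Uint8Array>
    );
    return "no error";
  } catch (error) {
    return error instanceof TypeError ? "TypeError" : "other error";
  }
}
```

The cast is the important part: the type checker is satisfied, so the problem only shows up at runtime.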
It is easy to work around by using Page.createPDFStream directly, but it is still a bug, and one that is not present in the @puppeteer/puppeteer-core package.
Steps to reproduce the problem:
- Launch puppeteer and instantiate a Page with some content:
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.setContent('<h1>HELLO CLOUDFLARE</h1>', {
  waitUntil: 'networkidle0',
});
- Get the PDF:
await page.pdf({ displayHeaderFooter: true })
Below is the link of a repo to reproduce the issue.
https://github.com/GiovaniMFMurari/cf-puppeteer-pdf-gen-test
Puppeteer version
0.0.6
Node.js version
16, 18 and 20
npm version
using pnpm 8.15.5
What operating system are you seeing the problem on?
Linux
Relevant log output
✘ [ERROR] Uncaught (in response) TypeError: readable is not async iterable
at getReadableAsBuffer
at Page.pdf
at async Object.fetch
at async drainBody
Getting this too! Any update?
UPDATE: here's a working workaround. createPDFStream() seems to work, which you can then collect into a buffer.
export default async function markdownToR2Pdf(env: Env) {
  const settings = await getSettings(env);
  // Define test HTML
  const html = `
    <html>
      <body>
        <h1>Hello World</h1>
      </body>
    </html>
  `;
  // Launch browser
  const browser = await puppeteer.launch(env.BROWSER);
  // Create new page
  const page = await browser.newPage();
  // Set page content
  await page.setContent(html);
  // Wait for network idle
  await page.waitForNetworkIdle();
  // Generate PDF
  const pdfStream = await page.createPDFStream({
    format: "A4",
    printBackground: true,
  });
  // Collect PDF data into a buffer
  const chunks: Uint8Array[] = [];
  return new Promise<string>((resolve, reject) => {
    pdfStream.on("data", (chunk: Uint8Array) => chunks.push(chunk));
    pdfStream.on("end", async () => {
      try {
        const pdfBuffer = Buffer.concat(chunks);
        console.log("PDF buffer created");
        // Upload PDF to R2 bucket
        const objectName = `pdf_${Date.now()}.pdf`;
        await env.SALES_FILES_BUCKET.put(objectName, pdfBuffer);
        console.log("PDF uploaded to R2 bucket");
        await page.close();
        await browser.close();
        const pdfUrl = `${settings.salesFilesBucketUrl}/${objectName}`;
        resolve(pdfUrl);
      } catch (error) {
        // Reject on upload or cleanup failure so the promise never hangs
        reject(error);
      }
    });
    pdfStream.on("error", (error) => {
      reject(error);
    });
  });
}
@GiovaniMFMurari and @emilthemaker, we released an update recently and can no longer reproduce this. Can you give it another try?
We are closing this. Feel free to reopen if you still get this error with recent versions.
This bug still exists in 0.0.13, throwing the same error:
This works:
function streamToBuffer(stream: NodeJS.ReadableStream) {
  return new Promise<Buffer>((resolve, reject) => {
    const chunks: any[] = [];
    stream.on('data', (chunk) => chunks.push(chunk));
    stream.on('end', () => resolve(Buffer.concat(chunks)));
    stream.on('error', (err) => reject(err));
  });
}
const stream = await page.createPDFStream();
const buffer = await streamToBuffer(stream);
This fails with TypeError: readable is not async iterable:
const buffer = await page.pdf();
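Building on the two workarounds in this thread, here is a sketch of a collector that uses async iteration when the stream supports it and falls back to the `data`/`end`/`error` events otherwise. The names `EventStream`, `isAsyncIterable`, `concatChunks`, and `readableToBytes` are hypothetical. It is also an assumption, based only on the workarounds above, that the object returned by `createPDFStream()` emits those three events. It returns a `Uint8Array` rather than a `Buffer` so that it does not depend on Node typings.

```typescript
// Minimal structural type for an event-style stream (assumed shape).
interface EventStream {
  on(
    event: "data" | "end" | "error",
    listener: (payload?: unknown) => void
  ): unknown;
}

// Runtime check: does the value really implement Symbol.asyncIterator?
function isAsyncIterable(value: unknown): value is AsyncIterable<Uint8Array> {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { [Symbol.asyncIterator]?: unknown })[
      Symbol.asyncIterator
    ] === "function"
  );
}

// Join the collected chunks into one contiguous byte array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

async function readableToBytes(
  stream: EventStream | AsyncIterable<Uint8Array>
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  if (isAsyncIterable(stream)) {
    // Preferred path when the runtime object is genuinely async iterable.
    for await (const chunk of stream) {
      chunks.push(chunk);
    }
    return concatChunks(chunks);
  }
  // Fallback path: listen for events, as the workarounds in this thread do.
  await new Promise<void>((resolve, reject) => {
    stream.on("data", (chunk) => chunks.push(chunk as Uint8Array));
    stream.on("end", () => resolve());
    stream.on("error", reject);
  });
  return concatChunks(chunks);
}
```

It would be called as `const bytes = await readableToBytes(await page.createPDFStream());`. The runtime check matters because the declared type of the stream may promise async iterability that the runtime object does not provide.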