esbenp/pdf-bot

Support for posting the html instead of adding a url

Hagbarth opened this issue · 6 comments

I think it would be reasonable to make it possible to generate PDFs directly from HTML instead of from visiting a URL.

Not sure how html-pdf-chrome handles this, but Puppeteer has support out of the box, so it should be possible.

I have yet to play around with this, but it could potentially work as-is if you converted the HTML to a data URI string and used that as the URL.

This might be a terrible suggestion, but it occurred to me that it could work. :)

Actually, if you pass a non-URL to html-pdf-chrome, it will just render it as HTML. However, pdf-bot validates the input and throws an error if you post a non-URL. Maybe a new endpoint for non-URLs or a meta-type key would be the way to go here.

So either POST / for URLs and POST /html for HTML, or something like

POST /
{"url":"http://google.com"}
POST /
{"html":"<b>html</b>"}

@esbenp okay

POST /html is probably best, so we don't run into the issue of having to handle both a url and an html string in the same request. Although I guess the two could be pushed to the queue separately?
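For the single-endpoint variant, the ambiguity could be handled by rejecting bodies that carry both keys (or neither). A minimal sketch, assuming the request body has already been parsed as JSON; `resolveSource` and the returned `{ type, value }` shape are hypothetical names for illustration, not part of pdf-bot:

```javascript
// Hypothetical helper: decide whether a request body describes a URL job
// or an HTML job, and refuse bodies that are ambiguous or empty.
function resolveSource(body) {
  const data = body || {};
  const hasUrl = typeof data.url === "string" && data.url.length > 0;
  const hasHtml = typeof data.html === "string" && data.html.length > 0;

  // Both present or both missing: the caller must pick exactly one.
  if (hasUrl === hasHtml) {
    throw new Error('Provide exactly one of "url" or "html"');
  }

  return hasUrl
    ? { type: "url", value: data.url }
    : { type: "html", value: data.html };
}
```

With this, `resolveSource({ html: "<b>html</b>" })` yields an HTML job, while a body containing both `url` and `html` throws, so the API could answer with a validation error instead of guessing.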

I think either is a valid option. Personally, I prefer fewer, configurable endpoints. I will look into this soon.

Dexus commented

You only need to check

const url = /^(https?|file|data):/i.test(html) ? html : `data:text/html,${html}`;

and open the URL. I think it would not be difficult to implement.
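One caveat with the check above: raw HTML placed directly into a data URI can break on characters such as `#` or `%`, which have special meaning in URLs. A slightly more defensive sketch, percent-encoding the markup first (`toRenderableUrl` is a hypothetical name, and the explicit `charset=utf-8` is an assumption about the input):

```javascript
// Hypothetical helper: return the input unchanged if it already looks like
// a URL, otherwise wrap it in a percent-encoded data URI.
function toRenderableUrl(input) {
  if (/^(https?|file|data):/i.test(input)) {
    return input;
  }
  // encodeURIComponent escapes '#', '%', '<', '>' and other reserved
  // characters so the browser does not misread part of the HTML.
  return `data:text/html;charset=utf-8,${encodeURIComponent(input)}`;
}
```

Note that the prefix test is only a heuristic: HTML that happens to start with text like `data:` would be treated as a URL.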

const url = `data:text/html,${ html }`

Actually, if the HTML is very large, a "PayloadTooLargeError: request entity too large" error can occur. It could probably be solved by raising the request body size limit in the pdf-bot API's Express app.
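To get a feel for when this error would trigger, the size of the JSON body can be estimated before posting. A rough sketch, assuming the API parses JSON with body-parser's default limit of 100kb (whether pdf-bot overrides that default is not confirmed here; `jsonBodyBytes` and `exceedsLimit` are hypothetical names):

```javascript
// body-parser's JSON parser defaults to "100kb", i.e. 100 * 1024 bytes.
const DEFAULT_JSON_LIMIT_BYTES = 100 * 1024;

// Size in bytes of the JSON body that would be sent for an HTML string.
function jsonBodyBytes(html) {
  return Buffer.byteLength(JSON.stringify({ html }), "utf8");
}

// True if a body of that size would be rejected under the given limit.
function exceedsLimit(html, limitBytes = DEFAULT_JSON_LIMIT_BYTES) {
  return jsonBodyBytes(html) > limitBytes;
}

// Server side, assuming Express 4.16+ and its built-in JSON parser,
// the limit could be raised along these lines (value is illustrative):
//   app.use(express.json({ limit: "10mb" }));
```

For example, `exceedsLimit("<b>html</b>")` is false, while a 200 KiB HTML string would exceed the default.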