danfickle/openhtmltopdf

embed PDF file via object tag?

dbellingroth opened this issue · 8 comments

Hi. Is it possible to embed an existing PDF document (a technical drawing in our case) into the generated one? We need it to be embedded like an image not as a separate page. One thing I thought of was using the object tag. But it doesn't seem to work:

<object data="embedded.pdf"></object>

If this feature isn't available yet, is it possible to implement a custom FSObjectDrawer that is capable of doing it?

syjer commented

Hi @dbellingroth, the feature is present in the openhtmltopdf-objects module.

You need to add the dependency:

<dependency>
  <groupId>com.openhtmltopdf</groupId>
  <artifactId>openhtmltopdf-objects</artifactId>
  <version>0.0.1-RC19</version>
</dependency>

Then you need to register the object drawer. Note that openhtmltopdf-objects provides a default factory which registers the JFreeChart integration (https://github.com/danfickle/openhtmltopdf/blob/open-dev-v1/openhtmltopdf-objects/src/main/java/com/openhtmltopdf/objects/StandardObjectDrawerFactory.java).

Like this:

package test;

import com.openhtmltopdf.objects.pdf.MergeBackgroundPdfDrawer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import com.openhtmltopdf.render.DefaultObjectDrawerFactory;

import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

public class App {
    public static void main(String[] args) throws Exception {

        try (OutputStream os = new FileOutputStream("test.pdf")) {
            PdfRendererBuilder builder = new PdfRendererBuilder();
            builder.withFile(new File("test.html"));
            builder.useObjectDrawerFactory(new OnlyPdfBgDrawerFactory());
            builder.useFastMode();
            builder.toStream(os);
            builder.run();
        }
    }

    public static class OnlyPdfBgDrawerFactory extends DefaultObjectDrawerFactory {

        public static void registerStandardObjects(DefaultObjectDrawerFactory factory) {
            factory.registerDrawer("pdf/background", new MergeBackgroundPdfDrawer());
        }

        public OnlyPdfBgDrawerFactory() {
            registerStandardObjects(this);
        }
    }
}

The test.html is the following:

<html>
<body>

<h1>Hello world</h1>

<object type="pdf/background" pdfsrc="img.pdf" style="height:400px;width:400px;border:1px solid red"></object>

</body>
</html>

Notice the type "pdf/background" and pdfsrc="img.pdf". You must specify the size, or else the object will not be rendered.

This will work as you expect.

Unfortunately, I've noticed that https://github.com/danfickle/openhtmltopdf/blob/open-dev-v1/openhtmltopdf-objects/src/main/java/com/openhtmltopdf/objects/pdf/MergeBackgroundPdfDrawer.java does not resize the embedded PDF. I guess it could be improved?

Edit: looking at the code (and the class name), I guess it's meant for adding a PDF as a background rather than as an inline element. But it may be a good base for writing a custom one?

Thanks @syjer, your tip was a good starting point for writing a custom ObjectDrawer. I managed to embed a PDF file on a specific page. The problem I'm struggling with now is the correct positioning and scaling of the PDF.

Do you have any suggestions on how to convert the x, y, width and height values I get in the drawObject method into a corresponding AffineTransform for placement in the PDF?

This is what I have at the moment:

package filters.pdf;

import com.google.common.base.Charsets;
import com.openhtmltopdf.css.style.CssContext;
import com.openhtmltopdf.extend.FSObjectDrawer;
import com.openhtmltopdf.extend.OutputDevice;
import com.openhtmltopdf.pdfboxout.PdfBoxOutputDevice;
import com.openhtmltopdf.render.RenderingContext;
import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.cos.COSStream;
import org.apache.pdfbox.io.RandomAccessBuffer;
import org.apache.pdfbox.multipdf.LayerUtility;
import org.apache.pdfbox.pdfparser.PDFParser;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.w3c.dom.Element;

import java.awt.*;
import java.awt.geom.AffineTransform;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;

public class PdfObjectDrawer implements FSObjectDrawer {

    @Override
    public Map<Shape, String> drawObject(Element e, double x, double y, double width, double height, OutputDevice outputDevice, RenderingContext ctx, int dotsPerPixel) {

        if (outputDevice instanceof PdfBoxOutputDevice) {

            final String path = e.getAttribute("data");
            final byte[] data = ctx.getUac().getBinaryResource(path);


            // TODO: draw the PDF
            try {
                final PdfBoxOutputDevice pdfBoxOutputDevice = (PdfBoxOutputDevice) outputDevice;
                LayerUtility layerUtility = new LayerUtility(pdfBoxOutputDevice.getWriter());
                final PDFParser pdfParser = new PDFParser(new RandomAccessBuffer(data));
                pdfParser.parse();
                final PDFormXObject pdFormXObject = layerUtility.importPageAsForm(pdfParser.getPDDocument(), 0);
                pdfParser.getPDDocument().close();

                PDPage page = pdfBoxOutputDevice.getPage();

                pdFormXObject.setMatrix(calculateTransform(ctx, pdfBoxOutputDevice, x, y, width, height, dotsPerPixel, pdFormXObject));

                layerUtility.wrapInSaveRestore(page);

                COSArray cosArray = (COSArray) page.getCOSObject().getDictionaryObject(COSName.CONTENTS);
                COSStream saveStateAndPlacePageBackgroundStream = (COSStream) cosArray.get(0);
                OutputStream saveAndPlaceStream = saveStateAndPlacePageBackgroundStream.createOutputStream();

                saveAndPlaceStream.write("q\n".getBytes(Charsets.US_ASCII));

                COSName name = page.getResources().add(pdFormXObject);
                name.writePDF(saveAndPlaceStream);


                saveAndPlaceStream.write(' ');
                saveAndPlaceStream.write("Do\n".getBytes(Charsets.US_ASCII));
                saveAndPlaceStream.write("Q\n".getBytes(Charsets.US_ASCII));
                saveAndPlaceStream.write("q\n".getBytes(Charsets.US_ASCII));

                saveAndPlaceStream.close();

                if (false) {
                    throw new IOException();
                }

            } catch (IOException e1) {
                e1.printStackTrace();
            }

            return null;
        } else throw new RuntimeException("This feature only works with PdfBoxOutputDevice");
    }

    private AffineTransform calculateTransform(RenderingContext ctx, PdfBoxOutputDevice pdfBoxOutputDevice, double x, double y, double width, double height, int dotsPerPixel, PDFormXObject pdFormXObject) {
        final float pageWidth = pdfBoxOutputDevice.getPage().getBBox().getWidth();
        final float pageHeight = pdfBoxOutputDevice.getPage().getBBox().getHeight();

        final float sourceWidth = pdFormXObject.getBBox().getWidth();
        final float sourceHeight = pdFormXObject.getBBox().getHeight();

        final float elementWidth = (float) (width / dotsPerPixel);
        final float elementHeight = (float) (height / dotsPerPixel);

        AffineTransform scaleTransform = AffineTransform.getScaleInstance(
                elementWidth / sourceWidth,
                elementHeight / sourceHeight
        );

        final float elementX = (float) (x / dotsPerPixel);
        final float elementY = (float) ((y - ctx.getPage().getTop()) / dotsPerPixel);

        final float xTranslation = elementX;
        final float yTranslation = pageHeight - elementY - elementHeight;

        final AffineTransform translateTransform = AffineTransform.getTranslateInstance(xTranslation, yTranslation);

        final AffineTransform finalTransform = translateTransform;
        finalTransform.concatenate(scaleTransform);
        return finalTransform;
    }
}
syjer commented

Hi @dbellingroth, I did some tests on my side, but I don't have a universal solution yet. The BBox reported by the PDFormXObject seems inconsistent, so I'm not even sure the scaling will be correct (as a test, I created a 100x100px image and generated a PDF from it; the size reported by the BBox is 75...).
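One possible reading of the 75 reported above: PDF user space is measured in points (1/72 inch), while a CSS pixel is 1/96 inch, so 100 px corresponds to 75 pt. This assumes the tool that produced the test PDF treated the image as 96 dpi, which the thread does not confirm. If that is what is happening, values obtained by dividing by dotsPerPixel may be in CSS pixels rather than points and would need the same factor before going into a PDF matrix. A minimal sketch of the conversion (the class and method names are made up for illustration and are not part of openhtmltopdf or PDFBox):

```java
/** Hypothetical helper: converts between CSS pixels (1/96 inch) and PDF points (1/72 inch). */
final class CssUnits {
    static final double POINTS_PER_INCH = 72.0;
    static final double CSS_PIXELS_PER_INCH = 96.0;

    /** CSS pixels to PDF points. */
    static double pxToPt(double px) {
        return px * POINTS_PER_INCH / CSS_PIXELS_PER_INCH;
    }

    /** PDF points to CSS pixels. */
    static double ptToPx(double pt) {
        return pt * CSS_PIXELS_PER_INCH / POINTS_PER_INCH;
    }

    public static void main(String[] args) {
        // A 100 px wide image, assumed to be placed at 96 dpi
        System.out.println(pxToPt(100)); // prints 75.0
    }
}
```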

Btw, for the scaling, you need to keep the aspect ratio, so something like this would be more correct:

final float ratio = Math.min(elementWidth/sourceWidth, elementHeight/sourceHeight);
final float adjustedHeight = sourceHeight * ratio;
final float adjustedWidth = sourceWidth * ratio;

AffineTransform scaleTransform = AffineTransform.getScaleInstance(adjustedWidth / sourceWidth, adjustedHeight / sourceHeight);

But unfortunately, the reported dimensions do not seem to be coherent enough. I'll try a few more things and let you know if I find a better solution...
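For reference, the aspect-ratio idea can be tried out in isolation with plain java.awt.geom, without PDFBox. The sketch below scales a source box uniformly so it fits inside a target box and centres it there. It assumes the source box has its origin at (0, 0) and that the target rectangle is already expressed in the destination coordinate system (for a PDF page, points with the y axis pointing up); both are assumptions, and a real BBox may have a non-zero lower-left corner. AspectFit and fit are hypothetical names.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

/** Hypothetical helper: uniform scale-to-fit with centring. */
final class AspectFit {

    /**
     * Returns a transform that maps a source box of the given size
     * (assumed to start at 0,0) into the target box, preserving the
     * aspect ratio and centring along the axis that has spare room.
     */
    static AffineTransform fit(double sourceWidth, double sourceHeight, Rectangle2D target) {
        // The smaller of the two ratios guarantees the source fits on both axes
        double ratio = Math.min(target.getWidth() / sourceWidth,
                                target.getHeight() / sourceHeight);

        // Leftover space is split evenly on each side
        double offsetX = target.getX() + (target.getWidth() - sourceWidth * ratio) / 2.0;
        double offsetY = target.getY() + (target.getHeight() - sourceHeight * ratio) / 2.0;

        // scale() is concatenated, so points are scaled first, then translated
        AffineTransform transform = AffineTransform.getTranslateInstance(offsetX, offsetY);
        transform.scale(ratio, ratio);
        return transform;
    }

    public static void main(String[] args) {
        // A 200x100 source placed into a 100x100 box whose corner is at (10, 20)
        AffineTransform t = fit(200, 100, new Rectangle2D.Double(10, 20, 100, 100));
        System.out.println(t.getScaleX());     // prints 0.5
        System.out.println(t.getTranslateX()); // prints 10.0
        System.out.println(t.getTranslateY()); // prints 45.0
    }
}
```

Building the translation first and then calling scale keeps the offsets in target units, so they do not get multiplied by the scale factor.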

Hi @syjer. Did you find any solution to this?

syjer commented

Hi @dbellingroth, unfortunately no.

Hi everybody,

I just added some code on the replaced_sizing branch to add a pdf page:

<img src="document.pdf" page="2" alt="My PDF Page" />

It can then be sized using regular CSS on the img tag. Unfortunately, I still have to do some work on that branch around link maps before I can merge it with the main branch. Also, for some reason, the width of the PDF page inserted is a couple of pixels off. I'll keep you updated.

Thanks everyone.

This feature has now been released with version 1.0.0.

It appears this functionality doesn't support annotations. Is that correct?