RFC - Roadmap for version 1

Question

danfickle opened this issue 7 years ago · 20 comments

Answer 1 · 2018-01-30T18:22:11.000Z

The very first thing I would suggest: add instructions for non-Maven users to the Integration Guide. I myself use Ant (from NetBeans). This includes publishing a full list of dependencies (PDFBox and whatever other .jar files are currently required, for example graphics2d, which took me by surprise during my initial tests).

Obviously we need to get v1 out first properly so we can have downloadable releases and the like.

Also I second the logging overhaul.

Answer 2 · 2018-02-01T15:10:47.000Z

There seems to be an issue with transparent embedded SVG: the transparent background is displayed as black. The same issue occurred when I used Batik to convert the SVG to BMP myself. However, there is no problem when converting to PNG. Batik does not provide a converter for transforming SVG to BMP, so I wrote a custom one modeled on the one for SVG to PNG. In the end, I decided to link an external PNG as a workaround.
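
A plausible explanation for the black background in the BMP case, offered as an assumption rather than something confirmed in this thread: BMP output normally carries no alpha channel, so fully transparent pixels (usually stored as transparent black) lose their alpha and come out black. A minimal Java sketch of the usual remedy, painting the rendered image onto an opaque white canvas before encoding, could look like this. The class and method names are hypothetical, and the code uses only standard java.awt classes, not Batik or pdfbox-graphics2d.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Batik, PDFBox or openhtmltopdf.
public class AlphaFlatten {

    /** Paints an image that may contain transparency onto an opaque white canvas. */
    public static BufferedImage flattenOnWhite(BufferedImage src) {
        // TYPE_INT_RGB has no alpha channel, so the result is safe for alpha-less formats.
        BufferedImage dst = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // Fill with white first; otherwise the canvas starts out black.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, dst.getWidth(), dst.getHeight());
            // Transparent source pixels leave the white background visible.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    public static void main(String[] args) {
        // A freshly created ARGB image is fully transparent.
        BufferedImage transparent = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        BufferedImage flat = flattenOnWhite(transparent);
        System.out.println(Integer.toHexString(flat.getRGB(0, 0)));
    }
}
```

The flattened image can then be handed to whatever encoder writes the BMP. White is only an example background; any opaque colour that suits the page works the same way.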

Answer 3 · 2018-02-01T15:12:58.000Z

@vipcxj Can you share the SVG which does not work for you with me? I would like to fix this bug (which is in https://github.com/rototor/pdfbox-graphics2d, as there the whole Graphics2d->PDF mapping is happening).

Answer 4 · 2018-02-01T15:34:12.000Z

I will give you the SVG when I go to work tomorrow. It's a watermark.
Update: this is the content of the SVG.

<?xml version="1.0" encoding="UTF-8" ?>
<svg width="512" height="512" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  <style type="text/css">text { fill: gray; font-family: Avenir, Arial, Helvetica, sans-serif; }</style>
  <defs>
    <pattern id="twitterhandle" patternUnits="userSpaceOnUse" width="400" height="200">
      <text y="30" font-size="40" id="name">TEST WATERMARK</text>
    </pattern>
    <pattern xlink:href="#twitterhandle">
      <text y="120" x="200" font-size="30" id="occupation">test watermark</text>
    </pattern>
    <pattern id="combo" xlink:href="#twitterhandle" patternTransform="rotate(-45)">
      <use xlink:href="#name" />
      <use xlink:href="#occupation" />
    </pattern>
  </defs>
  <rect width="100%" height="100%" fill="url(#combo)" />
</svg>

Answer 5 · 2018-02-02T17:53:23.000Z

@vipcxj I've just released pdfbox-graphics2d version 0.11 which fixes this problem. PdfBoxGraphics2D did not handle the PatternPaint of Batik SVG. You can manually depend on this version or wait till it is integrated here.

Answer 6 · 2018-02-03T17:06:12.000Z

I would be delighted to see some improvement to my issue #119 ... the proposed workaround using floating containers works nine times out of ten, but not as perfect as everything else in this amazing project (at least for me).

Regards
Bigdatha

Answer 7 · 2018-03-06T06:47:59.000Z

@dilworks haven't heard of anyone using either Ant or NetBeans in years...

Answer 8 · 2018-03-06T11:14:52.000Z

@achuinard uh... I do. And it's still quite popular here in Latin America.

Not everybody likes Maven or Eclipse, and there is nothing wrong with that.

Answer 9 · 2018-03-20T18:35:53.000Z

Just wondering: has anyone done performance benchmarks? As there are quite a lot of us looking at this project as a long-term replacement for good ol' FS+iText, matching the performance of that should be a goal.

I've only done some quick testing with simple reports (basically tables, nothing fancy), and I've found openhtmltopdf to be as much as 50% slower than FS+iText. I have no clue where the bottlenecks could be (here? in PDFBox?).

Answer 10 · 2018-03-21T06:20:07.000Z

@dilworks This is likely caused by PDFBox or its dependency FontBox. Are you using many custom fonts? FontBox is a little bit slow when parsing fonts...

Answer 11 · 2018-03-21T18:52:24.000Z

Well, my reports are very simple - I'm using the PDF defaults (Times, Helvetica), not even external ones!

Answer 12 · 2018-03-27T08:31:32.000Z

Thanks @dilworks

You inspired me to create a large document and run VisualVM while it was processing. It immediately highlighted a silly bug in the BIDI splitter which is now fixed (above). This was taking well over half the run-time. The next culprit to look at is createInlineBox. Any ideas on why that is so slow?

Before: (image not preserved)

After: (image not preserved)

Embarrassingly, the BIDI splitter should not even run when not configured, which I'll fix in a future commit.

Answer 13 · 2018-03-29T18:24:37.000Z

Is there any reason why this can't run on the modern Google App Engine Standard env Java8? It removes a ton of restrictions from the older java7 environment (no more whitelist of jars, most APIs should work).

I'd be happy to test if nobody has.

Answer 14 · 2018-03-29T18:41:23.000Z

Please test. I asked this days ago and never heard anything from Dan.

Answer 15 · 2018-04-02T15:55:34.000Z

@danfickle Good starting point. I did a few test runs with an 18-page test document (I will try to clean it of any proprietary/private info so I can provide a public sample of the reports I generate) with nothing but default fonts and rather simple tables. I did 25 runs for each converter, measuring times but not resource usage. We rarely generate reports that are hundreds of pages long, so this represents one of my most frequent use cases.
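
The measurement approach described above (repeated timed runs per converter) can be sketched in Java roughly as follows. This is an illustrative harness, not the one used for the attached results: the class name, the warm-up count, and the placeholder workload are assumptions, and a real comparison would put the Flying Saucer or openhtmltopdf rendering call inside the Runnable.

```java
import java.util.Arrays;

// Hypothetical timing harness; names and defaults are illustrative only.
public class ConverterBenchmark {

    /** Runs the task a few times untimed (JIT warm-up), then returns per-run times in nanoseconds. */
    public static long[] timeRuns(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    /** Median of the measured times, converted to milliseconds. */
    public static double medianMillis(long[] nanos) {
        long[] sorted = nanos.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        double median = (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
        return median / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a real HTML-to-PDF conversion.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        long[] times = timeRuns(placeholder, 5, 25);
        System.out.printf("median: %.3f ms over %d runs%n", medianMillis(times), times.length);
    }
}
```

Reporting the median rather than the mean keeps a single slow run (garbage collection, disk activity) from skewing the comparison, and the warm-up runs reduce the effect of JIT compilation on the first measurements.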

Here are the test results:
benchmark_pdfgen.xlsx

So far, I've found the performance gap between FS and OH to be around 30%.

(LOL at GitHub that doesn't support OpenDocument documents!)

Now I'll try with the really heavy hundreds-of-pages CPU-draining workloads :)

Answer 16 · 2018-04-02T16:10:21.000Z

Testcase:
testcase_fs_oh.tar.gz

Forgot to mention my setup: this is my dev laptop (a quite ancient Core 2 Duo P8600 with 6GB DDR2 RAM and 500GB of good ol' spinning rust storage) running both generators inside a J2EE container (WildFly 11.0).

Answer 17 · 2018-04-03T07:46:00.000Z

Thanks @dilworks

That is helpful. When I work on the collapse whitespace function tomorrow, I'll run your test case before and after and see if we can get a good improvement. Could we continue the discussion of performance improvements in #180?

Answer 18 · 2018-04-03T13:44:44.000Z

All right then!

And once again, thanks for improving the library!

Answer 19 · 2018-10-04T12:50:27.000Z

More docs please. More examples.

Great project though, works nicely for me. I would just like to know all the things I can do and, more importantly, can't do.

Answer 20 · 2019-06-13T06:27:14.000Z

Implementing flexbox layout (#69) will be a huge improvement