typst/svg2pdf

Make converted SVGs use up less space

LaurenzV opened this issue · 0 comments

Currently, for larger SVGs, svg2pdf often produces PDFs that take up much more space than they should.

Consider the following SVG:
[image: test2]

The SVG itself is 770KB. Once converted with svg2pdf, the resulting PDF takes about 568KB. However, cloud converters manage to get it down to around 220KB.

The reason for that is twofold: On the one hand, we create far too many XObjects. Since each XObject stream is compressed separately, we can't make full use of Deflate compression. On the other hand, we sometimes create XObjects that are completely empty. Here is an example from the SVG above:

12 0 obj
<<
  /Length 3
  /Type /XObject
  /Subtype /Form
  /Resources <<
    /ColorSpace <<
      /srgb [/CalRGB <<
        /WhitePoint [0.9505 1 1.0888]
        /Gamma [2.2 2.2 2.2]
        /Matrix [0.4124 0.2126 0.0193 0.3576 0.715 0.1192 0.1805 0.0722 0.9505]
      >>]
    >>
    /ProcSet [/PDF /ImageC /ImageB]
  >>
  /Group <<
    /Type /Group
    /S /Transparency
    /I true
    /K false
    /CS [/CalRGB <<
      /WhitePoint [0.9505 1 1.0888]
      /Gamma [2.2 2.2 2.2]
      /Matrix [0.4124 0.2126 0.0193 0.3576 0.715 0.1192 0.1805 0.0722 0.9505]
    >>]
  >>
  /BBox [46.428413 367 545.4284 966]
  /Matrix [0.18334149 0 0 0.18334149 163.40727 -242.72272]
>>
stream
q
Q
endstream
endobj

Something like this could be avoided if we checked whether a usvg::Group actually contains paths/children before creating an XObject for it. However, the main issue is still that we have so many XObjects, all of which carry the Group and ColorSpace entries, which bloats the PDF unnecessarily.
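To illustrate the emptiness check, here is a minimal sketch. The `Node` enum and `has_drawable_content` are hypothetical stand-ins made up for this example; they are not the usvg API, and the real check would have to walk whatever tree type svg2pdf receives.

```rust
/// Hypothetical stand-in for a render tree node (NOT the usvg API).
#[derive(Debug)]
enum Node {
    /// A path with some number of drawing segments.
    Path { segments: usize },
    /// A group holding child nodes.
    Group(Vec<Node>),
}

/// Returns true if rendering `node` would emit at least one drawing
/// operation. A group counts as drawable only if some descendant is a
/// non-empty path, so nested empty groups are rejected as well.
fn has_drawable_content(node: &Node) -> bool {
    match node {
        Node::Path { segments } => *segments > 0,
        Node::Group(children) => children.iter().any(has_drawable_content),
    }
}
```

With a check like this, the converter could skip allocating an XObject whenever `has_drawable_content` returns false, which would avoid objects whose stream is just `q Q`.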

Some things we could do to improve this:

  • Improvements in pdf-writer: I think there are a couple of improvements we could implement in pdf-writer to make files smaller. For example, just by removing the indent feature, I was able to get the file down from 568KB to 526KB. The PDF looks a bit uglier this way when opened in a text editor, but this shouldn't really be a concern, since PDFs are not meant to be read by humans anyway.
  • Before creating an XObject, check whether it actually contains paths that need to be drawn.
  • Only add the ColorSpace and Transparency Group attributes to XObjects that actually need them. This will require some thinking, but it will reduce the space requirements by a lot.
  • Avoid allocating so many XObjects. This will be the most difficult part to implement, as it will require us to keep track of some transformations in svg2pdf and to rely mainly on the cm operator to apply transformations instead of on the Matrix attribute of XObjects. But this will make everything much more space efficient. And if we manage to implement this, the other three points above could probably be ignored completely, as they won't have such a big effect anymore.
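The last point could look roughly like the sketch below. `Transform` and `write_group` are hypothetical names invented for this example and are not part of svg2pdf or pdf-writer. The sketch also assumes the group needs no isolation (no opacity, mask, or blend mode), since only then is inlining it equivalent to drawing it through a transparency group XObject.

```rust
/// A 2D affine transform in PDF operand order: [a b c d e f].
/// Hypothetical type for this sketch only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Transform {
    a: f32,
    b: f32,
    c: f32,
    d: f32,
    e: f32,
    f: f32,
}

impl Transform {
    const IDENTITY: Transform =
        Transform { a: 1.0, b: 0.0, c: 0.0, d: 1.0, e: 0.0, f: 0.0 };

    fn is_identity(&self) -> bool {
        *self == Self::IDENTITY
    }
}

/// Writes a group's content directly into the parent content stream.
/// The transform is applied with the `cm` operator between `q` and `Q`
/// instead of being stored in the /Matrix entry of a separate XObject.
/// Empty groups produce no output at all.
fn write_group(out: &mut String, ts: Transform, body: &str) {
    if body.is_empty() {
        return;
    }
    out.push_str("q\n");
    if !ts.is_identity() {
        out.push_str(&format!(
            "{} {} {} {} {} {} cm\n",
            ts.a, ts.b, ts.c, ts.d, ts.e, ts.f
        ));
    }
    out.push_str(body);
    out.push_str("\nQ\n");
}
```

Because everything ends up in a single stream, Deflate can exploit repetition across groups, and the per-object Resources and Group dictionaries are no longer repeated.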