Make converted SVGs use up less space
LaurenzV opened this issue · 0 comments
Currently, for larger SVGs, svg2pdf will often create PDFs that take up much more space than they should.
The SVG itself is 770KB. Once converted with svg2pdf, it takes about 568KB of space. However, cloud converters manage to get it down to around 220KB.
The reason for that is twofold: On the one hand, we create way too many XObjects, meaning that we can't make full use of Deflate compression. On the other hand, we sometimes create XObjects that are completely empty. Here is an example from the SVG above:
12 0 obj
<<
/Length 3
/Type /XObject
/Subtype /Form
/Resources <<
/ColorSpace <<
/srgb [/CalRGB <<
/WhitePoint [0.9505 1 1.0888]
/Gamma [2.2 2.2 2.2]
/Matrix [0.4124 0.2126 0.0193 0.3576 0.715 0.1192 0.1805 0.0722 0.9505]
>>]
>>
/ProcSet [/PDF /ImageC /ImageB]
>>
/Group <<
/Type /Group
/S /Transparency
/I true
/K false
/CS [/CalRGB <<
/WhitePoint [0.9505 1 1.0888]
/Gamma [2.2 2.2 2.2]
/Matrix [0.4124 0.2126 0.0193 0.3576 0.715 0.1192 0.1805 0.0722 0.9505]
>>]
>>
/BBox [46.428413 367 545.4284 966]
/Matrix [0.18334149 0 0 0.18334149 163.40727 -242.72272]
>>
stream
q
Q
endstream
endobj
Something like this could be avoided if we checked whether a usvg::Group actually contains paths/children before creating an XObject for it. However, the main issue is still that we have so many XObjects, all of which carry the Group and ColorSpace entries, which bloats the PDF unnecessarily.
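As a rough illustration of the emptiness check, here is a minimal sketch. The `Node` enum and `has_drawable_content` are hypothetical stand-ins invented for this example, not the real usvg types or anything that exists in svg2pdf. The sketch also assumes that a group only produces output through its descendant paths, which ignores cases such as filters or images.

```rust
/// Hypothetical, simplified stand-in for an SVG tree node.
/// This is NOT the real `usvg` API, just enough structure for the sketch.
enum Node {
    /// A group that only contains other nodes.
    Group(Vec<Node>),
    /// A path with some number of segments; zero segments draws nothing.
    Path { segments: usize },
}

/// Returns `true` if rendering this node would emit at least one drawing
/// operation, i.e. if it contains a non-empty path somewhere below it.
fn has_drawable_content(node: &Node) -> bool {
    match node {
        Node::Path { segments } => *segments > 0,
        // A group is only worth an XObject if any descendant draws something.
        Node::Group(children) => children.iter().any(has_drawable_content),
    }
}
```

With a check like this, a group whose subtree contains no non-empty path (as in the `q`/`Q`-only stream above) would be skipped before any XObject is allocated for it.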
Some things we could do to improve this:
- Improvements in pdf-writer: I think there are a couple of improvements we could implement in pdf-writer to make files smaller. For example, just by removing the indent feature, I was able to get the file down from 568KB to 526KB. The PDF looks a bit uglier when opened in a text editor, but this shouldn't really be a concern, since PDFs are not meant to be read by humans anyway.
- Before creating an XObject, check whether it actually contains paths that need to be drawn.
- Only add the ColorSpace and Transparency Group attributes to XObjects that actually need them. This will require some thinking, but it will reduce the space requirements by a lot.
- Avoid allocating so many XObjects. This will be the most difficult part to implement, as it will require us to keep track of some transformations in svg2pdf and to rely mainly on the cm operator to apply transformations instead of the Matrix attribute of XObjects. But this will make everything much more space efficient. And if we manage to implement this, the other points mentioned above could probably be ignored completely, as they won't have such a big effect anymore.
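To make the last point more concrete, here is a minimal sketch of inlining a group into its parent's content stream with `q` / `cm` / `Q` instead of giving it its own Form XObject with a Matrix entry. The `Transform` and `Node` types and the `write_node` function are hypothetical and exist only for this example; they are not the usvg or pdf-writer API. The sketch assumes groups carry nothing but a transform. Groups with opacity, masks or clip paths would likely still need separate handling.

```rust
/// Affine transform in PDF operand order `a b c d e f`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Transform {
    a: f32,
    b: f32,
    c: f32,
    d: f32,
    e: f32,
    f: f32,
}

impl Transform {
    const IDENTITY: Transform =
        Transform { a: 1.0, b: 0.0, c: 0.0, d: 1.0, e: 0.0, f: 0.0 };
}

/// Hypothetical, simplified tree; not the real `usvg` types.
enum Node {
    Group { transform: Transform, children: Vec<Node> },
    /// Already-encoded path operators, e.g. "0 0 5 5 re f".
    Path(String),
}

/// Writes a node directly into the parent's content stream.
/// Transformed groups are wrapped in `q ... cm ... Q`, so the transform
/// only affects their children; identity groups add no operators at all.
fn write_node(node: &Node, out: &mut String) {
    match node {
        Node::Path(ops) => {
            out.push_str(ops);
            out.push('\n');
        }
        Node::Group { transform, children } => {
            let needs_cm = *transform != Transform::IDENTITY;
            if needs_cm {
                let t = transform;
                // Save the graphics state, then concatenate onto the CTM.
                out.push_str(&format!(
                    "q\n{} {} {} {} {} {} cm\n",
                    t.a, t.b, t.c, t.d, t.e, t.f
                ));
            }
            for child in children {
                write_node(child, out);
            }
            if needs_cm {
                // Restore the graphics state saved by the matching `q`.
                out.push_str("Q\n");
            }
        }
    }
}
```

Because everything ends up in one content stream, Deflate sees a single long input rather than many tiny ones, and the per-XObject dictionary (Resources, Group, BBox, Matrix) is not repeated for each group.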