sprymix/csscompressor

Incorrect removal of whitespace in svg

Opened this issue · 4 comments

rixx commented
In [6]: compress('''.header.bubbles {
   ...:   background-image: url("data:image/svg+xml,%3Csvg width='100' height='100' viewBox='0 0 100 100' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath d='M11 18c3.866 0 7-3.134 7-7s-3.134-7-7-7-7 3.134-7 7 3.134 
   ...: 7 7 7zm48 25c3.866 0 7-3.134 7-7s-3.134-7-7-7-7 3.134-7 7 3.134 7 7 7zm-43-7c1.657 0 3-1.343 3-3s-1.343-3-3-3-3 1.343-3 3 1.343 3 3 3zm63 31c1.657 0 3-1.343 3-3s-1.343-3-3-3-3 1.343-3 3 1.343 3 3 3zM34 90
   ...: c1.657 0 3-1.343 3-3s-1.343-3-3-3-3 1.343-3 3 1.343 3 3 3zm56-76c1.657 0 3-1.343 3-3s-1.343-3-3-3-3 1.343-3 3 1.343 3 3 3zM12 86c2.21 0 4-1.79 4-4s-1.79-4-4-4-4 1.79-4 4 1.79 4 4 4zm28-65c2.21 0 4-1.79 4-
   ...: 4s-1.79-4-4-4-4 1.79-4 4 1.79 4 4 4zm23-11c2.76 0 5-2.24 5-5s-2.24-5-5-5-5 2.24-5 5 2.24 5 5 5zm-6 60c2.21 0 4-1.79 4-4s-1.79-4-4-4-4 1.79-4 4 1.79 4 4 4zm29 22c2.76 0 5-2.24 5-5s-2.24-5-5-5-5 2.24-5 5 2.
   ...: 24 5 5 5zM32 63c2.76 0 5-2.24 5-5s-2.24-5-5-5-5 2.24-5 5 2.24 5 5 5zm57-13c2.76 0 5-2.24 5-5s-2.24-5-5-5-5 2.24-5 5 2.24 5 5 5zm-9-21c1.105 0 2-.895 2-2s-.895-2-2-2-2 .895-2 2 .895 2 2 2zM60 91c1.105 0 2-
   ...: .895 2-2s-.895-2-2-2-2 .895-2 2 .895 2 2 2zM35 41c1.105 0 2-.895 2-2s-.895-2-2-2-2 .895-2 2 .895 2 2 2zM12 60c1.105 0 2-.895 2-2s-.895-2-2-2-2 .895-2 2 .895 2 2 2z' fill='white' fill-opacity='0.1' fill-ru
   ...: le='evenodd'/%3E%3C/svg%3E");
   ...: }
   ...: ''')
Out[6]: '.header.bubbles{background-image:url("data:image/svg+xml,%3Csvgwidth=\'100\'height=\'100\'viewBox=\'00100100\'xmlns=\'http://www.w3.org/2000/svg\'%3E%3Cpathd=\'M1118c3.86607-3.1347-7s-3.134-7-7-7-73.134-773.134777zm4825c3.86607-3.1347-7s-3.134-7-7-7-73.134-773.134777zm-43-7c1.65703-1.3433-3s-1.343-3-3-3-31.343-331.343333zm6331c1.65703-1.3433-3s-1.343-3-3-3-31.343-331.343333zM3490c1.65703-1.3433-3s-1.343-3-3-3-31.343-331.343333zm56-76c1.65703-1.3433-3s-1.343-3-3-3-31.343-331.343333zM1286c2.2104-1.794-4s-1.79-4-4-4-41.79-441.79444zm28-65c2.2104-1.794-4s-1.79-4-4-4-41.79-441.79444zm23-11c2.7605-2.245-5s-2.24-5-5-5-52.24-552.24555zm-660c2.2104-1.794-4s-1.79-4-4-4-41.79-441.79444zm2922c2.7605-2.245-5s-2.24-5-5-5-52.24-552.24555zM3263c2.7605-2.245-5s-2.24-5-5-5-52.24-552.24555zm57-13c2.7605-2.245-5s-2.24-5-5-5-52.24-552.24555zm-9-21c1.10502-.8952-2s-.895-2-2-2-2.895-22.895222zM6091c1.10502-.8952-2s-.895-2-2-2-2.895-22.895222zM3541c1.10502-.8952-2s-.895-2-2-2-2.895-22.895222zM1260c1.10502-.8952-2s-.895-2-2-2-2.895-22.895222z\'fill=\'white\'fill-opacity=\'0.1\'fill-rule=\'evenodd\'/%3E%3C/svg%3E")}'

I realize this is a hacky use case, but it'd be neat if it worked, nevertheless.

Paradoxis commented

For those wondering how to get around this issue, I'm currently using the following workaround:

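import re
from csscompressor import compress as _compress


# Matches url("data:image/svg...svg%3E") and captures the data URL itself.
capture_svg = re.compile(r'url\("(data:image/svg.*?svg%3E)"\)')


def compress(css, **kwargs):
    # Remember the untouched data URLs before compression mangles them.
    svg = capture_svg.findall(css)
    css = _compress(css, **kwargs)
    # The same URLs after compression, with their whitespace stripped.
    errors = capture_svg.findall(css)

    # Both lists are in document order, so pair them up and restore.
    for find, replace in zip(errors, svg):
        css = css.replace(find, replace)

    return css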
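
A variation on the same idea, sketched here under assumptions rather than taken from the thread: instead of repairing the URLs after minification, each SVG data URL can be swapped for a placeholder before the minifier runs and restored afterwards. The helper name `compress_preserving_svg`, the placeholder format, and the `minify` parameter are all hypothetical; `minify` stands for any callable that minifies CSS text, and the sketch assumes it leaves the quoted placeholder untouched.

```python
import re

# Same pattern as the workaround above, but case-insensitive (%3E or %3e)
# and with DOTALL so a data URL spanning several lines is still matched.
SVG_DATA_URL = re.compile(r'url\("(data:image/svg.*?svg%3E)"\)', re.I | re.S)


def compress_preserving_svg(css, minify, **kwargs):
    """Minify css while keeping SVG data URLs exactly as written.

    `minify` is any callable taking CSS text and returning minified CSS
    (an assumption of this sketch, for example csscompressor.compress).
    """
    saved = []

    def stash(match):
        saved.append(match.group(1))
        # The placeholder has no whitespace, so there is nothing to strip.
        return 'url("__SVG_DATA_URL_%d__")' % (len(saved) - 1)

    css = SVG_DATA_URL.sub(stash, css)
    css = minify(css, **kwargs)

    # Put every original data URL back in place of its placeholder.
    for index, data_url in enumerate(saved):
        css = css.replace('__SVG_DATA_URL_%d__' % index, data_url)
    return css
```

Because the original strings are stored before minification, this version does not depend on the mangled URL still matching the pattern afterwards.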

jsma commented

@Paradoxis thanks for the suggested workaround! I've borrowed from it to handle the same issue with rcssmin.

One suggestion: the regular expression only matches the upper-case %3E, but Bootstrap 4 also uses the lower-case %3e, so I made the regex case-insensitive and it now captures all of BS4's data URLs. I also changed the approach to replace spaces with "%20" before passing the CSS off to the backend (again, rcssmin in my case).

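import re
from rcssmin import cssmin  # assumed backend, per the comment above

# [Ee] accepts both %3E and %3e at the end of the SVG.
capture_svg = re.compile(r'url\("(data:image/svg.*?svg%3[Ee])"\)')


def compress(css, **kwargs):
    data_urls = capture_svg.findall(css)
    for data_url in data_urls:
        # Percent-encode spaces so the minifier has no whitespace to strip.
        css = css.replace(data_url, data_url.replace(' ', '%20'))
    css = cssmin(css, **kwargs)
    return css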
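
The encoding step can also be tried on its own, without any minifier installed. The following is a minimal sketch using only the standard library; `encode_svg_spaces` is a hypothetical name, and it uses `re.sub` with a callback so each match is rewritten in place.

```python
import re

# Case-insensitive variant of the pattern above; matches %3E and %3e.
SVG_DATA_URL = re.compile(r'url\("(data:image/svg.*?svg%3E)"\)', re.I)


def encode_svg_spaces(css):
    """Percent-encode spaces inside SVG data URLs, leaving other CSS as-is."""
    def encode(match):
        return 'url("%s")' % match.group(1).replace(' ', '%20')
    return SVG_DATA_URL.sub(encode, css)


sample = "a { background: url(\"data:image/svg+xml,%3csvg width='8'%3e%3c/svg%3e\"); }"
encoded = encode_svg_spaces(sample)
print(encoded)
# a { background: url("data:image/svg+xml,%3csvg%20width='8'%3e%3c/svg%3e"); }
```

Only the space inside the data URL is encoded; the spaces in the surrounding rule are left for the minifier to remove.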

ysard commented

Hi, I propose another, "cleaner" resolution via monkey patching.

The line responsible for modifying the content of url(...) is here:

if remove_ws:
    token = _ws_re.sub('', token)

In the function:
def _preserve_call_tokens(css, regexp, preserved_tokens, remove_ws=False):

In _compress, the calls of this function are here:

preserved_tokens = []
css = _preserve_call_tokens(css, _url_re, preserved_tokens, remove_ws=True)
css = _preserve_call_tokens(css, _calc_re, preserved_tokens, remove_ws=False)
css = _preserve_call_tokens(css, _hsl_re, preserved_tokens, remove_ws=True)

The first call is what we are interested in because it handles the url pattern:

_url_re = re.compile(r'''(url)\s*\(\s*(['"]?)data\:''', re.I)

...but it wrongly uses the keyword parameter remove_ws=True.
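
To illustrate the effect, here is a minimal sketch of what stripping whitespace does to a data URL token. The pattern `ws_re` is a stand-in that assumes `_ws_re` matches runs of whitespace; the actual definition in csscompressor is not quoted in this thread.

```python
import re

# Stand-in for csscompressor's _ws_re (assumption: it matches whitespace runs).
ws_re = re.compile(r'\s+')

token = "data:image/svg+xml,%3Csvg width='100' height='100' viewBox='0 0 100 100'%3E"

# What the remove_ws=True branch effectively does to the token.
stripped = ws_re.sub('', token)

print(stripped)
# data:image/svg+xml,%3Csvgwidth='100'height='100'viewBox='00100100'%3E
```

The attribute separators and the viewBox numbers run together, which matches the broken output reported at the top of this issue.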

I wonder how the url pattern content can be optimized anyway?

The solution here is to wrap the function _preserve_call_tokens and dynamically re-inject it into the module before doing anything else (monkey patch):

import csscompressor

# Monkey patch
_preserve_call_tokens_original = csscompressor._preserve_call_tokens
_url_re = csscompressor._url_re


def my_new_preserve_call_tokens(*args, **kwargs):
    """Patch the keyword for url pattern handling in csscompressor"""
    if _url_re == args[1]:
        kwargs["remove_ws"] = False
    return _preserve_call_tokens_original(*args, **kwargs)

csscompressor._preserve_call_tokens = my_new_preserve_call_tokens


print("original func", id(_preserve_call_tokens_original))
print("modified func", id(my_new_preserve_call_tokens))
assert csscompressor._preserve_call_tokens == my_new_preserve_call_tokens

Another solution, much less maintainable and readable, is to monkey patch the abstract syntax tree of the function:

import ast
import inspect
# Get source code of _compress
source_func = inspect.getsource(csscompressor._compress)
# Parse the source code into syntax tree
ast_func = ast.parse(source_func)
# Patch the keyword
ast_func.body[0].body[3].value.keywords[0].value.value = False
# Get python bytecode
bytecode = compile(ast_func, '<string>', 'exec')
# Inject bytecode in the module
exec(bytecode, csscompressor.__dict__)
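
The hard-coded path `body[0].body[3].value.keywords[0]` depends on the exact statement order inside `_compress`. A less position-dependent sketch walks the tree and looks for the call by name instead. `force_keep_url_whitespace` is a hypothetical helper, and the sample source below is a simplified stand-in for `_compress`, not the real function.

```python
import ast

SAMPLE_SOURCE = '''
def _compress(css):
    preserved_tokens = []
    css = _preserve_call_tokens(css, _url_re, preserved_tokens, remove_ws=True)
    css = _preserve_call_tokens(css, _hsl_re, preserved_tokens, remove_ws=True)
    return css
'''


def force_keep_url_whitespace(source):
    """Return a syntax tree where the _url_re call passes remove_ws=False."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == '_preserve_call_tokens'):
            continue
        # The regexp is the second positional argument of the call.
        if len(node.args) < 2 or not isinstance(node.args[1], ast.Name):
            continue
        if node.args[1].id != '_url_re':
            continue
        for keyword in node.keywords:
            if keyword.arg == 'remove_ws':
                keyword.value = ast.Constant(value=False)
    # New nodes need line numbers before the tree can be compiled.
    return ast.fix_missing_locations(tree)


# Run the patched stand-in with a recording fake to check the result.
recorded = []


def _record(css, regexp, preserved_tokens, remove_ws=False):
    recorded.append((regexp, remove_ws))
    return css


namespace = {'_preserve_call_tokens': _record, '_url_re': 'url', '_hsl_re': 'hsl'}
patched_tree = force_keep_url_whitespace(SAMPLE_SOURCE)
exec(compile(patched_tree, '<patched>', 'exec'), namespace)
namespace['_compress']('a{}')
print(recorded)  # [('url', False), ('hsl', True)]
```

Only the call that passes `_url_re` is changed; the `_hsl_re` call keeps `remove_ws=True`.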

@ysard Thanks for the solution!