datadesk/django-softhyphen

Stripping whitespace should not be default


Why does django-softhyphen always strip whitespace? In my case, the paragraph I'm hyphenating has an <em> tag around its first word, with a space either at the end of the text inside the <em> tag or right after the closing </em> tag. That space gets stripped out.

This is an example of what the text looks like with the space inside the <em> tag:

<p><em>Test. </em>This is a test paragraph.</p>

This is an example of what the text looks like with the space after the closing </em> tag:

<p><em>Test.</em> This is a test paragraph.</p>

The result after hyphenation in either case is:

<p><em>Test.</em>This is a test paragraph.</p>

I can fix this locally by changing STRIP_WHITESPACE.sub(...) to re.sub(...), but it would be nice to be able to choose whether whitespace is stripped. Is there a reason it is always stripped? Could stripping stay the default, with an option to override it?
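
To make the request concrete, here is a minimal sketch of the kind of opt-out being asked for. It is not the library's code: the regex is a hypothetical stand-in for whatever STRIP_WHITESPACE actually matches, and the `postprocess` function and its `strip_whitespace` parameter are invented names used only for illustration.

```python
import re

# ASSUMPTION: a stand-in pattern, not the real STRIP_WHITESPACE from
# django-softhyphen. It removes whitespace directly before or after a tag,
# which is enough to reproduce the behavior described above.
STRIP_WHITESPACE = re.compile(r"\s+(?=<)|(?<=>)\s+")


def postprocess(html, strip_whitespace=True):
    """Hypothetical helper: strip tag-adjacent whitespace unless opted out."""
    if strip_whitespace:
        return STRIP_WHITESPACE.sub("", html)
    return html


space_inside = "<p><em>Test. </em>This is a test paragraph.</p>"
space_after = "<p><em>Test.</em> This is a test paragraph.</p>"

# Default behavior: both inputs lose the space next to the </em> tag.
print(postprocess(space_inside))
print(postprocess(space_after))

# Opt-out: the markup is returned unchanged.
print(postprocess(space_after, strip_whitespace=False))
```

Keeping `strip_whitespace=True` as the default would preserve the current behavior for existing users, while callers with inline tags such as <em> could pass `False` to keep their spacing.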