CenterForOpenScience/pydocx

Issue with space when converting to html


We ran into the following case when converting a document to HTML. Take
missing_spacing_example.docx as an example. Converting it to HTML produces:

<span class="pydocx-center"><strong>Hourly</strong><strong>Sales</strong><strong>Survey</strong>

So, at this point all the spaces between the words are gone. There are cases where we need those spaces (especially when a sentence also contains parentheses).

The issue seems to be that <w:t xml:space="preserve"> </w:t> is being ignored in the output.

Is there a way to keep those spaces in the HTML?
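
For reference, here is a minimal sketch of how such a whitespace-only run looks in the document XML and how it can be inspected with Python's standard library. The fragment is a hypothetical illustration, not extracted from missing_spacing_example.docx, and the helper name run_texts is made up for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace and the reserved XML namespace (for xml:space).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical paragraph: two bold runs separated by a run that holds
# only a space. This is an assumption about what the file might contain.
FRAGMENT = (
    '<w:p xmlns:w="%s">'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>Hourly</w:t></w:r>'
    '<w:r><w:t xml:space="preserve"> </w:t></w:r>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>Sales</w:t></w:r>'
    '</w:p>' % W_NS
)


def run_texts(xml_source):
    """Return (text, has_preserve_flag) for every w:t element, in order."""
    root = ET.fromstring(xml_source)
    result = []
    for t in root.iter("{%s}t" % W_NS):
        preserve = t.get("{%s}space" % XML_NS) == "preserve"
        result.append((t.text or "", preserve))
    return result


if __name__ == "__main__":
    for text, preserve in run_texts(FRAGMENT):
        print(repr(text), preserve)
```

In this sketch the space lives in its own run without bold formatting, so a converter that keeps it would naturally emit it outside any <strong> element.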

Thx.

Hi Chirica,

What version of pydocx are you using?

I converted the file you provided using the current version (0.9.5) and this was the result I got:

<body><p><span class="pydocx-center"><strong>Hourly</strong> <strong>Sales</strong> <strong>Survey</strong></span></p></body>

The spaces are preserved.

-Kyle

Actually, you are right. The spaces are present between the <strong> tags. I thought the spaces should be inside the strong tags. In that case I guess this issue is invalid. Thanks for your help.
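
A quick way to confirm this is to compare the text content of the two HTML snippets from this thread. The sketch below uses only the standard library; the names TextCollector and visible_text are made up for this example and are not part of pydocx.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the character data found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(html):
    """Return the concatenated text content of an HTML snippet."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()  # flush any buffered character data
    return "".join(parser.chunks)


# Snippet as quoted in the original report (no spaces between the tags).
REPORTED = (
    '<span class="pydocx-center"><strong>Hourly</strong>'
    '<strong>Sales</strong><strong>Survey</strong>'
)

# Snippet produced by pydocx 0.9.5 according to the reply above.
CONVERTED = (
    '<body><p><span class="pydocx-center"><strong>Hourly</strong> '
    '<strong>Sales</strong> <strong>Survey</strong></span></p></body>'
)

if __name__ == "__main__":
    print(repr(visible_text(REPORTED)))
    print(repr(visible_text(CONVERTED)))
```

The converted output keeps the word breaks because the spaces are text nodes between the <strong> elements, even though none of them sit inside a <strong> element.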

You're welcome!