ktdreyer/jenkins-job-wrecker

Indentation is missing during conversion

Closed this issue · 4 comments

Hello,

We have realized that the indentation of multiline values have been missing due to

def str_presenter(dumper, data):
if len(data.splitlines()) > 1: # check for multiline string
# The dumper will not respect "style='|'" if it detects trailing
# whitespace on any line within the data. For scripts the trailing
# whitespace is not important.
lines = [l.strip() for l in data.splitlines()]
data = '\n'.join(lines)
return dumper.represent_scalar('tag:yaml.org,2002:str', data,
style='|')
return dumper.represent_scalar('tag:yaml.org,2002:str', data)

As it strips the lines, therefore deletes the whitespaces in the beginning and end, which is an unintended situation especially on scripts. For example:

<shell>if [[ test.txt ]]; then
    echo "hello"
fi</shell>

Should be converted as:

shell: |
    if [[ test.txt ]]; then
        echo "hello"
    fi

But it's converted as:

shell: |
    if [[ test.txt ]]; then
    echo "hello"
    fi

I have realized that the problem is actually not caused by trailing whitespaces, it's caused by \r (carriage return) which is &#xd; in XML correspondence. If &#xd; is removed from the XML, it works fine with

def str_presenter(dumper, data):
  if len(data.splitlines()) > 1:  # check for multiline string
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
  return dumper.represent_scalar('tag:yaml.org,2002:str', data)

The only part that I have seen a carriage-return appearing so far is in description. Instead of stripping all the lines which would break indentation for any kind of multiline value, I believe that we should do this only on description or remove all the &#xd; values before starting to parse the XML file. AFAIK, both ways do the work.

There may be a better way for solving this problem and that is why I am opening this ticket. If you have any ideas, please leave a comment.

The comment in the code says that it is trying to avoid trailing whitespace, but we're calling plain strip() here, which removes both leading and trailing whitespaces. What happens when you change that to rstrip()?

The comment in the code says that it is trying to avoid trailing whitespace, but we're calling plain strip() here, which removes both leading and trailing whitespaces. What happens when you change that to rstrip()?

Sorry for the late reply. While I was investigating, I thought that rstrip did not do the job, but now I testing it again, It seems it works just fine! I will open a PR. Thank you so much for the suggestion!

Cheers!

Edit: here it is, #114

Cool, thanks! Are you ok with closing this PR issue, in that case?

I will open a new PR, added a test. #116