Redocly/redocly-cli

join should allow longer lines and retain original quotes

steve-nay-sage opened this issue · 4 comments

Describe the bug

For some mysterious reason, the join command seems to hate long lines so it splits them into multiple shorter lines and inserts >- to indicate how newlines should be handled. Plus it strips off needed quotation marks. Here is an example of a "long" line before the join command:

  - name: Terms
    description: An Accounts Payable term is a rule that a vendor establishes for extending credit to your company. Terms can be associated with transactions or with specific vendors.

and after:

  - name: Terms
    description: >-
      An Accounts Payable term is a rule that a vendor establishes for extending
      credit to your company. Terms can be associated with transactions or with
      specific vendors.

You'd be right if you thought, "That original line doesn't look very long at all!" Yes, there is no reason that I can think of for join to do anything to that line.

I'm opening this bug because turning single lines into multiple lines can cause problems for some other tools that need to parse the files produced by join. Also, when this happens with keywords other than description it can cause the schema to be invalid. For example:

    $ref: >-
      #/components/schemas/cash-management-payment-provider-bank-account

The reference needs to be in quotes. It was in quotes in the original file, but join stripped them out when it thought the line was too long. Other (shorter) references still have their quotes after join.

To Reproduce

Run redocly join on a couple of OpenAPI YAML files that have some lines that are longer than 80 characters.

Expected behavior

For long lines, the output should be exactly what the input was. There was nothing wrong with the original lines, and there is no benefit to breaking them into shorter lines. There is certainly no benefit to removing needed quotation marks-- I'd call that a bug.

Redocly Version(s)

1.10.5

Node.js Version(s)

18.7.0

Additional context

The join command is incredibly useful-- thank you! Just need to fix the little bugs that show up.

> For long lines, the output should be exactly what the input was.

Unfortunately, that's not possible. There is only one good YAML parser for JS out there, and it doesn't preserve the original formatting (it also loses comments, quotes, string styles, etc.). Adding support for this is complex (I already tried), as the YAML spec is pretty complex and the parser is sophisticated.

There is another parser that works at the AST level, but using it would mean implementing join over the AST rather than over the data object, which is much harder too.

Why would you edit the result of the join command? Just to clarify: despite the differences in formatting, the resulting file will parse to exactly the same data structure. You can try formatting the output with prettier; maybe it produces a better result.

Thanks @steve-nay-sage, this is good info! Like @RomanHotsiy says, keeping things identical is difficult because we're parsing it in, making a new structure, and writing it back out. We should only be adjusting the lines for markdown fields though; that $ref example doesn't look correct. I think the quotes could do with another look as well.

Why would I edit the results of the join command? In this case I needed to add the quotes back that join (or the parser) removed from $ref lines that it thought were too long. So I modified my script that was calling join to do some cleanup after join finished.
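The actual cleanup script isn't shown in this thread, but a post-processing step along these lines could look like the sketch below. The function name `requote_folded_refs` and the regular expression are hypothetical, and the sketch assumes every folded `$ref` value sits on a single continuation line (references contain no spaces, so the folder has nowhere to break them) and contains no double quotes.

```python
import re

# Hypothetical pattern: a `$ref: >-` line followed by one more-indented line
# that holds the whole reference. Does not handle `- $ref: >-` list items.
FOLDED_REF = re.compile(
    r"^(?P<indent>[ \t]*)\$ref:[ \t]*>-[ \t]*\n[ \t]+(?P<target>\S+)[ \t]*$",
    re.MULTILINE,
)


def requote_folded_refs(yaml_text: str) -> str:
    """Rewrite single-line folded `$ref` values as quoted one-line strings."""
    return FOLDED_REF.sub(
        lambda m: f'{m["indent"]}$ref: "{m["target"]}"',
        yaml_text,
    )


joined = (
    "    $ref: >-\n"
    "      #/components/schemas/cash-management-payment-provider-bank-account\n"
)
print(requote_folded_refs(joined), end="")
```

Because this works on raw text rather than parsed YAML, it is only reasonable as a stopgap for downstream tools that also read the file as raw text.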

Oh, I see, but I'm a bit confused.

The result of parsing this YAML:

$ref: >-
  #/components/schemas/cash-management-payment-provider-bank-account

is exactly the same as the result of parsing this YAML:

$ref: "#/components/schemas/cash-management-payment-provider-bank-account"

That's according to the YAML spec, and I just tried a few online YAML-to-JSON converters to confirm it.
Which tooling do you use? Is it some tooling that doesn't parse YAML but works on raw strings?
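To illustrate why the two forms are equivalent, here is a minimal sketch of the folding rule for a `>-` block scalar. It is not a YAML parser: the hypothetical helper `fold_block_scalar` only covers the simple case seen in this issue (equally indented, non-empty content lines), where each line break folds into a single space and the `-` indicator drops the trailing newline. Inside a block scalar, `#` is ordinary content rather than the start of a comment, which is why the folded form needs no quotes.

```python
def fold_block_scalar(lines):
    """Simplified `>-` folding: only equally indented, non-empty lines."""
    # Line breaks between content lines become single spaces;
    # the `-` chomping indicator removes the final newline.
    return " ".join(line.strip() for line in lines)


# The folded $ref from this issue versus its quoted one-line form.
folded_ref = fold_block_scalar([
    "  #/components/schemas/cash-management-payment-provider-bank-account",
])
quoted_ref = "#/components/schemas/cash-management-payment-provider-bank-account"
assert folded_ref == quoted_ref

# The folded description versus the original single line.
folded_description = fold_block_scalar([
    "      An Accounts Payable term is a rule that a vendor establishes for extending",
    "      credit to your company. Terms can be associated with transactions or with",
    "      specific vendors.",
])
original_description = (
    "An Accounts Payable term is a rule that a vendor establishes for extending "
    "credit to your company. Terms can be associated with transactions or with "
    "specific vendors."
)
assert folded_description == original_description
```

A tool that reads the file line by line instead of parsing it would see `>-` as the value of `$ref`, which may explain the breakage described above.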