Function to find and remove ASCII art from block comments is too finicky
shawnhyam opened this issue · 4 comments
The code has trouble removing the leading * characters from block comments. This can be seen fairly easily with the UseTripleSlashForDocumentationComments rule enabled, but it also impacts the ValidateDocumentationComments rule, since that rule depends on removing the ASCII art characters.
Problem 1
If the first line of the block comment has anything other than /** on it, including trailing whitespace, the ASCII art is not removed. Examples:
/** start of my block comment,
* this will be a problem
*/
turns into
/// start of my block comment,
/// * this will be a problem
/**
* the line above has trailing whitespace, that's also an issue
*/
turns into
///
/// * the line above has trailing whitespace, that's also an issue
Problem 2
If the ASCII art characters are indented any amount other than a single space, they are not removed. Examples:
/**
  * this won't work, 2 leading spaces
*/
turns into
/// * this won't work, 2 leading spaces
/**
* this won't work, 0 leading spaces
*/
turns into
/// * this won't work, 0 leading spaces
Problem 3
If the comment block is closed with more than 1 asterisk, the extras won't be removed. Example:
/**
* so far so good
**/
turns into
/// so far so good
/// *
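For illustration only, the more lenient stripping this report expects could look roughly like the sketch below. `lenientDocLines` is a hypothetical helper written for this example; it is not swift-format code, and it does not try to match how the Swift compiler parses block comments.

```swift
/// Hypothetical lenient stripper (an assumption for illustration, NOT
/// swift-format's implementation): removes the `/**` and `*/` delimiters and
/// any leading asterisks, regardless of indentation or trailing whitespace.
func lenientDocLines(fromBlockComment comment: String) -> [String] {
    var body = Substring(comment)
    // Drop the opening and closing delimiters if present.
    if body.hasPrefix("/**") { body = body.dropFirst(3) }
    if body.hasSuffix("*/") { body = body.dropLast(2) }

    var lines: [String] = []
    for rawLine in body.split(separator: "\n", omittingEmptySubsequences: false) {
        // Skip indentation of any width (covers Problem 2).
        var line = rawLine.drop(while: { $0 == " " || $0 == "\t" })
        // Drop a run of leading asterisks and at most one space after it
        // (the run also absorbs the extra asterisk in a `**/` closer, Problem 3).
        if line.hasPrefix("*") {
            line = line.drop(while: { $0 == "*" })
            if line.hasPrefix(" ") { line = line.dropFirst() }
        }
        // Trim trailing whitespace (covers the second case of Problem 1).
        while let last = line.last, last == " " || last == "\t" {
            line = line.dropLast()
        }
        lines.append(String(line))
    }
    // The delimiter lines are usually empty once stripped; drop them.
    if lines.first == "" { lines.removeFirst() }
    if lines.last == "" { lines.removeLast() }
    return lines
}

let example = "/** start of my block comment,\n * this will be a problem\n */"
for line in lenientDocLines(fromBlockComment: example) {
    print("/// \(line)")
}
```

Note that a rule this permissive cannot tell a decorative asterisk from a Markdown bullet or the start of `**bold**` text, so it can change the meaning of a comment that carries no ASCII art at all.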
Synced to Apple’s issue tracker as rdar://128613075
FWIW, my implementation is meant to match libMarkup in the Swift compiler as precisely as possible. If you put the first comment into Xcode and open up the doc popup, you'll see that the compiler parses it the same way (it has the brief summary, and then a bulleted list with one element):
/** start of my block comment,
* this will be a problem
*/
var x: Int
So the behavior of most of these is working as intended; we want to parse them exactly as the compiler would because changing the format of the comment from a doc block comment to a doc line comment should not alter how the Swift compiler would parse it structurally.
If you find any cases where swift-format's ASCII art extraction specifically works differently than the Swift compiler itself, then please report those; those would not be intentional.
I see, that makes sense. I was a bit thrown off because some of the samples in CommentTests.swift are formatted in these non-conforming ways. My plan was to use your implementation to get the raw text to send through the Markdown formatter; does this seem like the right way to go in your opinion?
I think a better approach would be to parse each comment using DocumentationComment as soon as you can and traffic that type throughout, since it already handles that. That type extracts information about the comments into a structured representation. You'd need to add an API to construct a new swift-markdown Document from those parts again, but that gives us more power to do things like enforce ordering of Parameter(s) vs Returns vs Throws sections as part of the formatting operation, which we can't do if we just send the text directly through a reflowing pass.
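To make the ordering point concrete, here is a minimal sketch of rendering from a structured representation. `ParsedDocComment` and `renderLineComment` are hypothetical stand-ins invented for this example; they are not the DocumentationComment type or any swift-markdown API, and the section order used (parameters, then returns, then throws) is just an assumption for the sketch.

```swift
/// Hypothetical, simplified stand-in for a parsed doc comment. The field
/// names are assumptions for this sketch, not swift-format's actual API.
struct ParsedDocComment {
    var summary: String
    var parameters: [(name: String, description: String)] = []
    var returns: String? = nil
    var throwsDescription: String? = nil
}

/// Renders the parts as `///` lines in one fixed order, no matter how the
/// sections were ordered in the original comment.
func renderLineComment(_ doc: ParsedDocComment) -> [String] {
    var lines = ["/// \(doc.summary)"]
    let hasSections = !doc.parameters.isEmpty
        || doc.returns != nil
        || doc.throwsDescription != nil
    // Separate the summary from the sections with an empty doc line.
    if hasSections { lines.append("///") }

    if doc.parameters.count == 1 {
        let parameter = doc.parameters[0]
        lines.append("/// - Parameter \(parameter.name): \(parameter.description)")
    } else if !doc.parameters.isEmpty {
        lines.append("/// - Parameters:")
        for parameter in doc.parameters {
            lines.append("///   - \(parameter.name): \(parameter.description)")
        }
    }
    if let returns = doc.returns {
        lines.append("/// - Returns: \(returns)")
    }
    if let throwsDescription = doc.throwsDescription {
        lines.append("/// - Throws: \(throwsDescription)")
    }
    return lines
}

let doc = ParsedDocComment(
    summary: "Adds two numbers.",
    parameters: [(name: "a", description: "The first number.")],
    returns: "The sum.",
    throwsDescription: "An error on overflow."
)
renderLineComment(doc).forEach { print($0) }
```

Because the renderer works from fields rather than from the original text, the section order is decided in one place, which a pass that only reflows text has no way to do.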