swiftlang/swift-syntax

Incorrect value of trimmedByteRange in certain case.


Description

trimmedByteRange returns an incorrect value when the source contains ❤️. I tried another emoji and there was no problem.

Steps to Reproduce

//Bug
let str1 = """
❤️


func foo(str: String) {
	
	

}
"""

//Bug
let str2 = """
❤️❤️❤️❤️❤️


func foo(str: String) {
	
	

}
"""

//Normal
let str3 = """
a❤️❤️❤️❤️❤️


func foo(str: String) {
	
	

}
"""

//Normal
let str4 = """



func foo(str: String) {
	
	

}
"""

public override func visit(_ node: FunctionDeclSyntax) -> SyntaxVisitorContinueKind {

	let offset = node.trimmedByteRange.offset // -> 0 for str1 and str2 (Bug)
	
	return .visitChildren
}

Tracked in Apple’s issue tracker as rdar://114755239

The following response is under the assumption that you are expecting trimmedByteRange to strip away the hearts. If you’re expecting something else, please re-open the issue and let me know.

This behaves correctly. From the doc comment of trimmedByteRange:

The byte source range of this node excluding leading and trailing trivia.

But if you print the tree, you’ll see that the hearts aren’t trivia but unexpected nodes within the tree. Since trimmedByteRange doesn’t remove unexpected tokens, this behavior is expected.

FunctionDeclSyntax
├─attributes: AttributeListSyntax
├─modifiers: DeclModifierListSyntax
├─unexpectedBetweenModifiersAndFuncKeyword: UnexpectedNodesSyntax
│ ╰─[0]: binaryOperator("❤️")
├─funcKeyword: keyword(SwiftSyntax.Keyword.func) leadingTrivia=newlines(3) trailingTrivia=spaces(1)
├─name: identifier("foo")
├─signature: FunctionSignatureSyntax
│ ╰─parameterClause: FunctionParameterClauseSyntax
│   ├─leftParen: leftParen
│   ├─parameters: FunctionParameterListSyntax
│   │ ╰─[0]: FunctionParameterSyntax
│   │   ├─attributes: AttributeListSyntax
│   │   ├─modifiers: DeclModifierListSyntax
│   │   ├─firstName: identifier("str")
│   │   ├─colon: colon trailingTrivia=spaces(1)
│   │   ╰─type: IdentifierTypeSyntax
│   │     ╰─name: identifier("String")
│   ╰─rightParen: rightParen trailingTrivia=spaces(1)
╰─body: CodeBlockSyntax
  ╰─leftBrace: leftBrace
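
To check this without reading the whole dump, one can parse the reproduction case and look at the unexpected-nodes child directly. This is a minimal sketch, assuming a swift-syntax version that matches the tree above (SwiftParser's `Parser.parse(source:)` and the `unexpectedBetweenModifiersAndFuncKeyword` and `funcKeyword` properties); the `source` string is only the example input from this issue.

```swift
import SwiftParser
import SwiftSyntax

// Reproduction input: a stray heart, three newlines, then a function.
let source = "❤️\n\n\nfunc foo(str: String) {\n}\n"
let tree = Parser.parse(source: source)

// Take the first top-level item and try to view it as a function declaration.
let function = tree.statements.first?.item.as(FunctionDeclSyntax.self)

if let function {
    // The heart is stored as an unexpected child of the declaration, not as trivia.
    if let unexpected = function.unexpectedBetweenModifiersAndFuncKeyword {
        print("unexpected:", unexpected.trimmedDescription)
    }
    // The newlines are the leading trivia of the `func` keyword.
    print("func leading trivia:", function.funcKeyword.leadingTrivia)
}

// Dump the whole tree in a format like the one shown above.
print(tree.debugDescription)
```

Because the heart is a child node rather than trivia, trimming trivia from the declaration leaves it in place.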

What you might try instead is to get the first token in the fixed-up tree using tree.firstToken(viewMode: .fixedUp), which skips over unexpected tokens. You can then get that token’s position, location, or offset.
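
Applied to the visitor from the report, that suggestion might look like the sketch below. It assumes `firstToken(viewMode:)` is called on the function node rather than on the whole tree, and `FunctionOffsetCollector` is a hypothetical name used only for illustration.

```swift
import SwiftParser
import SwiftSyntax

// Hypothetical visitor that records where each function really starts.
final class FunctionOffsetCollector: SyntaxVisitor {
    private(set) var offsets: [Int] = []

    override func visit(_ node: FunctionDeclSyntax) -> SyntaxVisitorContinueKind {
        // In the .fixedUp view, unexpected nodes are skipped, so this is the
        // first token that actually belongs to the declaration.
        if let firstToken = node.firstToken(viewMode: .fixedUp) {
            // UTF-8 byte offset of the token text, after its leading trivia.
            offsets.append(firstToken.positionAfterSkippingLeadingTrivia.utf8Offset)
        }
        return .visitChildren
    }
}

let source = "❤️\n\n\nfunc foo(str: String) {\n}\n"
let collector = FunctionOffsetCollector(viewMode: .fixedUp)
collector.walk(Parser.parse(source: source))
print(collector.offsets)
```

If the function has attributes or modifiers, the first token in the fixed-up view is the first token of those, so the offset still points at the start of the declaration as written.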

I don't understand why FuncDecl includes unexpected nodes that lie outside the range of the function.


I believe it's because instead of a heart symbol, you could have attributes or modifiers there. If I'm not mistaken, the following Swift code is valid:

@available(*, deprecated, message: "❤️")



func foo() { print("don't use me :'(") }

There is no problem with treating the attributes as part of the FuncDecl. My question is why the FuncDecl must include an unexpected node that lies outside the range of the function.

❤️



@available(*, deprecated, message: "❤️")


func foo() {
	print("don't use me :'(")
}

Another thing I don't understand: why are only the unexpected nodes above a FuncDecl included, while for a VariableDecl the unexpected nodes both above and below are included?


We need to include the unexpected tokens somewhere and including them as part of the FunctionDeclSyntax is as good a place as any other. I.e. would it really make any more sense to include them in the CodeBlockItem or at the start of the SourceFile? I don’t think so.

From an implementation perspective: the parser skips unexpected tokens during parsing and then attaches them in front of the token it was expecting, which in your case is the func token.
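
The attach-in-front behavior can be pictured with a toy model. This is not swift-syntax's implementation, only a sketch of the idea under simplifying assumptions: tokens are plain strings, the "grammar" is a fixed list of expected tokens, and `AttachedToken` and `attach` are hypothetical names.

```swift
// Toy model only: a token together with whatever was skipped before it.
struct AttachedToken: Equatable {
    var unexpectedBefore: [String]
    var text: String
}

/// Walks `tokens` looking for each expected token in order. Anything skipped
/// on the way is recorded in front of the expected token that follows it.
func attach(tokens: [String], expected: [String]) -> [AttachedToken] {
    var result: [AttachedToken] = []
    var index = 0
    for want in expected {
        var skipped: [String] = []
        // Skip tokens that do not match what is expected next.
        while index < tokens.count, tokens[index] != want {
            skipped.append(tokens[index])
            index += 1
        }
        // The expected token never appeared; this toy model simply stops.
        guard index < tokens.count else { break }
        result.append(AttachedToken(unexpectedBefore: skipped, text: want))
        index += 1
    }
    return result
}

let parsed = attach(
    tokens: ["❤️", "func", "foo", "(", ")"],
    expected: ["func", "foo", "(", ")"]
)
print(parsed[0])
```

In this model the heart ends up in `unexpectedBefore` of the `func` entry, which mirrors why it shows up inside the FunctionDeclSyntax rather than in a node of its own.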