golang/go

go/parser: eats \r in comments

dvyukov opened this issue · 4 comments

In the following program, go/parser and go/printer turn a correct program into an incorrect one, so the second parse fails:

package main

import (
    "bytes"
    "fmt"
    "go/parser"
    "go/printer"
    "go/token"
)

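func main() {
    // "/**\r/*/" is a single valid /*-style comment: the \r keeps the
    // second '*' and the following '/' from forming an early "*/".
    data := []byte("package A\t/**\r/*/\n")
    fset := token.NewFileSet()
    f, err := parser.ParseFile(fset, "src.go", data, parser.ParseComments|parser.DeclarationErrors|parser.AllErrors)
    if err != nil {
        // The input itself does not parse; nothing to check.
        return
    }
    // Print the AST back to source, then parse the printed source again.
    buf := new(bytes.Buffer)
    if err := printer.Fprint(buf, fset, f); err != nil {
        panic(err)
    }
    fset1 := token.NewFileSet()
    _, err = parser.ParseFile(fset1, "src.go", buf.Bytes(), parser.ParseComments|parser.DeclarationErrors|parser.AllErrors)
    if err != nil {
        // The round trip turned valid source into invalid source.
        fmt.Printf("source0: %q\n", data)
        fmt.Printf("source1: %q\n", buf.Bytes())
        panic(err)
    }
}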
source0: "package A\t/**\r/*/\n"
source1: "package A\t/**/*/\n"
panic: src.go:1:15: expected ';', found '*'

go version devel +b0532a9 Mon Jun 8 05:13:15 2015 +0000 linux/amd64

This is non-urgent and extremely unlikely to be a problem in real-world programs.

Background: This is deliberate. go/scanner explicitly strips \r from comments ( http://golang.org/cl/6225047 ); this was done in response to issue #3647 .

That said, perhaps the original change was not ideal or was too aggressive, and/or the spec should be clarified (i.e., to state that \r is stripped from comments).
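
To make the "too aggressive" concern concrete, here is a minimal sketch of how removing every \r can move the end of a /*-style comment. It does not use go/scanner; `commentEnd` and `stripAllCR` are hypothetical helpers written only for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// commentEnd returns the offset just past the first "*/" that closes a
// /*-style comment starting at src[0], or -1 if there is no such comment.
// Hypothetical helper for illustration; not part of go/scanner.
func commentEnd(src []byte) int {
	if !bytes.HasPrefix(src, []byte("/*")) {
		return -1
	}
	// The search starts after the opening "/*", so "/*/" does not count
	// as a complete comment.
	i := bytes.Index(src[2:], []byte("*/"))
	if i < 0 {
		return -1
	}
	return 2 + i + 2
}

// stripAllCR removes every '\r' with no regard for context
// (an assumption standing in for unconditional stripping).
func stripAllCR(src []byte) []byte {
	return bytes.ReplaceAll(src, []byte("\r"), nil)
}

func main() {
	orig := []byte("/**\r/*/")
	stripped := stripAllCR(orig)
	fmt.Printf("%q ends at %d of %d\n", orig, commentEnd(orig), len(orig))
	fmt.Printf("%q ends at %d of %d\n", stripped, commentEnd(stripped), len(stripped))
}
```

In the original text the comment spans all 7 bytes. After stripping, the comment ends after 4 bytes ("/**/"), leaving a stray "*/" behind, which is consistent with the "expected ';', found '*'" error reported above.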

  1. The spec should not be changed, since comments (and thus their contents) are treated like white space regardless of their content.
  2. The scanner should probably leave \r in //-style comments, except for one followed immediately by \n (the spec considers this "the end of the line"; see http://golang.org/ref/spec#Comments ).

Alternatively, the scanner could leave comment contents unchanged, and the printer could apply the fix where needed.

Change https://golang.org/cl/87498 mentions this issue: go/scanner: don't eat \r in comments if that shortens the comment
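
The change title suggests stripping \r only where that cannot shorten the comment. One way this might look is sketched below. This is not code taken from the change itself: the function name `stripCR` and its `general` parameter are assumptions made for illustration.

```go
package main

import "fmt"

// stripCR removes '\r' bytes from comment text. For a /*-style comment
// (general == true) it keeps a '\r' that sits between a '*' and a '/',
// because removing it would create a "*/" terminator that the source
// did not contain. Illustrative sketch only.
func stripCR(text []byte, general bool) []byte {
	out := make([]byte, 0, len(text))
	for i, ch := range text {
		if ch == '\r' {
			// The '*' must come after the opening "/*"; the '*' of the
			// opener itself cannot terminate the comment.
			keep := general &&
				len(out) > len("/*") && out[len(out)-1] == '*' &&
				i+1 < len(text) && text[i+1] == '/'
			if !keep {
				continue
			}
		}
		out = append(out, ch)
	}
	return out
}

func main() {
	fmt.Printf("%q\n", stripCR([]byte("/**\r/*/"), true))      // \r kept
	fmt.Printf("%q\n", stripCR([]byte("/* a\r\n b */"), true)) // \r dropped
	fmt.Printf("%q\n", stripCR([]byte("// a\rb"), false))      // \r dropped
}
```

With this rule the comment from the report keeps its \r, so printing and reparsing it would not expose a premature "*/", while ordinary \r\n line endings inside comments are still normalized.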