IDL Model serializer ignores trailing whitespace in documentation
kubukoz opened this issue · 2 comments
If you parse this model:
namespace test
@documentation("foo
bar")
string Demo
very important: there's a space after foo, before the newline.
then the value of the documentation
trait is: "foo \nbar"
, which I think is correct.
If you serialize that with the SmithyIdlModelSerializer
, it gets turned into:
$version: "2.0"
namespace test
/// foo
/// bar
string Demo
and the trailing space is gone. As a result, if you parse that again, you no longer have the same documentation trait value (it's foo\nbar
instead), which causes the following problems for my team:
-
model merging considers these shapes different (because of the trait change), which fails to assemble.
-
Name uniqueness checks start to consider the shape different from its original counterpart (for a specific reason, we have an identical shape in another namespace, but it's used verbatim instead of being parsed + serialized, so it still has the trailing space).
I think this is an inconsistency in the serializer - as much as I love to remove trailing whitespace, it clearly can cause subtle bugs.
Full example, a scala-cli script:
//> using dep "software.amazon.smithy:smithy-model:1.43.0"
//> using option "-Wunused:all"
import software.amazon.smithy.model.shapes.SmithyIdlModelSerializer
import software.amazon.smithy.model.traits.DocumentationTrait
import software.amazon.smithy.model.shapes.ShapeId
import software.amazon.smithy.model.Model
import scala.jdk.CollectionConverters._
val input =
s"""
|namespace test
|
|@documentation("foo${" " /* this ensures we have trailing whitespace no matter what editor config you use */}
|bar")
|string Demo
|""".stripMargin
val m1 = Model
.assembler()
.addUnparsedModel(
"test.smithy",
input
)
.assemble()
.unwrap()
val docValue = m1
.expectShape(ShapeId.from("test#Demo"))
.getTrait(classOf[DocumentationTrait])
.get
.getValue
println(docValue == "foo \nbar") //true
val serialized = SmithyIdlModelSerializer
.builder()
.build()
.serialize(m1)
.asScala
.head
._2
val m2 = Model
.assembler()
.addUnparsedModel(
"test.smithy",
serialized
)
.assemble()
.unwrap()
val docValue2 = m2
.expectShape(ShapeId.from("test#Demo"))
.getTrait(classOf[DocumentationTrait])
.get
.getValue
println(docValue2 == "foo\nbar") //true
Edit: this is also inconsistent with the ModelSerializer
which encodes as a Node
- that one keeps the whitespace.
I agree that the smithy-model serializers should keep the trailing whitespace like you said. However, I would expect the formatter in smithy-syntax to remove it by default though since I think that's what the vast majority of people would expect.
I agree that the smithy-model serializers should keep the trailing whitespace like you said. However, I would expect the formatter in smithy-syntax to remove it by default though since I think that's what the vast majority of people would expect.
we're planning to switch to that formatter when it's in the LSP 😅 which is happening in smithy-lang/smithy-language-server#139