Inaccurate information for the "Runes" Concept

Question

Inaccurate information for the "Runes" Concept

jayd-lee opened this issue 9 months ago · 1 comments

Hello, I noticed some potentially misleading information in the "Runes and Strings" section of the documentation:

"
Runes and Strings
Strings in Go are encoded using UTF-8 which means they contain Unicode characters. Since the rune type represents a Unicode character, a string in Go is often referred to as a sequence of runes. However, runes are stored as 1, 2, 3, or 4 bytes depending on the character. Due to this, strings are really just a sequence of bytes. In Go, slices are used to represent sequences and these slices can be iterated over using range.
"

In my understanding, since runes are an alias to int32 (which is an explicitly sized integer type), they always occupy 4 bytes, regardless of the character's UTF-8 encoding size. Therefore, the statement "runes are stored as 1, 2, 3, or 4 bytes" might be misleading.

Suggested Change:
To accurately reflect the behavior of runes in Go, it might be necessary to clarify that while characters within a string may require different numbers of bytes when encoded in UTF-8, runes themselves are consistently represented as 4 bytes.

Thanks!

Answer 1 · 2024-03-02T02:34:43.000Z

Hello. Thanks for opening an issue on Exercism 🙂

At Exercism we use our Community Forum, not GitHub issues, as the primary place for discussion. That allows maintainers and contributors from across Exercism's ecosystem to discuss your problems/ideas/suggestions without them having to subscribe to hundreds of repositories.

This issue will be automatically closed. Please use this link to copy your GitHub Issue into a new topic on the forum, where we look forward to chatting with you!

If you're interested in learning more about this auto-responder, please read this blog post.