Anagram: add test case with UTF8 characters
Closed this issue · 5 comments
In Go, strings are byte sequences that conventionally hold UTF-8 encoded text, and ranging over a string yields runes. Therefore, it is not appropriate to assume that strings contain only English letters unless the description explicitly says so.
However, in the Anagram exercise, the test cases only cover strings of English letters, although "你好" and "好你" should also be treated as anagrams. This encourages some "clever" solutions that use [26]int to represent a word, which I think is wrong in principle, as strings are not just collections of English letters.
Perhaps we should add some test cases to capture these scenarios. The description could also cover corner cases (e.g. how should punctuation be treated?).
Alternatively, we can specify in the description that the input strings contain only English letters. However, in that case, the better type to use would be []byte instead of string.
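To make the concern concrete, here is a minimal sketch, not taken from the exercise or its example solution. The names letterCounts26, runeCounts and isAnagram are hypothetical, and the [26]int variant is assumed to skip anything outside a-z (some real solutions would instead index out of range and panic). It only compares rune multisets and ignores the rest of the exercise's rules.

```go
package main

import (
	"fmt"
	"strings"
)

// letterCounts26 is the "clever" approach (hypothetical name): it counts only
// the letters a-z and, by assumption, silently skips every other rune.
func letterCounts26(s string) [26]int {
	var counts [26]int
	for _, r := range strings.ToLower(s) {
		if r >= 'a' && r <= 'z' {
			counts[r-'a']++
		}
	}
	return counts
}

// runeCounts counts every rune in s, whatever script it belongs to.
func runeCounts(s string) map[rune]int {
	counts := make(map[rune]int)
	for _, r := range strings.ToLower(s) {
		counts[r]++
	}
	return counts
}

// isAnagram reports whether a and b contain the same runes with the same
// multiplicities. It deliberately ignores other rules of the exercise.
func isAnagram(a, b string) bool {
	ca, cb := runeCounts(a), runeCounts(b)
	if len(ca) != len(cb) {
		return false
	}
	for r, n := range ca {
		if cb[r] != n {
			return false
		}
	}
	return true
}

func main() {
	// Both arrays are all zeros, so the [26]int approach cannot tell
	// these two unrelated words apart.
	fmt.Println(letterCounts26("你好") == letterCounts26("世界")) // true

	// Counting runes gives the expected answers.
	fmt.Println(isAnagram("你好", "好你")) // true
	fmt.Println(isAnagram("你好", "世界")) // false
}
```

With only English-letter test cases, both approaches pass, which is why the current tests cannot distinguish them.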
I agree that for Go the test cases should be extended to include an example with runes that use more than one byte (non-English letters). The current example solution in the repo already handles this correctly, so it does not need to be changed.
I don't think it is necessary to change the description.
- The test case will be enough to guide people who wrote a solution that does not handle non-English letters correctly.
- In most practice exercises, the instructions don't cover all the edge cases, so I don't think we have to explain what happens with punctuation etc. here. The guideline is usually that you only write as much code as needed to fulfill the mentioned requirements and pass all tests (TDD style).
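As a rough illustration of what such an additional test case could look like, here is a self-contained sketch. It is an assumption throughout: the anagramCase struct, the detect function and the case descriptions are hypothetical and do not reflect the repository's actual test layout or generator output. detect is a simplified stand-in that only compares sorted runes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortedRunes returns the lowercased runes of s in sorted order.
func sortedRunes(s string) string {
	rs := []rune(strings.ToLower(s))
	sort.Slice(rs, func(i, j int) bool { return rs[i] < rs[j] })
	return string(rs)
}

// detect is a hypothetical, simplified stand-in for a solution: it returns
// the candidates whose sorted runes match those of subject.
func detect(subject string, candidates []string) []string {
	result := []string{}
	key := sortedRunes(subject)
	for _, c := range candidates {
		if sortedRunes(c) == key {
			result = append(result, c)
		}
	}
	return result
}

// anagramCase is a hypothetical test-case shape, not the repository's own.
type anagramCase struct {
	description string
	subject     string
	candidates  []string
	expected    []string
}

var multiByteCases = []anagramCase{
	{
		description: "detects anagrams made of multi-byte runes",
		subject:     "你好",
		candidates:  []string{"好你", "你们好", "世界"},
		expected:    []string{"好你"},
	},
	{
		description: "rejects different multi-byte words",
		subject:     "你好",
		candidates:  []string{"世界"},
		expected:    []string{},
	},
}

// equal reports whether two string slices have the same elements in order.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	for _, tc := range multiByteCases {
		got := detect(tc.subject, tc.candidates)
		status := "PASS"
		if !equal(got, tc.expected) {
			status = "FAIL"
		}
		fmt.Printf("%s: %s\n", status, tc.description)
	}
}
```

A solution that reduces each word to a [26]int and skips other runes would fail the second case, since every word without English letters looks identical to it.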
The difficulty here is that the test cases are produced by the generator, and I don't think the generator can currently handle adding additional test cases. More investigation is needed into how we can support this. (I created a separate issue for this.)
Yes! The documentation on how to add custom test cases can be found here: https://github.com/exercism/go#tests
@CoderYihaoWang Do you want to add test cases?