Parsing ignores invalid characters

Question

Parsing ignores invalid characters

mattmoyer opened this issue 7 years ago · 0 comments

When I was looking into #9, I noticed that the Parse() and UnmarshalText() methods do not check the input against the Base32 character set. This means that all sorts of invalid text ULIDs will parse into a seemingly valid ULID.

Is this expected? It breaks some assumptions I had made when parsing text-encoded ULIDs from an untrusted source.

The unmarshalling code is borrowed from NUlid, but it has an extra check that is lacking in this library: https://github.com/RobThree/NUlid/blob/89f5a9fc827d191ae5adafe42547575ed3a47723/NUlid/Ulid.cs#L250-L252

Example

package main

import (
	"log"
	"github.com/oklog/ulid"
)

func main() {
	input := " !\"#$%&'()*+_-/[]|~~\\|<>?\x00"
	id, err := ulid.Parse(input)
	if err != nil {
		log.Fatalf("could not parse ULID: %v", err)
	}
	log.Printf("Parse(%q).String() == %q", input, id.String())
}

$ go run parseinvalid.go
2018/01/30 13:39:11 Parse(" !\"#$%&'()*+_-/[]|~~\\|<>?\x00").String() == "7ZZZZZZZZZZZZZZZZZZZZZZZZZ"

Actual behavior

ulid.Parse() returns a nil error for values like " !\"#$%&'()*+_-/[]|~~\\|<>?\x00".

Expected behavior

ulid.Parse() returns a new ulid.ErrInvalidCharacters error for any text input outside of ulid.Encoding (case insensitive).