Getting duplicate IDs
kellabyte opened this issue · 11 comments
I'm using this library to generate ULIDs for a primary key in a database, and when I run my benchmarks I'm hitting a lot of duplicate primary key errors from my database. I assume I'm using the library wrong. What is the guidance on how to use it to avoid generating duplicate keys?
I'm doing the following.
now := time.Now()
entropy := ulid.Monotonic(rand.New(rand.NewSource(now.UnixNano())), 0)
tx, err := pool.Begin()
for i := 0; i < b.N; i++ {
id := ulid.MustNew(ulid.Timestamp(now), entropy)
// INSERT SQL
tx.Exec(INSERT_SQL_USING_id)
}
tx.Commit()
What have I done wrong that generates duplicate ULIDs?
@kellabyte I am also using the same code. It is giving me duplicates too.
This is quite strange. Are you possibly instantiating a new entropy source per request? That could explain what you're seeing.
I just wrote this additional test case which is passing. Can you test it on your machine?
func TestMonotonicUniqueness(t *testing.T) {
t.Parallel()
now := time.Now()
entropy := ulid.Monotonic(rand.New(rand.NewSource(now.UnixNano())), 0)
set := make(map[string]int, 1e7)
for i := 1; i <= 1e7; i++ {
id := ulid.MustNew(ulid.Timestamp(now), entropy).String()
if j := set[id]; j != 0 {
t.Fatalf("duplicate of i=%d %s generated at i=%d", j, id, i)
} else {
set[id] = i
}
}
}
This looks like almost identical code to what @kellabyte posted, which shows the entropy source being created outside the loop.
Does it fail on your machine?
I modified the test case into this program and also ran it without issue.
$ for n in 1000 10000 100000 1000000 10000000 100000000 ; echo n=$n ; time ./uliduniq -n $n ; echo ; end
n=1000
created 1000 without incident
0.00 real 0.00 user 0.00 sys
n=10000
created 10000 without incident
0.01 real 0.00 user 0.00 sys
n=100000
created 100000 without incident
0.06 real 0.05 user 0.00 sys
n=1000000
created 1000000 without incident
0.89 real 0.90 user 0.03 sys
n=10000000
created 10000000 without incident
9.41 real 11.31 user 0.58 sys
n=100000000
created 100000000 without incident
122.62 real 176.96 user 8.08 sys
We speculated on Slack that it might be the database driver or database truncating the field, a few other people chimed in to say it happened to them in the past. I suggest modifying either the test case or my program to insert the ULID both into the database and the in-memory map, and check for duplicates after every insert. If you find a discrepancy, i.e. if the database complains about duplicates but the map doesn't, then you'll find your culprit.
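As a sketch of that cross-check: the `store` type below and its 16-character truncation are entirely hypothetical, standing in for a database column that's too narrow. The point is the shape of the check — the in-memory map stays clean while the truncating store reports a duplicate, which would tell you the store, not the generator, is at fault.

```go
package main

import "fmt"

// store is a stand-in for the database table. This fake deliberately
// truncates keys to 16 characters to simulate a too-narrow column
// (a real ULID string is 26 characters).
type store map[string]bool

func (s store) insert(id string) error {
	key := id
	if len(key) > 16 {
		key = key[:16] // hypothetical truncation bug
	}
	if s[key] {
		return fmt.Errorf("duplicate key %q", key)
	}
	s[key] = true
	return nil
}

func main() {
	seen := map[string]bool{}
	db := store{}
	ids := []string{
		// Illustrative ULID-shaped strings differing only at the end.
		"01ARZ3NDEKTSV4RRFFQ69G5FAA",
		"01ARZ3NDEKTSV4RRFFQ69G5FAB",
	}
	for _, id := range ids {
		if seen[id] {
			fmt.Println("map saw a duplicate:", id)
		}
		seen[id] = true
		if err := db.insert(id); err != nil {
			// Map is clean but the store complains: the store is
			// collapsing distinct ULIDs, not the generator.
			fmt.Println("store error:", err)
		}
	}
}
```

In the real benchmark you'd replace `store` with the actual `tx.Exec` insert and keep the map check alongside it.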
@kellabyte: Any new insights?
Absent an update in the next few days, I'll assume this issue is no longer relevant and will go ahead and close it. If that's not the case, please feel free to re-open.
I have duplicate IDs. Shall I open a new issue or comment here? Thanks
Open a new issue, I guess.
For future readers...
now := time.Now()
entropy := ulid.Monotonic(rand.New(rand.NewSource(now.UnixNano())), 0)
tx, err := pool.Begin()
for i := 0; i < b.N; i++ {
id := ulid.MustNew(ulid.Timestamp(now), entropy)
Unless you explicitly want to generate a set of ULIDs fixed to the same point (millisecond) in time, this is probably a bug.
You probably want
id := ulid.MustNew(ulid.Timestamp(time.Now()), entropy)