A Bloom Filter is a space-efficient, probabilistic data structure used to quickly test whether an element is possibly in a set or definitely not.
-
Fast & Memory-Efficient: Uses much less space than a traditional set.
- Bloom filters require significantly less space because they do not store the keys themselves, making them very efficient for membership checks.
-
No False Negatives: If it says "not present," the element is definitely not in the set.
-
Possible False Positives: If it says "present," the element might be in the set (but might not be).
- As the number of inserted keys increases, the false positive rate increases.
- Therefore, as the number of keys grows, the Bloom filter size should also increase to keep false positives low.
Note
- A false positive means the system says “yes, it’s here” when actually it’s not.
- A false negative means the system says “no, it’s not here” when actually it is.
A simple Bloom filter implementation can be found here: https://github.com/nmdra/bloom-filter
Source: https://redis.io/docs/latest/develop/data-types/probabilistic/bloom-filter/
This HTTP signup service in Go uses RedisBloom to prevent duplicate email registrations. This approach identifies duplicate emails without costly database queries.
-
Start Redis:
docker run -p 6379:6379 --name redis-bloom redis:latest
-
Run the app:
go run main.go
-
Test:
curl -X POST http://localhost:8080/signup \ -H "Content-Type: application/json" \ -d '{"email":"test@example.com"}'