- clone the project
- There are two application one is for password and another one for intersection
- for password app
- Go to src/app/bloom_filter/password
- To run: python newpass.py
- please enter values with double quote
- for Intersection app
- Go to src/app/bloom_filter/intersection
- To run: python intersection.py
- please enter values with double quote
What is the fastest and space efficient way to search for a word that is present or not?
- HashTable?
- Tries? Above two data structure optimised in time complexity but comprimise in space. To overcome all these problems and to do it effective time and space we use Bloom filters
A Bloom filter is a space-efficient probabilistic data structure, that is used to test whether an element is a member of a set. Bloom filter options ➢ Insert ➢ Search
- An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash functions defined, each of which maps or hashes some set element to one of the m array positions, generating a uniform random distribution.
- To add an element, feed it to each of the k hash functions to get k array positions. Set the bits at all these positions to 1.
- To search an element, feed it to each of the k hash functions to get k array positions
- If any of the bits at these positions is 0, the element is definitely not in the set – if it were, then all the bits would have been set to 1 when it was inserted
- If all are 1, then either the element is in the set, or the bits have by chance been set to 1 during the insertion of other elements, resulting in a false positive.
Deleting elements from filter is not possible because, if we delete a single element by clearing bits at indices generated by k hash functions, it might cause deletion of few other elements. Bloom filters do not permit deletion of items
Whenever you are trying to set password for something , it should be strong enough and be harder to guess.Here we are doing a small approach using bloom filters which avoid users keeping easy passwords.
When you are given two huge sets of data and is asked to find the common data between them. We can easily find them using Bloom filters.