Callidon/bloom-filters

import / export BloomFilter not working as expected

imcotton opened this issue · 6 comments

Hi, these features seem to have an issue:

static create (items = 1000, errorRate = 0.001, seed = utils.getDefaultSeed()) {
  const size = fm.optimalFilterSize(items, errorRate)
  const hashes = fm.optimalHashes(size, items)
  const filter = new BloomFilter(size, hashes)
  filter.seed = seed
  return filter
}

  • BloomFilter.create should be the correct way to construct a filter, since it calculates the sizing internally, but it does not save _errorRate on the instance for later serialization

const BloomFilterSpecs = {
  export: cloneObject('BloomFilter', '_errorRate', '_size', '_length', '_nbHashes', '_filter', '_seed'),
  import: (FilterConstructor, json) => {
    if ((json.type !== 'BloomFilter') || !assertFields(json, '_errorRate', '_size', '_length', '_nbHashes', '_filter', '_seed')) {
      throw new Error('Cannot create a BloomFilter from a JSON export which does not represent a bloom filter')
    }
    const filter = new FilterConstructor(json._capacity, json._errorRate)
    filter.seed = json._seed
    filter._size = json._size
    filter._nbHashes = json._nbHashes
    filter._filter = json._filter.slice(0)
    filter._length = json._length
    return filter
  }
}

  • Because _errorRate is never saved, it is undefined in the export and is then omitted by JSON.stringify, which later causes the import to throw when assertFields fails
  • _capacity is neither set on the instance nor serialized, yet the import passes it to FilterConstructor as the size

class BloomFilter extends Exportable {
  /**
   * Constructor
   * @param {int} size - the number of cells
   * @param {number} nbHashes - the number of hash functions used
   */
  constructor (size = 1000, nbHashes = 4) {
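
The import failure described above can be reproduced without the library. The following is a minimal sketch: `hasFields` is a hypothetical stand-in for the library's `assertFields` and assumes it checks key presence, which may not match the real implementation. The field values are made up for illustration.

```javascript
// Hypothetical stand-in for assertFields (an assumption, not the library's code):
// returns true only if every listed key is present on the object.
function hasFields (obj, ...fields) {
  return fields.every(field => field in obj)
}

// An export in which _errorRate was never set on the instance.
const exportedFilter = { type: 'BloomFilter', _errorRate: undefined, _size: 100, _seed: 1 }

// Before serialization the key exists, even though its value is undefined.
const before = hasFields(exportedFilter, '_errorRate', '_size', '_seed')

// JSON.stringify omits properties whose value is undefined,
// so the key is gone after a round trip.
const roundTripped = JSON.parse(JSON.stringify(exportedFilter))
const after = hasFields(roundTripped, '_errorRate', '_size', '_seed')

console.log(before, after) // true false
```

Under that assumption, the check passes on the in-memory export but fails once the export has gone through JSON.stringify and JSON.parse.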

Thank you for sharing this. I will fix this tomorrow.

@imcotton Is this the latest available version, btw, or an older one?

The latest version from npm (0.8.0).

The proper way to create a Bloom filter is to use the constructor if you want to customize the number of hash functions or the size of the filter. If you don't, use the static .create() function. So I deleted the _errorRate property. The rate is computed when you construct the filter, so even after an export and import the two instances should have the same error rate. If you want this property in your exported JSON value, call the .rate() method and add the result to your object.
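
To illustrate why a stored _errorRate is redundant, here is a small sketch that estimates the false positive rate from the fields that are already exported. It uses the textbook approximation (1 - e^(-kn/m))^k and is not the library's code: `estimateFalsePositiveRate` is a hypothetical helper, the library's .rate() method may compute the value differently, and the sample numbers are made up.

```javascript
// Textbook estimate of a Bloom filter's false positive rate:
// (1 - e^(-k * n / m))^k, with m cells, k hash functions and n inserted items.
// Hypothetical helper, not part of the bloom-filters API.
function estimateFalsePositiveRate (size, nbHashes, length) {
  return Math.pow(1 - Math.exp(-nbHashes * length / size), nbHashes)
}

// Illustrative values shaped like the exported fields _size, _nbHashes and _length.
const sample = { _size: 14378, _nbHashes: 10, _length: 1000 }

const estimatedRate = estimateFalsePositiveRate(sample._size, sample._nbHashes, sample._length)
console.log(estimatedRate) // close to 0.001 for these values
```

Because the estimate depends only on the size, the number of hash functions and the number of inserted items, two filters with identical exported fields yield the same value.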

I am able to reproduce the issue only by using a serialization step before importing the structure. Otherwise it works correctly.

let exported = filter.saveAsJSON()
// simulate serialization
exported = JSON.stringify(exported)
// simulate deserialization
exported = JSON.parse(exported)
const newFilter = BloomFilter.fromJSON(exported)
  • If you use JSON.stringify for serialization, don't forget that it removes every property whose value is undefined.

Temporary workaround: just add _errorRate to the exported object before importing the filter if you don't want to bump to the new version.
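
A possible shape for that workaround is sketched below. `withErrorRate` is a hypothetical helper, the default of 0.001 is an assumption borrowed from the default argument of create(), and the JSON string is an invented example of a deserialized export.

```javascript
// Hypothetical helper: returns the parsed export unchanged if it already has
// _errorRate, otherwise a copy with the field added so the 0.8.0 import
// check can find it.
function withErrorRate (json, errorRate = 0.001) {
  return '_errorRate' in json ? json : { ...json, _errorRate: errorRate }
}

// Invented example of an export that lost _errorRate during serialization.
const parsed = JSON.parse('{"type":"BloomFilter","_size":100,"_length":0,"_nbHashes":4,"_filter":[],"_seed":1}')
const patched = withErrorRate(parsed)

// patched would then be handed to the import, e.g. BloomFilter.fromJSON(patched)
console.log('_errorRate' in parsed, '_errorRate' in patched) // false true
```

The helper copies the object instead of mutating it, so the original parsed value stays available for inspection.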

Fix: I removed the _errorRate from the export/import spec. It should work correctly now.

@Callidon Can you bump the version? The fix is pushed in the last commit. I bumped the version to 0.8.1.

The new version 0.8.1 is tagged and available on npm. Thanks @folkvir for the fix!