Bloom Filters — Implementations

The Code, Distilled

Six dependency-free JavaScript implementations to drop into any project. Standard, counting, scalable, partitioned, cuckoo, rotating — most under 80 lines, each runnable in your browser. Copy, paste, go.

1 — MurmurHash3 (32-bit, seeded)

The article uses MurmurHash3 throughout. It's the most common bloom-filter hash: very fast (~0.3 ns/byte on modern x86), excellent uniform distribution, simple enough to fit in a screen. This is the canonical 32-bit variant — port it once and reuse for every filter type below.

// hash · ~30 lines · zero deps
// MurmurHash3 — 32-bit, seeded variant by Austin Appleby.
//
// Why this hash for bloom filters?
//   1. Speed: ~0.3 ns/byte — orders of magnitude faster than SHA-256
//      and we don't need crypto guarantees, just uniform distribution.
//   2. Avalanche: a one-bit input change flips ~50% of output bits,
//      which is exactly what bloom filters need (see avalanche demo).
//   3. Tiny: fits in 20 lines of JS, easy to audit and port.
//
// The magic constants (0xcc9e2d51, 0x1b873593, 0xe6546b64) were
// chosen empirically by Appleby to maximize avalanche over millions
// of test inputs. They are not arbitrary — changing any of them
// degrades distribution measurably.
function murmur3(str, seed = 0) {
  let h = seed >>> 0;             // >>> 0 forces unsigned 32-bit
  const c1 = 0xcc9e2d51, c2 = 0x1b873593;

  // Body: process one input char at a time. Real Murmur3 processes
  // 4 bytes at a time, but for JS strings (UTF-16 code units) the
  // per-char approach is simpler and still uniformly distributed.
  for (let i = 0; i < str.length; i++) {
    let k = str.charCodeAt(i);
    k = Math.imul(k, c1);                // Math.imul: 32-bit safe multiply
    k = (k << 15) | (k >>> 17);          // rotate left 15 → spread high bits
    k = Math.imul(k, c2);
    h ^= k;                              // XOR-mix into the running state
    h = (h << 13) | (h >>> 19);          // rotate left 13 → diffuse
    h = (Math.imul(h, 5) + 0xe6546b64) >>> 0;
  }

  h ^= str.length;                       // Length-mix: prevents trivial prefix collisions

  // Finalization mix ("fmix32"): three xorshifts interleaved with
  // two multiplies that turn any remaining structure in h into
  // white noise. This is the step that makes flipping one input
  // bit flip ~half the output bits — the avalanche property.
  h ^= h >>> 16; h = Math.imul(h, 0x85ebca6b) >>> 0;
  h ^= h >>> 13; h = Math.imul(h, 0xc2b2ae35) >>> 0;
  h ^= h >>> 16;
  return h >>> 0;                      // Final unsigned coercion
}


// Kirsch-Mitzenmacher (2006) — derive k bloom-filter positions
// from just 2 base hashes:
//
//     gᵢ(x) = h₁(x) + i · h₂(x)   mod m       for i = 0 … k-1
//
// Kirsch and Mitzenmacher showed that as long as h₁ and h₂ behave
// as independent uniform hashes, the gᵢ preserve the asymptotic
// FPR — adding negligible error vs. computing k separate hashes.
// Hashing cost drops from k hash computations to 2 per
// insert/lookup. This is THE optimization every production
// bloom filter uses today.
function getPositions(item, k, m) {
  const s = String(item);
  const h1 = murmur3(s, 0x9747b28c);
  // `| 1` forces h₂ to be odd. m is usually even, so an odd
  // step ensures gcd(h₂, m) = 1 → the sequence h₁, h₁+h₂,
  // h₁+2h₂, … visits every position before repeating, avoiding
  // pathological clustering when m and h₂ share a factor.
  const h2 = murmur3(s, 0xe17a1465) | 1;
  return Array.from({length: k},
    (_, i) => ((h1 + i * h2) % m + m) % m); // double mod: % can return negative for negative operands
}
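Before wiring it into a filter, the double-hashing trick is easy to sanity-check in isolation. A minimal sketch — note it uses a stand-in FNV-style mixer (hypothetical, not the murmur3 above) purely so the snippet is self-contained:

```javascript
// Stand-in 32-bit hash so this snippet runs on its own.
// NOT murmur3 — just an FNV-1a-style mix for illustration.
function hash32(s, seed) {
  let h = seed >>> 0;
  for (let i = 0; i < s.length; i++) {
    h = Math.imul(h ^ s.charCodeAt(i), 0x01000193) >>> 0;
  }
  return h;
}

// Kirsch-Mitzenmacher: k positions from only 2 hash computations.
function positions(item, k, m) {
  const h1 = hash32(item, 0x9747b28c);
  const h2 = hash32(item, 0xe17a1465) | 1;  // odd step; may go negative after |1
  return Array.from({length: k}, (_, i) => ((h1 + i * h2) % m + m) % m);
}

const ps = positions('hello', 7, 9586);
```

Whatever the input, the result is always k deterministic positions, each in [0, m) — the double mod absorbs the negative values that `| 1` can produce.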

2 — Standard BloomFilter

The reference implementation. Constructor takes expected item count n and target false positive rate p, computes the optimal m and k, and allocates a Uint8Array. Mergeable, serializable, ~70 lines.

// standard · 70 lines · production-ready
class BloomFilter {
  // Construct a filter sized for n expected items at FPR p.
  // We pick m and k from the closed-form optima derived in §6:
  //
  //     m* = -n · ln(p) / (ln 2)²       ← optimal bit count
  //     k* = (m / n) · ln 2              ← optimal hash count
  //
  // m* is the *minimum* number of bits that achieves p when k is
  // also chosen optimally. Round up — under-sizing gives worse FPR.
  constructor(n, p = 0.01) {
    if (p <= 0 || p >= 1) throw new Error('p must be in (0, 1)');
    if (!Number.isInteger(n) || n < 1) throw new Error('n must be a positive integer');
    const ln2 = Math.LN2;                                    // ≈ 0.6931
    this.m = Math.ceil(-n * Math.log(p) / (ln2 * ln2));        // ≈ 9.585·n at p=1%
    this.k = Math.max(1, Math.round((this.m / n) * ln2));      // ≈ 7 at p=1%
    this.n = n; this.p = p; this.count = 0;
    // Uint8Array packs 8 bits per byte. ceil(m/8) bytes covers m bits.
    this.data = new Uint8Array(Math.ceil(this.m / 8));
  }

  // Compute the k bit positions for an item using Kirsch-Mitzenmacher.
  // Identical to the standalone getPositions() above, inlined here so
  // the class is self-contained and one copy of murmur3 is reused.
  _positions(item) {
    const s = String(item);
    const h1 = murmur3(s, 0x9747b28c);
    const h2 = murmur3(s, 0xe17a1465) | 1;            // odd → coprime to m
    const r = new Array(this.k);
    for (let i = 0; i < this.k; i++) {
      r[i] = ((h1 + i * h2) % this.m + this.m) % this.m;
    }
    return r;
  }

  // Bit-level set/test on a packed byte array. The math:
  //   byte index  = i ÷ 8  (i >> 3 — fast integer division)
  //   bit offset  = i mod 8 (i & 7 — fast mod for power-of-2)
  //   bit mask    = 1 << offset
  // Saves 8× memory vs. a boolean array; ~3× faster than indexing
  // into Array<bool> thanks to typed-array memory layout.
  _set(i)  { this.data[i >>> 3] |=  (1 << (i & 7)); }
  _test(i) { return (this.data[i >>> 3] &  (1 << (i & 7))) !== 0; }

  // Insert: set all k bits. Idempotent — adding the same item twice
  // is a no-op for membership, but `count` still increments, so it
  // over-counts duplicates. For an exact size, track a Set alongside.
  add(item) {
    const ps = this._positions(item);
    for (const p of ps) this._set(p);
    this.count++;
    return this;
  }

  // Membership test — short-circuits on the first 0 bit.
  // Returns:  false → definitely NOT in set (zero false negatives)
  //           true  → probably in set, FPR = (1 - e^(-kn/m))^k
  has(item) {
    const ps = this._positions(item);
    for (const p of ps) if (!this._test(p)) return false;
    return true;
  }

  // Empirical fill rate — used to detect when a filter is over-saturated.
  // Theory: at the optimal k, fill rate reaches ~50% once n items are in.
  // If you see > 70%, your n was too small or your p too aggressive.
  fillRate() {
    let n = 0;
    for (const b of this.data) {
      // Brian Kernighan's popcount: x &= x-1 clears the lowest set
      // bit; loop runs once per set bit, not 8 times. Way faster
      // than checking each bit individually for sparse bytes.
      let x = b;
      while (x) { n++; x &= x - 1; }
    }
    return n / this.m;
  }

  // Set union via byte-wise OR. Works because:
  //   bit_i(A ∪ B) = bit_i(A) OR bit_i(B)
  // for every position. Both filters MUST have the same (m, k) and
  // use identical hashes — otherwise the bit positions don't align
  // across filters and the result is meaningless.
  //
  // Common use: shard inserts across N machines, OR-merge their
  // filters into one summary at query time. Bandwidth is O(m/8) bytes
  // regardless of how many items each shard saw.
  merge(other) {
    if (this.m !== other.m || this.k !== other.k) {
      throw new Error('Incompatible filters: same m and k required');
    }
    for (let i = 0; i < this.data.length; i++) {
      this.data[i] |= other.data[i];
    }
    return this;
  }

  // Wire format: just the raw bits, base64'd. The receiver needs to
  // know n and p separately (transmit alongside, or pre-agree on
  // shape). For richer formats, encode m+k+data into a binary header.
  toBase64() {
    return btoa(String.fromCharCode(...this.data));
  }

  static fromBase64(b64, n, p) {
    const bf = new BloomFilter(n, p);
    const raw = atob(b64);
    if (raw.length !== bf.data.length) throw new Error('Size mismatch');
    for (let i = 0; i < raw.length; i++) bf.data[i] = raw.charCodeAt(i);
    return bf;
  }
}
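The `_set`/`_test` bit math is worth verifying in isolation. A self-contained sketch of the packed-bit layout, independent of the class above:

```javascript
// 64 bits packed into 8 bytes — same layout BloomFilter uses.
const bits = new Uint8Array(8);
const setBit  = i => { bits[i >>> 3] |= 1 << (i & 7); };
const testBit = i => (bits[i >>> 3] & (1 << (i & 7))) !== 0;

setBit(9);    // byte 9 >>> 3 = 1, offset 9 & 7 = 1 → bits[1] becomes 2
setBit(63);   // byte 7, offset 7 → bits[7] becomes 128
```

Bit 9 lands in byte 1 at offset 1, so `bits[1]` reads back as `2` — the shift/mask arithmetic and the "8 bits per byte" packing line up exactly.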

3 — CountingBloomFilter (deletion-capable)

Replaces each bit with a 4-bit counter. add increments k counters, remove decrements them — but only after verifying the item was actually added. Decrementing on a false-positive query would corrupt the filter.

Never use has() to gate a deletion. False positives mean has() returning true is no proof of membership. Always maintain a separate ground-truth record (a Set, a database column) for confirmed members and only delete from that.
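One way to structure that pairing — a minimal sketch where a plain Set is the authoritative record and a stub (hypothetical, so the snippet runs on its own) stands in for the counting filter:

```javascript
// Ground truth gates every delete. `cbf` is any object with
// add/remove — a stand-in here, not a real CountingBloomFilter.
const truth = new Set();
const cbf = { add(x) {}, remove(x) {} };

function insert(x) { truth.add(x); cbf.add(x); }
function safeDelete(x) {
  if (!truth.has(x)) return false;  // unverified — never touch the filter
  truth.delete(x);
  cbf.remove(x);
  return true;
}

insert('user:42');
```

`safeDelete` refuses anything the ground truth hasn't confirmed, so the filter's counters are only ever decremented for genuine members.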
// counting · ~75 lines · 4× memory cost
class CountingBloomFilter {
  // Same m/k formulas as the standard filter — but each "bit" is now
  // a 4-bit counter, so the storage cost is 4× larger. Trade memory
  // for the ability to delete (the standard filter cannot delete:
  // clearing a bit might also clear it for some other item).
  constructor(n, p = 0.01) {
    const ln2 = Math.LN2;
    this.m = Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    this.k = Math.max(1, Math.round((this.m / n) * ln2));
    this.n = n; this.p = p;
    // Pack 2 counters per byte (4 bits each → 16 levels: 0…15).
    // ceil(m/2) bytes. So 4× memory vs. a standard bloom filter.
    this.data = new Uint8Array(Math.ceil(this.m / 2));
  }

  _positions(item)      { return getPositions(item, this.k, this.m); }

  // Read the i-th 4-bit counter. Even i → low nibble, odd → high.
  //   byte index = i >> 1 (i ÷ 2)
  //   shift      = (i & 1) * 4 (0 for even, 4 for odd)
  //   mask       = 0xF (lowest 4 bits of the shifted byte)
  _get(i) { return (this.data[i >> 1] >>> ((i & 1) * 4)) & 0xF; }

  // Write the i-th counter. Three steps: clear the target nibble,
  // mask v to 4 bits, OR it back in. Don't disturb the other nibble.
  _set(i, v) {
    const b = i >> 1, s = (i & 1) * 4;
    this.data[b] = (this.data[b] & ~(0xF << s)) | ((v & 0xF) << s);
  }

  // Increment k counters on insert.
  // Counter saturates at 15 — once there, it can never decrement.
  // This sounds bad but is actually crucial for correctness: if we
  // wrapped around (15 → 0), a later remove() could falsely zero a
  // counter that's still genuinely held by other items. Sticky
  // saturation just means the filter can't shrink as much, but
  // false negatives are still impossible.
  add(item) {
    for (const p of this._positions(item)) {
      const v = this._get(p);
      if (v < 15) this._set(p, v + 1);
    }
  }

  // CRITICAL: only call after independently verifying the item was
  // really added. has() returning true is NOT proof — that's a
  // probabilistic statement. If you decrement based on a false
  // positive, you corrupt other items' counters. Always pair this
  // with an authoritative source (DB row, Set, etc.).
  remove(item) {
    for (const p of this._positions(item)) {
      const v = this._get(p);
      // Two guards: don't underflow (0 → 15 wraparound) and don't
      // touch saturated counters (we don't know their true value).
      if (v > 0 && v < 15) this._set(p, v - 1);
    }
  }

  // "Definitely absent" iff any counter is zero. Identical
  // membership semantics to a standard filter — same FPR formula,
  // zero false negatives.
  has(item) {
    for (const p of this._positions(item)) {
      if (this._get(p) === 0) return false;
    }
    return true;
  }
}
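The nibble packing in `_get`/`_set` can likewise be checked on its own — a self-contained sketch of two 4-bit counters per byte:

```javascript
const data = new Uint8Array(4);  // room for 8 four-bit counters
const getC = i => (data[i >> 1] >>> ((i & 1) * 4)) & 0xF;
const setC = (i, v) => {
  const b = i >> 1, s = (i & 1) * 4;
  data[b] = (data[b] & ~(0xF << s)) | ((v & 0xF) << s);
};

setC(0, 3);    // low nibble of byte 0
setC(1, 12);   // high nibble of byte 0
```

Byte 0 now holds 0xC3: high nibble 12, low nibble 3 — and writing one counter never disturbs its neighbor.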

4 — ScalableBloomFilter (unknown n)

Maintains a list of standard filters, each larger and tighter than the last. When the active filter exceeds its fill threshold, append a new one. Lookups check all filters; writes go to the last one only.

Total FPR is bounded by the geometric series p₀/(1−r) where r < 1 is the per-filter tightening ratio. With r=0.9 and p₀=0.001, total FPR stays under 0.01.
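The bound is easy to check numerically — a quick sketch summing the per-filter FPR series with the figures above:

```javascript
const p = 0.01, r = 0.9;
const p0 = p * (1 - r);               // first filter targets ~0.001
// filter i targets p0 · r^i, so the total is a geometric series
let total = 0;
for (let i = 0; i < 1000; i++) total += p0 * Math.pow(r, i);
// total converges to p0 / (1 - r) = p — the overall budget holds
// even under unbounded chaining
```

Sizing the first filter at p·(1−r) is exactly what lets the chain grow forever without ever exceeding the user's requested p.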

// scalable · ~50 lines · works without a known n
class ScalableBloomFilter {
  // Almeida et al. (2007). Solve the "n is unknown" problem by
  // chaining filters: when one fills up, freeze it and start a
  // bigger, tighter one. Membership = OR across all filters.
  //
  // Key insight: total FPR is bounded by a geometric series.
  //   p_total ≤ p₀ · (1 - r)⁻¹      with r < 1
  // E.g. p₀ = 0.001, r = 0.9 → p_total ≤ 0.01.
  // The first filter is sized for `initialN` and FPR `p₀ = p·(1-r)`,
  // so even after infinite chaining, total FPR stays at p.
  constructor(initialN = 1000, p = 0.01, growthFactor = 2, tightening = 0.9) {
    this.r = tightening;            // each new filter targets r× the previous FPR
    this.growth = growthFactor;     // each new filter has ×growth more capacity
    this.threshold = 0.5;            // rotate when fill exceeds this — matches optimal-k fill
    // First sub-filter sized for the geometric-series bound to hold.
    this.filters = [new BloomFilter(initialN, p * (1 - tightening))];
  }

  // Insert into the most recent (active) filter. If it's saturated,
  // freeze it and append a new one. Older filters are read-only.
  add(item) {
    let cur = this.filters[this.filters.length - 1];
    if (cur.fillRate() > this.threshold) {
      cur = new BloomFilter(cur.n * this.growth, cur.p * this.r);
      this.filters.push(cur);
    }
    cur.add(item);
  }

  // Lookup must check ALL sub-filters — item may have been added
  // to any of them. O(number-of-sub-filters) work per query, but
  // that count grows logarithmically with total size.
  has(item) {
    for (const f of this.filters) if (f.has(item)) return true;
    return false;
  }

  get totalCount() {
    return this.filters.reduce((s, f) => s + f.count, 0);
  }
}

5 — PartitionedBloomFilter

Divides m bits into k equal partitions; hash function i can only set a bit in partition i. Guarantees that exactly k bits are set per item (a standard filter sets fewer when two hashes collide). FPR is essentially identical to a standard filter for large m; the practical advantage is cache-line alignment per partition.
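Plugging in concrete numbers makes the layout tangible — a sketch of the sizing arithmetic for n = 10,000 at p = 1%, using the same formulas the constructor applies:

```javascript
const n = 10000, p = 0.01, ln2 = Math.LN2;
const mTotal = Math.ceil(-n * Math.log(p) / (ln2 * ln2));  // ~95,851 bits
const k = Math.max(1, Math.round((mTotal / n) * ln2));     // 7 strips
const partitionSize = Math.ceil(mTotal / k);               // ~13,693 bits each
// each strip here is ~1.7 kB — well over one 512-bit cache line, so
// the per-line alignment benefit only kicks in for much smaller filters
// (or if you shard one big filter into many small partitioned ones)
```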

// partitioned · ~40 lines
class PartitionedBloomFilter {
  // Same total size m as a standard filter, but split into k equal
  // strips. Hash function i can ONLY set a bit in strip i.
  //
  // Why partition?
  //   1. Predictability: every insert touches exactly k bits (a
  //      standard filter sets fewer when two hashes collide on the
  //      same bit, which complicates fill-rate accounting).
  //   2. Cache locality: size each strip to fit one cache line
  //      (≤ 512 bits = 64 B) and each of the k probes touches a
  //      known strip — at most k lines per item, predictably.
  //   3. Easier to parallelise: each strip is independent.
  constructor(n, p = 0.01) {
    const ln2 = Math.LN2;
    const mTotal = Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    this.k = Math.max(1, Math.round((mTotal / n) * ln2));
    this.partitionSize = Math.ceil(mTotal / this.k);  // bits per partition
    this.m = this.partitionSize * this.k;            // rounded total ≥ mTotal
    this.partitions = Array.from({length: this.k},
      () => new Uint8Array(Math.ceil(this.partitionSize / 8)));
  }

  // Returns k positions, each modulo partitionSize (NOT m).
  // Same Kirsch-Mitzenmacher trick as before.
  _hashes(item) {
    const s = String(item);
    const h1 = murmur3(s, 0x9747b28c);
    const h2 = murmur3(s, 0xe17a1465) | 1;
    return Array.from({length: this.k}, (_, i) =>
      ((h1 + i * h2) % this.partitionSize + this.partitionSize) % this.partitionSize);
  }

  add(item) {
    const hs = this._hashes(item);
    for (let i = 0; i < this.k; i++) {
      // Set bit hs[i] in partition i — bit-twiddle is identical
      // to the standard filter's _set, just scoped to one strip.
      this.partitions[i][hs[i] >>> 3] |= (1 << (hs[i] & 7));
    }
  }

  has(item) {
    const hs = this._hashes(item);
    for (let i = 0; i < this.k; i++) {
      if (!(this.partitions[i][hs[i] >>> 3] & (1 << (hs[i] & 7)))) return false;
    }
    return true;
  }
}

6 — CuckooFilter

A different data structure for the same problem. Stores small fingerprints in buckets; each item has two candidate buckets, related by an XOR of the fingerprint's hash. The XOR property is what lets you delete without ever storing the original key.

~3× more space-efficient than a counting filter at the same FPR, but worst-case insert can fail (rare with proper sizing — handle by rebuilding into a larger filter or expanding bucket size).
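The XOR trick is worth seeing in two lines. A self-contained sketch (the `fpHash` mixer is a stand-in for illustration, not the article's murmur3):

```javascript
const mask = (1 << 10) - 1;                            // 1024 buckets
const fpHash = fp => Math.imul(fp, 0x5bd1e995) >>> 0;  // stand-in mixer
const alt = (b, fp) => (b ^ fpHash(fp)) & mask;

const fp = 0x5a, b1 = 712;
const b2 = alt(b1, fp);   // partner bucket — depends only on b1 and fp
// alt(b2, fp) === b1: XOR is self-inverse, so either bucket plus the
// fingerprint recovers the other. No original key needed — which is
// precisely what makes kick chains and deletes possible.
```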

// cuckoo · ~110 lines · deletion + space-efficient
class CuckooFilter {
  // Fan, Andersen, Kaminsky, Mitzenmacher (2014). A radically
  // different design that beats bloom filters on space at the
  // same FPR while supporting O(1) deletes natively.
  //
  // Storage:  numBuckets × bucketSize × fpBits  bits.
  //   - 95% load factor possible (vs. 50% for optimal bloom).
  //   - Each item = one fpBits-bit fingerprint (default 8) — much
  //     smaller than storing the key, but enough to discriminate
  //     between items hashing to the same bucket.
  constructor(capacity = 10000, fpBits = 8, bucketSize = 4, maxKicks = 500) {
    this.bucketSize = bucketSize;
    this.fpBits = fpBits;
    this.maxKicks = maxKicks;
    // We aim for ≤ 95% load. Round up to a power of 2 so we can
    // mask instead of mod (much faster on hot paths).
    const needed = Math.ceil(capacity / bucketSize / 0.95);
    this.numBuckets = 1;
    while (this.numBuckets < needed) this.numBuckets *= 2;
    this.mask = this.numBuckets - 1;       // (b & mask) == (b mod numBuckets)
    this.fpMask = (1 << fpBits) - 1;
    // Each bucket holds bucketSize fingerprints. 0 represents
    // "empty" — so fingerprints must be non-zero.
    this.buckets = Array.from({length: this.numBuckets},
      () => new Array(bucketSize).fill(0));
  }

  // Compute an fpBits-bit fingerprint of the item.
  // h % fpMask lands in [0, fpMask-1]; the +1 shifts that to
  // [1, fpMask], keeping 0 reserved as the "empty slot" marker.
  _fp(item) {
    const h = murmur3(String(item), 0xface);
    return (h % this.fpMask) + 1;
  }

  // Primary bucket: a different hash from _fp, masked to numBuckets.
  _b1(item) {
    return murmur3(String(item), 0xb0f) & this.mask;
  }

  // THE MAGIC: alternate bucket = b XOR hash(fp).
  // XOR is its own inverse, so applying _alt twice with the same
  // fp returns to the start: _alt(_alt(b, fp), fp) === b.
  // This is what lets us find an item's "other" bucket given only
  // ITS FINGERPRINT — we never store or recover the original key.
  // That's how cuckoo filters delete without remembering keys.
  _alt(b, fp) {
    return (b ^ murmur3(String(fp), 0xb0f)) & this.mask;
  }

  // Slot a fingerprint into the first empty position in bucket b.
  // O(bucketSize) — usually 4 — so effectively constant.
  _tryInsert(b, fp) {
    for (let i = 0; i < this.bucketSize; i++) {
      if (this.buckets[b][i] === 0) {
        this.buckets[b][i] = fp;
        return true;
      }
    }
    return false;
  }

  // Insert: try b1 first, then bAlt. If both are full, randomly
  // evict an existing fp and put it in ITS alt bucket. Repeat
  // until something fits or we hit maxKicks (rebuild signal).
  add(item) {
    const fp = this._fp(item);
    let b = this._b1(item);
    if (this._tryInsert(b, fp)) return true;
    let bAlt = this._alt(b, fp);
    if (this._tryInsert(bAlt, fp)) return true;

    // Both candidate buckets full — start a kick chain. Random
    // initial pick keeps load balanced. Each iteration evicts a
    // random fp from the current bucket and tries to relocate it
    // to ITS alternate. With 4-slot buckets and load ≤ 95%, kick
    // chains terminate in < 500 iterations with overwhelming prob.
    let curB = Math.random() < 0.5 ? b : bAlt;
    let curFp = fp;
    for (let i = 0; i < this.maxKicks; i++) {
      const slot = Math.floor(Math.random() * this.bucketSize);
      // Swap: kick the existing fp out, hold it in curFp.
      [this.buckets[curB][slot], curFp] = [curFp, this.buckets[curB][slot]];
      curB = this._alt(curB, curFp);
      if (this._tryInsert(curB, curFp)) return true;
    }
    // Failed → filter is "full". Caller should resize: allocate
    // a 2× CuckooFilter and re-insert all live fps from this one.
    return false;
  }

  // Membership: an item lives in either b1 or _alt(b1, fp). Check
  // both. Worst-case 2 × bucketSize = 8 fingerprint comparisons.
  // FPR = 2 · bucketSize / 2^fpBits (factor 2 = the two buckets).
  // With fpBits=8, bucketSize=4 → FPR ≈ 0.031 (3.1%).
  // Want better? Increase fpBits — every extra bit halves FPR.
  has(item) {
    const fp = this._fp(item), b = this._b1(item);
    if (this.buckets[b].includes(fp)) return true;
    return this.buckets[this._alt(b, fp)].includes(fp);
  }

  // Delete: clear the first matching fp from either candidate bucket.
  // Caveat: if a different item shares both the fingerprint and a
  // bucket, this can clear THAT item's entry instead — a later query
  // for it would then return false, a genuine false negative.
  // Bottom line: as with the counting filter, only delete items
  // you've verified independently.
  remove(item) {
    const fp = this._fp(item);
    const b1 = this._b1(item);
    // Check both candidate buckets directly. (Using 0 as a sentinel
    // for "the alt bucket" would be a bug — 0 is a valid bucket index,
    // and an item whose primary bucket IS 0 would never be found.)
    for (const b of [b1, this._alt(b1, fp)]) {
      const idx = this.buckets[b].indexOf(fp);
      if (idx !== -1) {
        this.buckets[b][idx] = 0;
        return true;
      }
    }
    return false;
  }
}

7 — RotatingBloomFilter (sliding window)

For time-windowed deduplication. Two filters alternate active/passive on a rotation schedule. Lookups check both. Memory is always exactly 2× a single filter; any item is "remembered" for at least one window and at most two.
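The rotation logic itself is independent of bloom filters — a minimal sketch with plain `Set`s standing in for the two filters, rotating manually instead of on a timer:

```javascript
let active = new Set(), previous = new Set();
const rotate = () => { previous = active; active = new Set(); };
const add  = x => active.add(x);
const seen = x => active.has(x) || previous.has(x);

add('a');
rotate();   // 'a' ages into the previous window
add('b');
// both are still remembered: 'a' for one more rotation, 'b' for up to two
```

After one more `rotate()`, `'a'` is forgotten and `'b'` survives — exactly the "at least one window, at most two" retention guarantee.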

// rotating · ~30 lines · sliding window dedup
class RotatingBloomFilter {
  // Sliding-window dedup: "have I seen this in the last N minutes?"
  // Two filters tag-team. New items go to active; lookups check
  // both. When the window expires, active becomes previous and
  // previous is dropped, giving a clean filter to fill.
  //
  // Memory cost: 2× a single filter — constant regardless of
  // throughput.
  // Retention: any item is remembered for at least one window
  // (just inserted) and at most two (inserted right before rotation).
  constructor(windowMs, n, p = 0.01) {
    this.windowMs = windowMs;
    this.n = n;
    this.p = p;
    this.active = new BloomFilter(n, p);
    this.previous = new BloomFilter(n, p);
    this.rotatedAt = Date.now();
  }

  // Lazy rotation: fires on first call after windowMs elapses.
  // No background timer needed — works in any sync runtime.
  // Trade-off: a quiet filter rotates only on the next visit.
  // For strict timing, schedule rotation via setInterval and call
  // _maybeRotate() from there too.
  _maybeRotate() {
    if (Date.now() - this.rotatedAt > this.windowMs) {
      this.previous = this.active;
      this.active = new BloomFilter(this.n, this.p);
      this.rotatedAt = Date.now();
    }
  }

  add(item) {
    this._maybeRotate();
    this.active.add(item);          // writes only ever go to active
  }

  // Lookup checks BOTH filters: an item added in the previous
  // window is still "remembered" until that filter ages out.
  has(item) {
    this._maybeRotate();
    return this.active.has(item) || this.previous.has(item);
  }
}

8 — Test harness: measureFPR

The most useful two dozen lines you'll write. Insert n items, query 100,000 you didn't insert, count "maybe present" responses. Measured FPR should land within roughly ±10% of theoretical for a well-sized filter at these sample counts.
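How many probes are enough? The sampling error of a measured FPR follows the usual binomial formula — a quick sketch with the numbers used here:

```javascript
// Standard error of a measured proportion: sqrt(p(1-p)/samples).
const p = 0.01, samples = 100_000;
const se = Math.sqrt(p * (1 - p) / samples);
const rel95 = 1.96 * se / p;   // relative half-width of the 95% CI
// ≈ 0.06 → 100k probes pin a 1% FPR to within roughly ±6%;
// for p = 0.001 you'd need ~1M probes for the same relative precision
```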

// harness · ~25 lines · always run before deploying
// The single most useful diagnostic for a bloom filter. Theory
// says FPR ≈ p; reality often differs because of hash quality,
// non-random keys, or under-sized m. Always run before deploying.
//
// How it works:
//   Phase 1: insert n DISTINCT items.
//   Phase 2: query `samples` items GUARANTEED to not be inserted.
//   Every "true" return is a false positive (zero false negatives
//   are possible by definition). Divide by samples → measured FPR.
//
// Convention: name inserted items "item:0", "item:1", … and probe
// items "notitem:0", "notitem:1", … so namespaces never overlap.
function measureFPR(BFClass, n, p, samples = 100_000) {
  const bf = new BFClass(n, p);

  // Phase 1: warm up the filter to the operating point.
  for (let i = 0; i < n; i++) bf.add(`item:${i}`);

  // Phase 2: 100k probes is enough to measure 1% FPR within ±10%
  // (95% CI). For p ≤ 0.01 you may want samples ≥ 1M for a clean
  // estimate.
  let fp = 0;
  for (let i = 0; i < samples; i++) {
    if (bf.has(`notitem:${i}`)) fp++;
  }

  const measured = fp / samples;
  const ratio = measured / p;
  return {
    measured,
    theoretical: p,
    ratio,
    // The 0.5–2.0 band is deliberately loose — far wider than the
    // Monte Carlo noise at these sample counts. A ratio > 2
    // reliably signals a problem: bad hash, wrong m, or correlated
    // input keys. A ratio < 0.5 usually means an under-filled
    // filter (n_actual << n).
    pass: ratio >= 0.5 && ratio <= 2.0,
    m: bf.m,
    k: bf.k,
  };
}

// Smoke test for the standard filter — should print PASS.
const r = measureFPR(BloomFilter, 10000, 0.01);
console.log(`measured ${(r.measured*100).toFixed(3)}% vs ${(r.theoretical*100).toFixed(3)}% (ratio ${r.ratio.toFixed(2)}) — ${r.pass ? 'PASS' : 'FAIL'}`);

Production notes

  • Hash quality matters. The implementations above use MurmurHash3. xxHash64 would be faster on 64-bit hardware; both have excellent distribution. Avoid CRC32, FNV-1a, or anything cryptographic.
  • Pre-size if possible. Estimate n on the high side. A filter at fill rate 0.5 gives optimal FPR; at 0.7+ FPR climbs steeply.
  • Cache locality. For very large filters, reorganize so positions accessed for a single item all fall within one cache line (64 bytes / 512 bits). Partitioned filters do this naturally if you size each partition to ≤ 512 bits.
  • Concurrency. A standard filter is monotonic — adding never clears bits. That makes add safe to run concurrently from multiple threads (lost-update is acceptable when both updates set the same bit). Counting and cuckoo filters are not monotonic and need locks.
  • Always benchmark. The gap between theory and practice for hashing speed alone can be 10×. Use the harness above before deploying.