/frequency_moments

Formal verification of randomized streaming algorithms for frequency moments.

Primary LanguageTeX

Frequency Moments

Isabelle is an interactive theorem prover, which can be used to verify mathematical theorems or the correctness of algorithms, protocols or hardware. This repository contains the formal verification of randomized, approximate space-efficient streaming algorithms for frequency moments with Isabelle/HOL.

Frequency moments can for example be used to determine the number of distinct elements, the skew of the rank-size distribution of the data stream and several statistical dispersion measures.

Summary of verified algorithms:

  • Approximation of F_0 with space complexity: O( ln(1/ε) (ln n + 1/δ² (ln (1/δ) + ln ln n)))
  • Approximation of F_2 with space complexity: O( ln(1/ε) (1/δ²) (ln n + ln m))
  • Approximation of F_k for k >= 3 with space complexity: O( ln(1/ε) (1/δ²) (ln n + ln m) k n^(1-1/k))

where

  • n is the size of the universe of the stream elments,
  • m is the length of the stream,
  • δ is the required relative accuracy (0 < δ < 1),
  • ε is the maximum failure probability (0 < ε < 1).

A more up to date version of this work is available at the Archive of Formal Proofs:

Verification

If the following is available and set up:

# Clone this repo
git clone https://github.com/ekarayel/frequency_moments.git

# Plain Verification (requires 30 min)
isabelle build -D frequency_moments/thys/

A guide for verification if Isabelle is not pre-installed is available here.