Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
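
To make the idea of "unit tests for data" concrete, here is a minimal sketch in plain Python. It does not use Deequ's API or Spark. The names `Constraint`, `completeness`, `uniqueness`, `size`, and `run_checks` are hypothetical helpers invented for this illustration, and the metric definitions are simplified assumptions. The pattern it shows is the general one: compute a metric over the dataset, then assert something about that metric.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# A row is a plain dict here; in a real Spark setting this would be a DataFrame row.
Row = Dict[str, Optional[object]]


@dataclass
class Constraint:
    """Hypothetical helper: a named metric plus an assertion on its value."""
    name: str
    metric: Callable[[List[Row]], float]
    assertion: Callable[[float], bool]


def size(rows: List[Row]) -> float:
    """Number of rows in the dataset."""
    return float(len(rows))


def completeness(column: str) -> Callable[[List[Row]], float]:
    """Fraction of rows whose value in `column` is not None."""
    def metric(rows: List[Row]) -> float:
        if not rows:
            return 1.0
        return sum(1 for r in rows if r.get(column) is not None) / len(rows)
    return metric


def uniqueness(column: str) -> Callable[[List[Row]], float]:
    """Fraction of non-null values in `column` that occur exactly once
    (a simplified definition chosen for this sketch)."""
    def metric(rows: List[Row]) -> float:
        values = [r.get(column) for r in rows if r.get(column) is not None]
        if not values:
            return 1.0
        counts = Counter(values)
        return sum(1 for v in values if counts[v] == 1) / len(values)
    return metric


def run_checks(rows: List[Row], constraints: List[Constraint]) -> Dict[str, bool]:
    """Evaluate every constraint and report pass/fail by constraint name."""
    return {c.name: c.assertion(c.metric(rows)) for c in constraints}


# Toy dataset with one missing value in the "name" column.
rows: List[Row] = [
    {"id": 1, "name": "Thingy A"},
    {"id": 2, "name": None},
    {"id": 3, "name": "Thingy C"},
]

constraints = [
    Constraint("size is 3", size, lambda v: v == 3),
    Constraint("id is complete", completeness("id"), lambda v: v == 1.0),
    Constraint("id is unique", uniqueness("id"), lambda v: v == 1.0),
    Constraint("name is complete", completeness("name"), lambda v: v == 1.0),
]

results = run_checks(rows, constraints)
for name, passed in results.items():
    print(f"{name}: {'passed' if passed else 'FAILED'}")
```

In this toy run, only the "name is complete" constraint fails, because one of the three rows has no name. A library built on Spark would compute such metrics in a distributed way over much larger datasets rather than over an in-memory list.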