Getting Error name 'isComplete' is not defined while running deequ code in Azure Databricks
dilkushpatel opened this issue · 4 comments
I'm trying to implement basic checks on the columns of a table that is in SQL Azure DW.
Everything up to reading the data works fine.
I can also run ConstraintSuggestionRunner.
When I run VerificationSuite with a single isComplete check, it gives an error.
Error:
name 'isComplete' is not defined
Code:
import sagemaker_pyspark
import pydeequ
from pyspark.sql import SparkSession
from pydeequ.analyzers import *
from pydeequ.checks import *
from pydeequ.verification import *
from pydeequ.anomaly_detection import *
classpath = ":".join(sagemaker_pyspark.classpath_jars())
spark = (SparkSession
    .builder
    .config("spark.driver.extraClassPath", classpath)
    .config("spark.jars.packages", pydeequ.deequ_maven_coord)
    .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
    .getOrCreate())
check = Check(spark, CheckLevel.Error, "Data QC")
checkResult = VerificationSuite(spark) \
    .onData(df) \
    .addCheck(isComplete("month_id")).run()
checkResult_df = VerificationResult.checkResultsAsDataFrame(spark, checkResult)
checkResult_df.show()
I searched Google but did not find anything relevant.
The same error occurs with any other check as well.
Change
checkResult = VerificationSuite(spark) \
    .onData(df) \
    .addCheck(
        isComplete("month_id")
    ) \
    .run()
to
checkResult = VerificationSuite(spark) \
    .onData(df) \
    .addCheck(
        check.isComplete("month_id")
    ) \
    .run()
See full code example here: https://github.com/awslabs/python-deequ#constraint-verification
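Why the bare call fails can be shown without Spark. The sketch below uses a hypothetical stand-in class, FakeCheck, which is not pydeequ's real Check. It only assumes that isComplete is a method defined on the class, as the fix above implies, and that it returns the check so calls can be chained. A method is reachable only through an object, so looking it up as a bare name raises NameError.

```python
# Hypothetical stand-in for pydeequ's Check; not the real class.
class FakeCheck:
    def __init__(self, description):
        self.description = description
        self.constraints = []

    def isComplete(self, column):
        # Record the constraint and return self so calls can be chained
        # (chaining is an assumption made for this sketch).
        self.constraints.append(("isComplete", column))
        return self


def call_bare_name():
    # isComplete lives on the class, not in the module namespace,
    # so looking it up as a bare name fails.
    try:
        isComplete("month_id")
    except NameError as err:
        return str(err)


bare_error = call_bare_name()

# Going through an instance resolves the method.
check = FakeCheck("Data QC")
result = check.isComplete("month_id")

print(bare_error)
print(result.constraints)
```

The same reasoning applies to star imports such as from pydeequ.checks import *: they bring class names into scope, not the methods defined on those classes.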
Interesting!
I was actually trying that, but I still get an error.
Error:
Check.isComplete() missing 1 required positional argument: 'column'
Code:
checkResult = VerificationSuite(spark) \
    .onData(df) \
    .addCheck(Check.isComplete("month_id")).run()
Ignore that.
I changed Check to check and that worked.
Thanks.
Thanks for confirming.
Since you have the following line,
check = Check(spark, CheckLevel.Error, "Data QC")
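check.isComplete is correct, as opposed to Check.isComplete: isComplete is a method, so it has to be called on the check instance rather than on the Check class.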
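The "missing 1 required positional argument" error can be reproduced in plain Python. This is a minimal sketch with a hypothetical stand-in class, FakeCheck, not pydeequ's real Check. It assumes only that isComplete is an ordinary instance method taking a column name.

```python
# Hypothetical stand-in for pydeequ's Check; not the real class.
class FakeCheck:
    def __init__(self, description):
        self.description = description

    def isComplete(self, column):
        # Return a simple marker instead of building a real constraint.
        return ("isComplete", column)


def call_on_class():
    # Called on the class, "month_id" is bound to self,
    # which leaves the column parameter unfilled.
    try:
        FakeCheck.isComplete("month_id")
    except TypeError as err:
        return str(err)


class_error = call_on_class()

# Called on an instance, Python supplies self automatically,
# so "month_id" is bound to column as intended.
check = FakeCheck("Data QC")
constraint = check.isComplete("month_id")

print(class_error)
print(constraint)
```

Because Python is case sensitive, Check (the class) and check (the instance created from it) are different names, which is why changing only the capitalization resolved the error.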