Daqua is a data quality measurement tool. It can connect to any data source that contains tabular data, profile that data, and compute various data quality indicators to support data cleaning.
To use daqua, install it with pip:
pip install daqua
daqua requires numpy and pandas to be installed on your system.
# importing the package
import daqua
# Creating an object of the Daqua class
dq = daqua.Daqua()
# reading an excel file
dq.read_excel(path_to_excel)
# get the dictionary of dataframes
# the structure is {"sheet_name": df}
dict_dfs = dq.getDataFrames()
# get meta data about the dataframe
meta = dq.getMetaData()
# get detailed metadata
detailed_meta = dq.getDetailedMeta()
# get the descriptive statistics for numeric columns
descriptive_stat_dict = dq.getNumericDesc()
# get quantile stats
quant_dict = dq.getQuantileStat()
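
To give a feel for the kind of per-sheet results the calls above deal with, here is a minimal sketch built on plain pandas only. It does not use or reproduce daqua's implementation: the helper names `numeric_desc`, `quantile_stat`, and `missing_share`, the sample data, and the shape of the returned dictionaries are all assumptions made for illustration. It only mirrors the `{"sheet_name": df}` structure described above.

```python
import pandas as pd

# Hypothetical stand-in for a workbook, using the {"sheet_name": df} structure.
dict_dfs = {
    "orders": pd.DataFrame({"qty": [1, 2, 3, 4], "item": ["a", "b", "c", "d"]}),
    "prices": pd.DataFrame({"price": [10.0, 20.0, None, 40.0]}),
}


def numeric_desc(dfs):
    """Descriptive statistics for the numeric columns of each sheet (assumed helper)."""
    return {
        name: df.select_dtypes(include="number").describe()
        for name, df in dfs.items()
    }


def quantile_stat(dfs, qs=(0.25, 0.5, 0.75)):
    """Selected quantiles for the numeric columns of each sheet (assumed helper)."""
    return {
        name: df.select_dtypes(include="number").quantile(list(qs))
        for name, df in dfs.items()
    }


def missing_share(dfs):
    """Fraction of missing values per column, a simple data quality indicator."""
    return {name: df.isna().mean() for name, df in dfs.items()}


desc = numeric_desc(dict_dfs)
quant = quantile_stat(dict_dfs)
missing = missing_share(dict_dfs)

print(desc["orders"])
print(quant["orders"])
print(missing["prices"])
```

Keeping one result per sheet name makes it straightforward to look up the statistics for a given sheet, which is why the sketch returns dictionaries keyed the same way as the input.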
Link to PyPI: https://pypi.org/project/daqua/
Please read [CONTRIBUTING.md](link to contri page) for details on our code of conduct and the process for submitting pull requests.
- Satya Pati - Initial work - Github
- Lalit Moharana - Initial work - Github
- Soumya Ranjan Bisoi - Initial work - Github
See also the list of contributors who participated in this project.
This project is licensed under the MIT License. See the LICENSE.md file for details.