Pinned Repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
awesome-bigdata
A curated list of awesome Big Data technologies, documentation, and analytics tools.
CDM
The Common Data Model (CDM) is a standard and extensible collection of schemas (entities, attributes, relationships) that represents business concepts and activities with well-defined semantics, to facilitate data interoperability. Examples of entities include: Account, Contact, Lead, Opportunity, Product, etc.
charts
Curated applications for Kubernetes
compliance-trestle
An opinionated tooling platform for managing compliance as code, using continuous integration and NIST's OSCAL standard.
dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark.
egeria
Open Metadata and Governance
fides
Privacy as Code for your CI and runtime environment
mehdichara's Repositories
mehdichara/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
mehdichara/awesome-bigdata
A curated list of awesome Big Data technologies, documentation, and analytics tools.
mehdichara/CDM
The Common Data Model (CDM) is a standard and extensible collection of schemas (entities, attributes, relationships) that represents business concepts and activities with well-defined semantics, to facilitate data interoperability. Examples of entities include: Account, Contact, Lead, Opportunity, Product, etc.
mehdichara/charts
Curated applications for Kubernetes
mehdichara/compliance-trestle
An opinionated tooling platform for managing compliance as code, using continuous integration and NIST's OSCAL standard.
mehdichara/dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
mehdichara/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
mehdichara/dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark.
mehdichara/egeria
Open Metadata and Governance
mehdichara/fides
Privacy as Code for your CI and runtime environment
mehdichara/flaskapi
A sample Python/Flask API for the wine and cheese pairing app
mehdichara/incubator-gobblin
Gobblin is a distributed big data integration framework (ingestion, replication, compliance, retention) for batch and streaming systems. Gobblin features integrations with Apache Hadoop, Apache Kafka, Salesforce, S3, MySQL, Google, etc.
mehdichara/kubernetes
Production-Grade Container Scheduling and Management
mehdichara/mkdocs
Project documentation with Markdown.
mehdichara/ontology-development-kit
Bootstrap an OBO Library ontology
mehdichara/spark-cassandra-connector
DataStax Spark Cassandra Connector
mehdichara/terraform-aws-iam
Terraform module which creates IAM resources on AWS
mehdichara/terraforming
Export existing AWS resources to Terraform style (tf, tfstate)
mehdichara/test-github-packages