clarite

CLeaning to Analysis: Reproducibility-based Interface for Traits and Exposures


NOTE

There is now a python version of CLARITE that is more actively developed. See documentation here.


Overview

The goal of clarite is to guide a dataset from the “raw” data stage to EWAS analysis and subsequent visualization of results. The package is designed to lead a user through the stages of data cleaning: from generating descriptive statistics, to making QC decisions informed by the descriptive statistics, to running analyses on the filtered dataset and visualizing the results.

Installation

A development version of the package can be installed using devtools.

devtools::install_github('HallLab/clarite')

Example Workflow

The following image depicts a typical workflow for a project from raw data stage to analysis, in this case an Environment-Wide Association Study, and results visualization, all of which can be performed using the clarite package. The user starts with raw data and alternates filtering (dark boxes) or summary steps (light boxes) until it is sufficiently “cleaned” and in a stage where analyses can be run.

Image

Questions

If you have any questions not answered by the documentation, feel free to open an Issue.

Citing CLARITE

  1. Lucas AM, et al (2019) CLARITE facilitates the quality control and analysis process for EWAS of metabolic-related traits. Frontiers in Genetics: 10, 1240

  2. Passero K, et al (2020) Phenome-wide association studies on cardiovascular health and fatty acids considering phenotype quality control practices for epidemiological data. Pacific Symposium on Biocomputing: 25, 659