
A repo to collect a list of useful tools and resources for data analysis


1.1) WGS/WXS community reference samples for benchmarking:

A resource of paired tumor-normal reference samples profiled by whole-genome sequencing (WGS) and whole-exome sequencing (WES), using sixteen library protocols and seven sequencing platforms at six different centers. ref and ref

Reference files used by the GDC (Genomic Data Commons) data harmonization and generation pipelines, provided so that GDC pipeline analyses can be reproduced.

2) Single-cell omics

2.1) MIDAS

MIDAS (mosaic integration and knowledge transfer) is a deep probabilistic framework designed for integrating single-cell datasets generated by multiple omics technologies. Key functionalities include dimensionality reduction, batch correction, self-supervised modality alignment, and information-theoretic latent disentanglement. Nature Biotechnology, Jan 2024

2.2) SEACells

Infers metacells from scRNA-seq data. Nature Biotechnology, 2023

3) Transcriptome

4) Epigenome

5) General

5.1) causalnex:

CausalNex is a Python library that uses Bayesian Networks to combine machine learning and domain expertise for causal reasoning. You can use CausalNex to uncover structural relationships in your data, learn complex distributions, and observe the effect of potential interventions.
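
To make the difference between observing and intervening concrete, here is a minimal sketch in plain Python. It does not use the CausalNex API; the three-node network (Z -> X, Z -> Y, X -> Y), the variable names, and every probability in it are made-up assumptions chosen only for illustration.

```python
from itertools import product

# Hypothetical conditional probability tables for a binary network
# with edges Z -> X, Z -> Y and X -> Y. All numbers are invented.
P_Z = {0: 0.5, 1: 0.5}
P_X_GIVEN_Z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}  # P_X_GIVEN_Z[z][x]
P_Y1_GIVEN_XZ = {  # P(Y=1 | X=x, Z=z), keyed by (x, z)
    (0, 0): 0.1,
    (1, 0): 0.3,
    (0, 1): 0.5,
    (1, 1): 0.7,
}


def joint(z, x, y):
    """Joint probability P(Z=z, X=x, Y=y) from the network factorization."""
    p_y1 = P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * P_X_GIVEN_Z[z][x] * (p_y1 if y == 1 else 1.0 - p_y1)


def p_y1_given_x(x):
    """Observational query P(Y=1 | X=x), computed by enumeration."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def p_y1_do_x(x):
    """Interventional query P(Y=1 | do(X=x)).

    Setting X by intervention removes the Z -> X edge, so Z keeps its
    marginal distribution instead of being updated by the value of X.
    """
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


if __name__ == "__main__":
    observed = p_y1_given_x(1) - p_y1_given_x(0)
    causal = p_y1_do_x(1) - p_y1_do_x(0)
    print(f"observed difference: {observed:.2f}")
    print(f"effect under intervention: {causal:.2f}")
```

With these assumed numbers the observed association between X and Y is larger than the effect under intervention, because Z influences both. Separating the two is the kind of question a Bayesian network library is meant to answer on real data.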