genome-analysis
There are 164 repositories under the genome-analysis topic.
biocommons/hgvs
Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
marbl/Winnowmap
Long read / genome alignment software
mbhall88/rasusa
Randomly subsample sequencing reads
MAGICS-LAB/DNABERT_2
[ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome
asntech/intervene
Intervene: a tool for intersection and visualization of multiple genomic region and gene sets
TobyBaril/EarlGrey
Earl Grey: A fully automated TE curation and annotation pipeline
mcveanlab/mccortex
De novo genome assembly and multisample variant calling
fmalmeida/bacannot
Generic but comprehensive pipeline for prokaryotic genome annotation and interrogation with interactive reports and a Shiny app.
kishwarshafin/helen
H.E.L.E.N. (Homopolymer Encoded Long-read Error-corrector for Nanopore)
ganlab/GALA
Long-reads Gap-free Chromosome-scale Assembler
lynnlangit/TeamTeri
Bioinformatics on GCP, AWS or Azure
moshi4/COGclassifier
A tool for classifying prokaryote protein sequences into COG (Cluster of Orthologous Genes) functional categories
biocommons/biocommons.seqrepo
Non-redundant, compressed, journalled, file-based storage for biological sequences
CMU-SAFARI/RawHash
RawHash is the first mechanism that can accurately and efficiently map raw nanopore signals to large reference genomes (e.g., a human reference genome) in real time without using powerful computational resources (e.g., GPUs). Described by Firtina et al. (published at https://academic.oup.com/bioinformatics/article/39/Supplement_1/i297/7210440)
CMU-SAFARI/BLEND
BLEND is a mechanism that can efficiently find fuzzy seed matches between sequences to significantly improve performance and accuracy while reducing the memory usage of two important applications: 1) finding overlapping reads and 2) read mapping. Described by Firtina et al. (published in NARGAB https://doi.org/10.1093/nargab/lqad004)
ilarsf/gwasTools
Basic and fast GWAS functions for QQ and Manhattan plots (incl. gene names)
brandonsaldan/codex
A minimal genetic data explorer that processes all genetic information locally.
MiraldiLab/maxATAC
Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks
pievos101/PopGenome
An Efficient Swiss Army Knife for Population Genomic Analyses in R
Zhuxitong/ppsPCP
A Plant Presence/absence Variants Scanner and Pan-genome Construction Pipeline
NBChub/bgcflow
Snakemake workflow for the analysis of biosynthetic gene clusters across large collections of genomes (pangenomes)
pmenzel/score-assemblies
Snakemake workflow for scoring and comparing multiple bacterial genome assemblies (Illumina, Nanopore) to reference genome(s)
cccnrc/plot-VCF
Visual analysis of your VCF files
biocommons/bioutils
Provides common tools and lookup tables used primarily by the hgvs and uta packages
Genometric/MSPC
Using combined evidence from replicates to evaluate ChIP-seq peaks
robinvanderlee/positive-selection
Scripts and procedures for detecting positively selected genes and codons in primates
wiedenhoeft/HaMMLET
Fast Bayesian Hidden Markov Model with Wavelet Compression
RayDebashree/metaUSAT
metaUSAT is a data-adaptive statistical approach for testing genetic associations of multiple traits from single/multiple studies using univariate GWAS summary statistics.
SouradiptoC/CodonU
A Python project for codon usage analysis of genes or genomes
leaemiliepradier/PlasForest
A random forest classifier to identify contigs of plasmid origin in contig and scaffold genomes
WGLab/NanoRepeat
NanoRepeat: fast and accurate analysis of Short Tandem Repeats (STRs) from Oxford Nanopore sequencing data
AmpliconSuite/AmpliconReconstructorOM
Reconstructs complex variation using Bionano optical mapping data and breakpoint graph data
boulderrinnlab/CLASS_2021
Statistical and computational analysis of the human genome
biocommons/anyvar
[in development] Proof-of-Concept variation translation, validation, and registration service
CMU-SAFARI/Genome-on-Diet
Genome-on-Diet is a fast and memory-frugal framework for exemplifying sparsified genomics for read mapping, containment search, and metagenomic profiling. It is much faster and more memory-efficient than minimap2 for Illumina, HiFi, and ONT reads. Described by Alser et al. (preliminary version: https://arxiv.org/abs/2211.08157).
genenotebook/genenotebook
A collaborative notebook for genes and genomes