genome-analysis

There are 164 repositories under the genome-analysis topic.

  • biocommons/hgvs

    Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`

    Language: Python
  • marbl/Winnowmap

    Long read / genome alignment software

    Language: C
  • mbhall88/rasusa

    Randomly subsample sequencing reads

    Language: Rust
  • MAGICS-LAB/DNABERT_2

    [ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome

    Language: Shell
  • asntech/intervene

    Intervene: a tool for intersection and visualization of multiple genomic region and gene sets

    Language: Python
  • TobyBaril/EarlGrey

    Earl Grey: A fully automated TE curation and annotation pipeline

    Language: Python
  • mcveanlab/mccortex

    De novo genome assembly and multisample variant calling

    Language: C
  • fmalmeida/bacannot

    Generic but comprehensive pipeline for prokaryotic genome annotation and interrogation with interactive reports and shiny app.

    Language: Nextflow
  • kishwarshafin/helen

    H.E.L.E.N. (Homopolymer Encoded Long-read Error-corrector for Nanopore)

    Language: Python
  • ganlab/GALA

    Long-reads Gap-free Chromosome-scale Assembler

    Language: Python
  • lynnlangit/TeamTeri

    Bioinformatics on GCP, AWS or Azure

    Language: Shell
  • moshi4/COGclassifier

    A tool for classifying prokaryote protein sequences into COG (Cluster of Orthologous Genes) functional categories

    Language: Python
  • biocommons/biocommons.seqrepo

    non-redundant, compressed, journalled, file-based storage for biological sequences

    Language: Python
  • CMU-SAFARI/RawHash

    RawHash is the first mechanism that can accurately and efficiently map raw nanopore signals to large reference genomes (e.g., a human reference genome) in real-time without using powerful computational resources (e.g., GPUs). Described by Firtina et al. (published at https://academic.oup.com/bioinformatics/article/39/Supplement_1/i297/7210440)

    Language: C
  • CMU-SAFARI/BLEND

    BLEND is a mechanism that can efficiently find fuzzy seed matches between sequences to significantly improve the performance and accuracy while reducing the memory space usage of two important applications: 1) finding overlapping reads and 2) read mapping. Described by Firtina et al. (published in NARGAB https://doi.org/10.1093/nargab/lqad004)

    Language: C
  • ilarsf/gwasTools

    Basic and fast GWAS functions for QQ and Manhattan plots (incl. gene names)

    Language: R
  • brandonsaldan/codex

    A minimal genetic data explorer that processes all genetic information locally.

    Language: JavaScript
  • MiraldiLab/maxATAC

    Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks

    Language: Python
  • pievos101/PopGenome

    An Efficient Swiss Army Knife for Population Genomic Analyses in R

    Language: R
  • Zhuxitong/ppsPCP

    A Plant Presence/absence Variants Scanner and Pan-genome Construction Pipeline

    Language: Perl
  • NBChub/bgcflow

    Snakemake workflow for the analysis of biosynthetic gene clusters across large collections of genomes (pangenomes)

    Language: Python
  • pmenzel/score-assemblies

    Snakemake workflow for scoring and comparing multiple bacterial genome assemblies (Illumina, Nanopore) to reference genome(s)

    Language: Python
  • cccnrc/plot-VCF

    visual analysis of your VCF files

    Language: R
  • biocommons/bioutils

    provides common tools and lookup tables used primarily by the hgvs and uta packages

    Language: Python
  • Genometric/MSPC

    Using combined evidence from replicates to evaluate ChIP-seq peaks

    Language: C#
  • robinvanderlee/positive-selection

    Scripts and procedures for detecting positively selected genes and codons in primates

    Language: Perl
  • wiedenhoeft/HaMMLET

    Fast Bayesian Hidden Markov Model with Wavelet Compression

    Language: C++
  • RayDebashree/metaUSAT

    metaUSAT is a data-adaptive statistical approach for testing genetic associations of multiple traits from single/multiple studies using univariate GWAS summary statistics.

    Language: R
  • SouradiptoC/CodonU

    A python project for analysis of codon usage for gene or genome analysis

    Language: Python
  • leaemiliepradier/PlasForest

    A random forest classifier to identify contigs of plasmid origin in contig and scaffold genomes

    Language: Python
  • WGLab/NanoRepeat

    NanoRepeat: fast and accurate analysis of Short Tandem Repeats (STRs) from Oxford Nanopore sequencing data

    Language: Python
  • AmpliconSuite/AmpliconReconstructorOM

    Reconstructs complex variation using Bionano optical mapping data and breakpoint graph data

    Language: Python
  • boulderrinnlab/CLASS_2021

    Statistical and computational analysis of the human genome

    Language: R
  • biocommons/anyvar

    [in development] Proof-of-Concept variation translation, validation, and registration service

    Language: Python
  • CMU-SAFARI/Genome-on-Diet

    Genome-on-Diet is a fast and memory-frugal framework for exemplifying sparsified genomics for read mapping, containment search, and metagenomic profiling. It is much faster & more memory-efficient than minimap2 for Illumina, HiFi, and ONT reads. Described by Alser et al. (preliminary version: https://arxiv.org/abs/2211.08157).

    Language: Roff
  • genenotebook/genenotebook

    A collaborative notebook for genes and genomes

    Language: JavaScript