genome-analysis

There are 257 repositories under the genome-analysis topic.

  • MAGICS-LAB/DNABERT_2

    [ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome

    Language: Shell
  • marbl/Winnowmap

    Long read / genome alignment software

    Language: C
  • biocommons/hgvs

    Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`

    Language: Python
  • mbhall88/rasusa

    Randomly subsample sequencing reads or alignments

    Language: Rust
  • TobyBaril/EarlGrey

    Earl Grey: A fully automated TE curation and annotation pipeline

    Language: Python
  • asntech/intervene

    Intervene: a tool for intersection and visualization of multiple genomic region and gene sets

    Language: Python
  • mcveanlab/mccortex

    De novo genome assembly and multisample variant calling

    Language: C
  • fmalmeida/bacannot

    Generic but comprehensive pipeline for prokaryotic genome annotation and interrogation with interactive reports and shiny app.

    Language: Nextflow
  • ganlab/GALA

    Long-reads Gap-free Chromosome-scale Assembler

    Language: Python
  • kishwarshafin/helen

    H.E.L.E.N. (Homopolymer Encoded Long-read Error-corrector for Nanopore)

    Language: Python
  • moshi4/COGclassifier

    A tool for classifying prokaryote protein sequences into COG (Cluster of Orthologous Genes) functional categories

    Language: Python
  • lynnlangit/TeamTeri

    Bioinformatics on GCP, AWS or Azure

    Language: Shell
  • CMU-SAFARI/RawHash

    RawHash can accurately and efficiently map raw nanopore signals to reference genomes of varying sizes (e.g., from viral to human genomes) in real time without basecalling. Described by Firtina et al. (published at https://academic.oup.com/bioinformatics/article/39/Supplement_1/i297/7210440).

    Language: C
  • CMU-SAFARI/BLEND

    BLEND is a mechanism that efficiently finds fuzzy seed matches between sequences, significantly improving performance and accuracy while reducing memory usage in two important applications: 1) finding overlapping reads and 2) read mapping. Described by Firtina et al. (published in NARGAB, https://doi.org/10.1093/nargab/lqad004).

    Language: C
  • biocommons/biocommons.seqrepo

    non-redundant, compressed, journalled, file-based storage for biological sequences

    Language: Python
  • NBChub/bgcflow

    Snakemake workflow for the analysis of biosynthetic gene clusters across large collections of genomes (pangenomes)

    Language: Python
  • pievos101/PopGenome

    An Efficient Swiss Army Knife for Population Genomic Analyses in R

    Language: R
  • ilarsf/gwasTools

    Basic and fast GWAS functions for QQ and Manhattan plots (incl. gene names)

    Language: R
  • cccnrc/plot-VCF

    visual analysis of your VCF files

    Language: R
  • brandonsaldan/codex

    A minimal genetic data explorer that processes all genetic information locally.

    Language: JavaScript
  • MiraldiLab/maxATAC

    Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks

    Language: Python
  • Zhuxitong/ppsPCP

    A Plant Presence/absence Variants Scanner and Pan-genome Construction Pipeline

    Language: Perl
  • linyuiz/zgtools

    zgtools: A pipeline that allows for the convenient acquisition of T2T (Telomere-to-Telomere) genomes.

    Language: HTML
  • LaborBerlin/score-assemblies

    Snakemake workflow for scoring and comparing multiple bacterial genome assemblies (Illumina, Nanopore) to reference genome(s).

    Language: Python
  • biocommons/bioutils

    provides common tools and lookup tables used primarily by the hgvs and uta packages

    Language: Python
  • Genometric/MSPC

    Using combined evidence from replicates to evaluate ChIP-seq peaks

    Language: C#
  • RayDebashree/metaUSAT

    metaUSAT is a data-adaptive statistical approach for testing genetic associations of multiple traits from single/multiple studies using univariate GWAS summary statistics.

    Language: R
  • robinvanderlee/positive-selection

    Scripts and procedures for detecting positively selected genes and codons in primates

    Language: Perl
  • SouradiptoC/CodonU

    A Python project for codon usage analysis of genes or genomes

    Language: Python
  • leaemiliepradier/PlasForest

    A random forest classifier to identify contigs of plasmid origin in contig and scaffold genomes

    Language: Python
  • WGLab/NanoRepeat

    NanoRepeat: fast and accurate analysis of Short Tandem Repeats (STRs) from Oxford Nanopore sequencing data

    Language: Python
  • wiedenhoeft/HaMMLET

    Fast Bayesian Hidden Markov Model with Wavelet Compression

    Language: C++
  • AmpliconSuite/AmpliconReconstructorOM

    Reconstructs complex variation using Bionano optical mapping data and breakpoint graph data

    Language: Python
  • boulderrinnlab/CLASS_2021

    Statistical and computational analysis of the human genome

    Language: R
  • biocommons/anyvar

    [in development] Proof-of-Concept variation translation, validation, and registration service

    Language: Python
  • wurmlab/genomicscourse

    For QMUL's Genome Bioinformatics MSc module BIO721P & SIB's Spring school in bioinfo & population genomics

    Language: HTML