genome-analysis

There are 257 repositories under the genome-analysis topic.

MAGICS-LAB/DNABERT_2
[ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome
Language: Shell 294 10 11967
marbl/Winnowmap
Long read / genome alignment software
Language: C 266 15 4723
biocommons/hgvs
Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
Language: Python 248 20 63094
mbhall88/rasusa
Randomly subsample sequencing reads or alignments
Language: Rust 216 6 4017
TobyBaril/EarlGrey
Earl Grey: A fully automated TE curation and annotation pipeline
Language: Python 150 6 12920
asntech/intervene
Intervene: a tool for intersection and visualization of multiple genomic region and gene sets
Language: Python 136 6 6128
mcveanlab/mccortex
De novo genome assembly and multisample variant calling
Language: C 113 15 9125
fmalmeida/bacannot
Generic but comprehensive pipeline for prokaryotic genome annotation and interrogation with interactive reports and shiny app.
Language: Nextflow 100 4 719
ganlab/GALA
Long-reads Gap-free Chromosome-scale Assembler
Language: Python 76 4 3317
kishwarshafin/helen
H.E.L.E.N. (Homopolymer Encoded Long-read Error-corrector for Nanopore)
Language: Python 70 6 279
moshi4/COGclassifier
A tool for classifying prokaryote protein sequences into COG (Cluster of Orthologous Genes) functional categories
Language: Python 59 2 16
lynnlangit/TeamTeri
Bioinformatics on GCP, AWS or Azure
Language: Shell 53 11 217
CMU-SAFARI/RawHash
RawHash can accurately and efficiently map raw nanopore signals to reference genomes of varying sizes (e.g., from viral to human genomes) in real time without basecalling. Described by Firtina et al. (published at https://academic.oup.com/bioinformatics/article/39/Supplement_1/i297/7210440).
Language: C 50 8 105
CMU-SAFARI/BLEND
BLEND is a mechanism that can efficiently find fuzzy seed matches between sequences to significantly improve the performance and accuracy while reducing the memory space usage of two important applications: 1) finding overlapping reads and 2) read mapping. Described by Firtina et al. (published in NARGAB https://doi.org/10.1093/nargab/lqad004)
Language: C 42 12 64
biocommons/biocommons.seqrepo
non-redundant, compressed, journalled, file-based storage for biological sequences
Language: Python 40 9 11035
NBChub/bgcflow
Snakemake workflow for the analysis of biosynthetic gene clusters across large collections of genomes (pangenomes)
Language: Python 36 2 2149
pievos101/PopGenome
An Efficient Swiss Army Knife for Population Genomic Analyses in R
Language: R 32 1 165
ilarsf/gwasTools
Basic and fast GWAS functions for QQ and Manhattan plots (incl. gene names)
Language: R 30 3 57
cccnrc/plot-VCF
visual analysis of your VCF files
Language: R 29 1 93
brandonsaldan/codex
A minimal genetic data explorer that processes all genetic information locally.
Language: JavaScript 28 3 11
MiraldiLab/maxATAC
Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks
Language: Python 27 6 769
Zhuxitong/ppsPCP
A Plant Presence/absence Variants Scanner and Pan-genome Construction Pipeline
Language: Perl 26 1 614
linyuiz/zgtools
zgtools: A pipeline that allows for the convenient acquisition of T2T (Telomere-to-Telomere) genomes.
Language: HTML 25 2 13
LaborBerlin/score-assemblies
Snakemake workflow for scoring and comparing multiple bacterial genome assemblies (Illumina, Nanopore) to reference genome(s).
Language: Python 24 2 01
biocommons/bioutils
provides common tools and lookup tables used primarily by the hgvs and uta packages
Language: Python 21 5 3718
Genometric/MSPC
Using combined evidence from replicates to evaluate ChIP-seq peaks
Language: C# 20 4 3010
RayDebashree/metaUSAT
metaUSAT is a data-adaptive statistical approach for testing genetic associations of multiple traits from single/multiple studies using univariate GWAS summary statistics.
Language: R 19 5 04
robinvanderlee/positive-selection
Scripts and procedures for detecting positively selected genes and codons in primates
Language: Perl 19 2 016
SouradiptoC/CodonU
A python project for analysis of codon usage for gene or genome analysis
Language: Python 19 3 71
leaemiliepradier/PlasForest
A random forest classifier to identify contigs of plasmid origin in contig and scaffold genomes
Language: Python 17 1 166
WGLab/NanoRepeat
NanoRepeat: fast and accurate analysis of Short Tandem Repeats (STRs) from Oxford Nanopore sequencing data
Language: Python 17 11 141
wiedenhoeft/HaMMLET
Fast Bayesian Hidden Markov Model with Wavelet Compression
Language: C++ 17 4 33
AmpliconSuite/AmpliconReconstructorOM
Reconstructs complex variation using Bionano optical mapping data and breakpoint graph data
Language: Python 16 3 66
boulderrinnlab/CLASS_2021
Statistical and computational analysis of the human genome
Language: R 14 4 07
biocommons/anyvar
[in development] Proof-of-Concept variation translation, validation, and registration service
Language: Python 13 8 626
wurmlab/genomicscourse
For QMUL's Genome Bioinformatics MSc module BIO721P & SIB's Spring school in bioinfo & population genomics
Language: HTML 12 10 8420
