amsyuan

amsyuan's Stars

lucidrains/progen
Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax
Language:Python11117
katholt/srst2
Short Read Sequence Typing for Bacterial Pathogens
Language:Python12566
lc222/char-cnn-text-classification-tensorflow
Replication of the paper "Character-level Convolutional Networks for Text Classification"
Language:Python7238
ShanlinKe/COVID-19
Dissecting the Role of the Human Microbiome in COVID-19 via Metagenome-assembled Genomes
Language:R63
RasmussenLab/phamb
Downstream processing of VAMB binning for Viral Elucidation
Language:Python468
RasmussenLab/vamb
Variational autoencoder for metagenomic binning
Language:Python26547
dominicp6/ImmuneConstrainedVAE
Dozens of vaccines protecting against SARS-CoV-2 have now been approved for public use, yet there remains a high risk that the virus evolves to escape vaccine protection. This motivates the need for a new generation of vaccines that can protect against a wider gamut of a virus’s evolutionarily accessible states, not just the currently circulating strains. Computational methods such as sequence generative models can play a critical role in mapping out this state space. In particular, they can be used to screen thousands of examples of viral proteins that might pose a high risk of vaccine escape. In this work, we take steps towards such a computational method by designing and evaluating a conditional Variational Autoencoder (VAE) capable of selectively generating SARS-CoV-2 spike proteins with low immune visibility. The model is trained on 65,000 of the most common wild-type SARS-CoV-2 sequences and uses NetMHCpan to estimate levels of exposure to human T cell immunity. The model's generated sequences are compared with those derived from two simpler generative models: a random mutator and an 11-gram language model. We discover that although all three models are able to generate stable, structurally valid sequences, only the VAE model can generate low-immunogenicity sequences sampled from a distribution that interpolates smoothly along the principal variance directions of natural sequences.
Language:Jupyter Notebook2
salesforce/progen
Official release of the ProGen models
Language:Python627117
debbiemarkslab/EVcouplings
Evolutionary couplings from protein and RNA sequence alignments
Language:Jupyter Notebook24975
lambdal/deeplearning-benchmark
Benchmark Suite for Deep Learning
Language:Shell25250
tseemann/snippy
✂️ ⚡ Rapid haploid variant calling and core genome alignment
Language:Perl488117
biobakery/biobakery
bioBakery tools for meta'omic profiling
Language:Shell26475
asadprodhan/GPU-accelerated-guppy-basecalling
GPU-accelerated guppy basecalling and demultiplexing on Linux
Language:Shell173
gencorefacility/covid19
Variant Analysis Pipeline for COVID19
Language:Nextflow41
appliedmicrobiologyresearch/covgap
Genome mapping, consensus generation, variant calling and annotation tool for SARS-CoV-2
Language:Python12
nextstrain/nextclade_data
Datasets for https://github.com/nextstrain/nextclade
Language:Python3228
hsnguyen/assembly
Streaming assembly for MinION data
Language:Java254
mdcao/npScarf
263
neherlab/treetime
Maximum likelihood inference of time stamped phylogenies and ancestral reconstruction
Language:Jupyter Notebook23257
theosanderson/chronumental
Estimating time trees from very large phylogenies
Language:Python256
nextstrain/ncov
Nextstrain build for novel coronavirus SARS-CoV-2
Language:Python1.4k403
PoonLab/covizu
Rapid analysis and visualization of coronavirus genome variation
Language:JavaScript4620
maximilianh/multiSub
Prepares a SARS-CoV-2 submission for GISAID, NCBI or ENA. Can read GISAID or NCBI files, or plain fasta+tsv/csv/xls. Finds files in input directory and merges everything into a single output directory. Auto-detects input file formats. Can submit the results to multiple repositories from the command line.
Language:Python352
cov-lineages/pangolin
Software package for assigning SARS-CoV-2 genome sequences to global lineages.
Language:Python430106
epi2me-labs/wf-artic
ARTIC SARS-CoV-2 workflow and reporting
Language:Nextflow4935
LPDI-EPFL/trivalent_cocktail
Language:Roff104
JianiC/rsv
Nextstrain build for Human Respiratory Syncytial Virus
Language:Python32
salvatoreloguercio/cov2vec
cov2vec is a systematic effort to obtain SARS-CoV-2 genome embeddings by encoding viral genomes with protein language models.
Language:Python1
brianhie/viral-mutation
Language modeling of viral evolution
Language:Python13844
happywlu/CroTrait
A portable tool for in silico species identification, serotyping and multilocus sequence typing of Cronobacter genus
Language:Python34