/Biotech-Research-Hub

Welcome to the Bioinformatics and Biotechnology Resources Hub — a comprehensive repository designed to empower researchers, scientists, and enthusiasts in the fields of bioinformatics and biotechnology. This repository serves as a one-stop destination for a myriad of invaluable resources, ranging from insightful guides and cutting-edge research papers to informative podcasts, web tools, and efficient workflows.

As the list of resources continues to grow, use Ctrl+F to search for keywords. You can also press Ctrl and - (minus) to zoom out and see more results on screen.

Feel free to customize the hub according to your preferences and needs by forking the repository to your profile. If you wish to contribute to the maintenance or updates of the repository, please create a pull request.

Key Features:

  1. Workflow Managers (e.g., Nextflow): Streamline computational analyses with powerful workflow managers.
  2. Resources for Starting a Company: Tailored resources to navigate biotech entrepreneurship and kickstart ventures.
  3. Podcasts: Stay informed and inspired with expert discussions on the latest trends and breakthroughs.
  4. Informatic Tools/Languages/Workflows/Libraries: A comprehensive toolkit for bioinformatics projects, including languages and workflows.
  5. Models/Software/Pipelines: Implement cutting-edge models, software solutions, and pipelines for advanced research.
  6. Web-Tools: Curated online resources and tools to simplify and enhance bioinformatics workflows.
  7. Books: Diverse collection covering various bioinformatics and biotechnology topics.
  8. Blogs: Stay connected with industry trends, opinions, and insights through curated blog recommendations.
  9. Blog Articles: In-depth explorations of specific topics with practical tips and insights.
  10. Papers: An extensive archive of research papers spanning genomics, proteomics, and bioinformatics algorithms.
  11. Tutorials/Guides: Step-by-step tutorials and guides suitable for all skill levels.
  12. Wet Lab: Resources tailored to experimental research and methodologies in wet lab environments.
  13. Communities/Forums/Learning Platforms: Connect, collaborate, and learn through vibrant communities, forums, and dedicated platforms.

Single-cell RNA-seq

Title field Link
Orchestrating Single-Cell Analysis with Bioconductor cell https://bioconductor.org/books/release/OSCA/
Analysis of single cell RNA-seq data cell https://www.singlecellcourse.org/index.html
HGNC cell https://www.genenames.org/
Seurat cell https://satijalab.org/seurat/articles/get_started
Scanpy cell https://scanpy.readthedocs.io/en/stable/
single cell best practices cell https://www.sc-best-practices.org/preamble.html
Complete single-cell RNAseq analysis walkthrough - scanpy, Advanced introduction cell https://www.youtube.com/watch?v=uvyG9yLuNSE&t=2561s
How to analyze single-cell RNA-Seq data in R, Detailed Seurat Workflow Tutorial cell https://www.youtube.com/watch?v=5HBzgsz8qyk

Workflow managers

Title field Link
Nextflow cell https://www.nextflow.io/, https://github.com/pmb59/bioinformatics-workflows/tree/master/nextflow, https://www.nextflow.io/docs/latest/index.html, https://nf-co.re/
Galaxy cell https://github.com/pmb59/bioinformatics-workflows/tree/master/galaxy, https://training.galaxyproject.org/training-material/topics/introduction/tutorials/galaxy-intro-101/tutorial.html, https://training.galaxyproject.org/training-material/, https://docs.galaxyproject.org/en/master/
Snakemake cell https://github.com/pmb59/bioinformatics-workflows/tree/master/snakemake, https://snakemake.readthedocs.io/en/stable/tutorial/tutorial.html, https://snakemake.readthedocs.io/en/stable/
SciPipe cell https://github.com/pmb59/bioinformatics-workflows/tree/master/scipipe, https://scipipe.org/writing_workflows/, https://scipipe.org/
GenPipes cell https://github.com/pmb59/bioinformatics-workflows/tree/master/genpipes, https://genpipes.readthedocs.io/en/master/tutorials/list_tutorials.html, https://genpipes.readthedocs.io/en/master/index.html
Bpipe cell https://github.com/pmb59/bioinformatics-workflows/tree/master/bpipe

Resources for starting a company

Title field Link
biotech resources cell https://github.com/crazyhottommy/biotech_resource
Companies Innovating in Biotech cell https://www.biotech2k.com/companies/companies.html
So You Want to Start a Biotech: A Bioinformatics Approach That Works (blog post by Michele Busby) cell https://michelebusby.tumblr.com/post/643211974587629568/so-you-want-to-start-a-biotech-a-bioinformatics
Nextflow cell https://medium.com/23andme-engineering/introduction-to-nextflow-4d0e3b6768d1
How Novo Nordisk built a modern data architecture on AWS cell https://aws.amazon.com/blogs/big-data/how-novo-nordisk-built-a-modern-data-architecture-on-aws/
zettlr cell https://zettlr.com/
Modern biotech data infrastructure cell http://blog.booleanbiotech.com/biotech-data-infrastructure.html
The digital biotech startup playbook cell https://medium.com/@jfeala/the-digital-biotech-startup-playbook-398aeafca8a4

Library repos

Title field Link
Bancroft Library · Oral History Center · Projects; Bioscience and Biotechnology cell https://www.lib.berkeley.edu/visit/bancroft/oral-history-center/projects/bioscience

Podcasts

Title field Link
OncoPharm: latest developments in oncology cell https://podcasts.apple.com/us/podcast/oncopharm/id1305345744
The Long Run (Luke Timmerman) cell https://podcasts.apple.com/us/podcast/the-long-run-with-luke-timmerman/id1282838969
Mendelspod (diagnostics, genetics and genomic medicine) cell https://mendelspod.com/
STAT’s weekly biotech podcast, breaking down the latest news, digging deep into industry goings-on cell https://www.statnews.com/category/readout-loud/
the bioinformatics chat cell https://open.spotify.com/show/1adLiZOHtLtrnx6MScTvAX

Databases

Title field Link
UniProt cell
IEDB cell
UniRef cell
InterPro cell
PDB cell
VEuPathDB cell
Reactome pathway database https://reactome.org/

Informatic Tools/Workflows/languages

Title field Link
Python cell https://www.python.org/
R cell https://www.r-project.org/
Bioconductor cell https://www.bioconductor.org/
PyTorch: Deep learning library for python cell https://pytorch.org/
D3: The JavaScript library for bespoke data visualization cell https://d3js.org/
How to make your research data more FAIR cell https://howtofair.dk/

Python libraries

Title field Link
iFeature: generation of protein descriptors cell https://github.com/Superzchen/iFeature
pandas cell
numpy cell
rdkit cell
pytorch cell
tensorflow cell
keras cell
scikit learn cell

R libraries

Title field Link
ggplate create simple plots of biological culture plates as well as microplates. https://github.com/jpquast/ggplate
Tidyverse collection of R packages designed for data science https://www.tidyverse.org/
ggplot2 graphics https://ggplot2.tidyverse.org/
caret set of functions that attempt to streamline the process for creating predictive models. https://topepo.github.io/caret/
patchwork combine separate ggplots into the same graphic https://patchwork.data-imaginist.com/
ggstatsplot 'ggplot2' Based Plots with Statistical Details https://indrajeetpatil.github.io/ggstatsplot/
ggExtra collection of functions and layers to enhance ggplot2 https://cran.r-project.org/web/packages/ggExtra/vignettes/ggExtra.html
ggrepel geoms for ggplot2 to repel overlapping text labels https://ggrepel.slowkow.com/articles/examples.html
ggthemes Provides 'ggplot2' themes and scales that replicate the look of plots by Edward Tufte, Stephen Few, 'Fivethirtyeight', 'The Economist', 'Stata', 'Excel', and 'The Wall Street Journal', among others https://yutannihilation.github.io/allYourFigureAreBelongToUs/ggthemes/
clusterProfiler A universal enrichment tool for interpreting omics data https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html, https://guangchuangyu.github.io/software/clusterProfiler/
pheatmap A function to draw clustered heatmaps. https://www.rdocumentation.org/packages/pheatmap/versions/1.0.12/topics/pheatmap
ComplexHeatmap visualize associations between different sources of data sets and reveal potential patterns. https://jokergoo.github.io/ComplexHeatmap-reference/book/
heatmaply similar heatmaps as d3heatmap, with the advantage of speed https://cran.r-project.org/web/packages/heatmaply/vignettes/heatmaply.html
bibliometrix instrument to pursue a complete bibliometric analysis, following the Science Mapping Workflow. https://www.bibliometrix.org/home/
cowplot simple add-on to ggplot. It provides various features that help with creating publication-quality figures. https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html
enrichplot visualization methods to help interpreting enrichment results https://yulab-smu.top/biomedical-knowledge-mining-book/enrichplot.html, https://bioconductor.org/packages/release/bioc/html/enrichplot.html
bbmle Methods and functions for fitting maximum likelihood models https://cran.r-project.org/web/packages/bbmle/index.html
MaAsLin2 determining multivariable association between clinical metadata and microbial meta-omics features https://github.com/biobakery/Maaslin2
Seurat single cell genomics, differential expression https://satijalab.org/seurat/
deseq2 Differential gene expression analysis based on the negative binomial distribution https://bioconductor.org/packages/release/bioc/html/DESeq2.html, https://lashlock.github.io/compbio/R_presentation.html, https://bioconductor.org/packages/devel/bioc/manuals/DESeq2/man/DESeq2.pdf, https://genviz.org/module-04-expression/0004/02/01/DifferentialExpression/, https://bioc.ism.ac.jp/packages/2.14/bioc/vignettes/DESeq2/inst/doc/beginner.pdf, https://med.und.edu/research/genomics-core/_files/docs/deseq2-handout.pdf, https://blog.bioturing.com/2022/06/02/the-basics-of-deseq2-a-powerful-tool-in-differential-expression-analysis-for-single-cell-rna-seq/, https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html, https://joey711.github.io/phyloseq-extensions/DESeq2.html
edgeR differential expression in RNA-Seq https://bioconductor.org/packages/release/bioc/html/edgeR.html, https://bioconductor.org/packages/devel/bioc/vignettes/Glimma/inst/doc/single_cell_edger.html
ggridges partially overlapping line plots that create the impression of a mountain range (changes in distributions over time or space) https://cran.r-project.org/web/packages/ggridges/vignettes/introduction.html
conflicted provide an alternative way of resolving conflicts caused by ambiguous function names https://www.tidyverse.org/blog/2018/06/conflicted/
ggpubr ‘ggplot2’ Based Publication Ready Plots https://rpkgs.datanovia.com/ggpubr/
forestplot allows for multiple confidence intervals per row, custom fonts for each text element, custom confidence intervals, text mixed with expressions, and more. https://cran.r-project.org/web/packages/forestplot/index.html

Large protein language models

Title field Link
ESM2 cell
protT5 cell

Models/software/pipelines

Title field Link
DeepFRI: Protein function prediction cell https://github.com/flatironinstitute/DeepFRI
MSA Transformer: Generate synthetic proteins statistically similar cell https://doi.org/10.7554/eLife.79854
GenProSeq: Generating Protein Sequences with Deep Generative Models cell
ESM models (META/facebook): Evolutionary Scale Modeling. cell
EvoMIL: Prediction of virus-host association using protein language models and multiple instance learning cell
ExamPle: Explainable deep learning framework for the prediction of plant small secreted peptides. cell
SuMD: Supervised Molecular Dynamics Simulations. cell
S4PRED: A tool for accurate prediction of a protein's secondary structure from only its amino acid sequence with no evolutionary information i.e. MSA required cell https://github.com/psipred/s4pred
LinearDesign: Algorithm for Optimized mRNA Design Improves Stability and Immunogenicity cell https://github.com/LinearDesignSoftware/LinearDesign
gINTomics is an R package for Multi-Omics data integration and visualization cell https://github.com/angelovelle96/gINTomics
GSEA Gene Set Enrichment Analysis https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html
Enformer: Effective gene expression prediction from sequence by integrating long-range interactions cell https://www.nature.com/articles/s41592-021-01252-x

Web-tools

Title field Link
VOSviewer: software tool for constructing and visualizing bibliometric networks Content Cell
Consensus: Evidence-Based Answers, Faster Content Cell
Connected papers: Explore connected papers in a visual graph cell
Enrichr comprehensive gene set enrichment analysis web server https://maayanlab.cloud/Enrichr/
STRING Protein-Protein Interaction Networks, Functional Enrichment Analysis https://string-db.org/

Books

Title field Link
Data Analysis in Medicine and Health using R Content Cell https://bookdown.org/drki_musa/dataanalysis/

Blogs

Title field Link
codon cell https://www.readcodon.com/
Liam's Blog cell https://liambai.com/
single cell best practices cell https://www.sc-best-practices.org/preamble.html

Blog articles

Title field Link
AlphaFold 2 is here: what’s behind the structure prediction miracle cell https://www.blopig.com/blog/2021/07/alphafold-2-is-here-whats-behind-the-structure-prediction-miracle/
The big problems cell https://www.science.org/content/blog-post/big-problems
Why we didn’t get a malaria vaccine sooner Content Cell https://worksinprogress.co/issue/why-we-didnt-get-a-malaria-vaccine-sooner
How to represent a protein sequence Content Cell https://liambai.com/protein-representation/
What we can learn from evolving proteins Content Cell https://liambai.com/protein-evolution/

Papers

Title field Link
Generative models for protein structure: A comparison between Generative Adversarial and Autoregressive networks. cell https://webthesis.biblio.polito.it/15944/

Tutorials/guides

Title field Link
Introduction to RNA-Seq using high-performance computing Content Cell https://hbctraining.github.io/Intro-to-rnaseq-hpc-O2/
Introduction to Single-Cell Analysis with Bioconductor Content Cell https://bioconductor.org/books/release/OSCA/
Orchestrating Single-Cell Analysis with Bioconductor Authors: Robert Amezquita [aut], Aaron Lun [aut], Stephanie Hicks [aut], Raphael Gottardo [aut], Alan cell
Development of an Online Laboratory Handbook and Remote Workflow for Chemistry and Pharmacy Master’s Students to Undertake Computer-Aided Drug Design. cell
Contact maps cell
TeachOpenCADD: Virtual screening. cell https://projects.volkamerlab.org/teachopencadd/all_talktorials.html, https://volkamerlab.org/projects/teachopencadd/
GROMACS tutorial cell http://www.mdtutorials.com/gmx/
ESM models (META/facebook): Evolutionary Scale Modeling. cell
The Art of Command Line cell https://github.com/crazyhottommy/the-art-of-command-line
RNA-seq analysis cell https://github.com/crazyhottommy/RNA-seq-analysis, https://bioinformatics-core-shared-training.github.io/cruk-summer-school-2018/RNASeq2018/html/06_Gene_set_testing.nb.html
intermediate python cell https://github.com/crazyhottommy/intermediatePython
gene set enrichment analysis in R cell https://bioinformaticsbreakdown.com/how-to-gsea/, https://brb.nci.nih.gov/BRB-ArrayTools/RPackagesAndManuals/GSEA-vignettes.html, https://rpubs.com/jrgonzalezISGlobal/enrichment, https://learn.gencore.bio.nyu.edu/rna-seq-analysis/gene-set-enrichment-analysis/ , https://www.biostars.org/p/467197/, https://sbc.shef.ac.uk/workshops/2019-01-14-rna-seq-r/rna-seq-gene-set-testing.nb.html, https://www.sc-best-practices.org/conditions/gsea_pathway.html
Reactome enrichment analysis cell http://yulab-smu.top/biomedical-knowledge-mining-book/reactomepa.html
Analysis of single cell RNA-seq data cell https://www.singlecellcourse.org/, https://hbctraining.github.io/scRNA-seq/, https://learn.gencore.bio.nyu.edu/single-cell-rnaseq/, https://training.galaxyproject.org/training-material/topics/single-cell/tutorials/scrna-intro/slides-plain.html
Next-Generation Sequencing Analysis Resources cell https://learn.gencore.bio.nyu.edu/
single cell best practices cell https://www.sc-best-practices.org/preamble.html, https://github.com/theislab/single-cell-best-practices/tree/development
Bioinformatics Education and Tutorials cell https://bioinformaticshome.com/bioinformatics_tutorials/tutorials_main.html, https://www.hadriengourle.com/tutorials/, https://ghtf.biochem.uci.edu/bioinformatics-tutorials/, https://www.melbournebioinformatics.org.au/tutorials/, https://libguides.marquette.edu/c.php?g=36753&p=233518, https://bioinformatics.uconn.edu/resources-and-events/tutorials-2/, https://abacus.bates.edu/bioinformatics1/, https://www.embl.org/ells/teachingbase/dna-barcoding-resource/bioinformatics-tutorials/
NAMD and VMD for molecular dynamics cell http://www.ks.uiuc.edu/Training/Tutorials/#namd, http://www.ks.uiuc.edu/Training/Tutorials/#vmd
NGS: illumina cell https://emea.illumina.com/science/technology/next-generation-sequencing/beginners/tutorials.html
Introduction to Long-Read Data Analysis cell https://timkahlke.github.io/LongRead_tutorials/
Galaxy Training cell https://training.galaxyproject.org/
bioinfo guided tutorial cell https://www.angelfire.com/ga2/nestsite2/bioinform.html, http://lectures.molgen.mpg.de/
Getting started with bioinformatics cell https://riptutorial.com/bioinformatics
Free online courses in Bioinformatics cell https://ethz.ch/content/dam/ethz/special-interest/biol/department/Bioinformatics%20courses.pdf
NGS alignment and variant calling cell https://github.com/ekg/alignment-and-variant-calling-tutorial
bioinformatics workshops cell https://bioinformatics.ca/workshops/previous-workshops/
SIB tutorials about bioinformatics cell https://edu.sib.swiss/
Microarray Analysis cell http://barc.wi.mit.edu/education/bioinfo2007/arrays/
Kaggle's tutorials on machine learning, deep learning, data analysis cell https://www.kaggle.com/learn
pytorch tutorials cell https://pytorch.org/tutorials/
pytorch basics cell https://pytorch.org/tutorials/beginner/basics/intro.html, https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
Advanced: Making Dynamic Decisions and the Bi-LSTM CRF cell https://pytorch.org/tutorials/beginner/nlp/advanced_tutorial.html
RNA-seq cell https://scienceparkstudygroup.github.io/rna-seq-lesson/

Interesting labs

Title field Link
volkamer lab cell https://volkamerlab.org/research/

Wet lab

Title field Link
basic methods cell https://www.youtube.com/@csberg5856/videos

Communities

Title field Link
Rosalind (forum and platform for learning bioinformatics) cell https://rosalind.info/problems/list-view/, solutions: https://github.com/crazyhottommy/rosalind_problems_python_solutions
biostars cell https://www.biostars.org/