BioGeek
Staff Research Engineer at @instadeepai – Machine Learning for personalized cancer vaccines, de novo peptide sequencing and signal peptides.
@instadeepai · Cape Town, South Africa
Pinned Repositories
50-examples
Contains the text of a book describing interesting examples of Python programming for use in teaching. An alternate title was "The Python Teaching Cookbook".
adventofcode
My solutions to Advent of Code in Python
aima
Python implementation of algorithms from Russell and Norvig's Artificial Intelligence: A Modern Approach.
euler
My solutions to Project Euler
hackathon_indaba_senegal_2024
hackathon_indabaX_2024
Data and starter notebooks for the locust breeding ground prediction hackathon
hackathon_IndabaX_2025_mlip
magic
Scanner for decks of cards with bar codes printed on card edges
InstaNovo
De novo peptide sequencing with InstaNovo: Accurate, database-free peptide identification for large-scale proteomics experiments
winnow
Confidence control and FDR estimation for de novo peptide sequencing
BioGeek's Repositories
BioGeek/euler
My solutions to Project Euler
BioGeek/hackathon_IndabaX_2025_mlip
BioGeek/os-proteomics
Viewpoint on open source in proteomics
BioGeek/bioconda-recipes
Conda recipes for the bioconda channel.
BioGeek/bpe-match
BioGeek/deep-representation-learning-book
Learning Deep Representations of Data Distributions
BioGeek/DEgym
An LLM-friendly framework for translating dynamical equations to gymnasium-compatible RL environments.
BioGeek/DeNovo_Benchmark
Benchmarking several SOTA de novo sequencing tools on the metaproteomics dataset
BioGeek/denovo_benchmarks
BioGeek/example-get-started-experiments
Get started DVC project
BioGeek/flash-attention
Fast and memory-efficient exact attention
BioGeek/GlycoNovo
BioGeek/gpt-oss
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
BioGeek/InstaGeo-E2E-Geospatial-ML
A Python package for end-to-end geospatial machine learning using multispectral earth observation data such as NASA HLS.
BioGeek/InstaNovo
BioGeek/MassNet-DDA
MassNet-DDA
BioGeek/metalic
Meta in-context learning for protein fitness prediction
BioGeek/mlip
Library for efficient training and application of Machine Learning Interatomic Potentials (MLIP)
BioGeek/ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
BioGeek/mojo-gpu-puzzles
Learn GPU Programming in Mojo🔥 by Solving Puzzles
BioGeek/mzpeaks
A Rust crate for mass spectral peaks
BioGeek/nanochat
The best ChatGPT that $100 can buy.
BioGeek/Ninikske
BioGeek/nucleotide-transformer
🧬 Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics
BioGeek/PDV
PDV: an integrative proteomics data viewer
BioGeek/protein-structure-tokenizer
Official implementation of "Learning the language of protein structures"
BioGeek/psi-ms-CV
HUPO-PSI mass spectrometry CV
BioGeek/SimMS
GPU-Accelerated Cosine Similarity for Tandem Mass Spectrometry
BioGeek/staged-recipes
A place to submit conda recipes before they become fully fledged conda-forge feedstocks
BioGeek/torchtitan
A PyTorch native platform for training generative AI models