jianshu93
Probabilistic data structures in Bioinformatics and Computational Biology. Collaboration with @jean-pierreBoth
University of California, San Diego
San Diego
jianshu93's Stars
microsoft/mimalloc
mimalloc is a compact general purpose allocator with excellent performance.
timescale/pgvectorscale
A complement to pgvector for high performance, cost efficient vector search on large workloads.
DeepGraphLearning/graphvite
GraphVite: A General and High-performance Graph Embedding System
purpleprotocol/mimalloc_rust
A Rust wrapper over Microsoft's MiMalloc memory allocator
ingowald/cudaKDTree
ParAlg/gbbs
GBBS: Graph Based Benchmark Suite
cmuparlay/ParlayANN
A library of algorithms for approximate nearest neighbor search in high dimensions, along with a set of useful tools for designing such algorithms.
TutteInstitute/evoc
Embedding Vector Oriented Clustering
vsbuffalo/granges
A Rust library and command line tool for working with genomic ranges and their data.
jackh726/bigtools
A high-performance BigWig and BigBed library in Rust
natir/yacrd
Yet Another Chimeric Read Detector
gtonkinhill/fastbaps
A fast approximation to a Dirichlet Process Mixture model (DPM) for clustering genetic data
natir/fpa
Filter of Pairwise Alignment
broadinstitute/poasta
Fast and exact gap-affine partial order alignment
Lyn-liyuan/ndarray-cuda-matmul
A high-performance computing solution designed to accelerate matrix operations using Nvidia's CUDA technology with Rust's ndarray data structure.
natir/biotest
Generate random test data for bioinformatics
RagnarGrootKoerkamp/minimizers
Reference implementations of minimizer schemes to go with the mod-minimizers paper.
inbalpaz/CLANS
CLANS_2 is a Python-based program for clustering sequences in 2D or 3D space, based on their sequence similarities. CLANS visualizes the dynamic clustering process and enables the user to interactively control it and explore the cluster map in various ways.
Daniel-Liu-c0deb0t/dlb-kmer-sampling
Optimal distance lower bound k-mer sampling.
THU-numbda/SketchNE
Embedding billion-scale networks accurately in one hour (TKDE paper 2023)
bluenote-1577/sce-aligner
A basic seed-chain-extend aligner with linear-gap cost chaining and quadratic time extension for experiments
geon0325/VilLain
Source code for WWW 2024 paper "VilLain: Self-Supervised Learning on Homogeneous Hypergraphs without Features via Virtual Label Propagation."
hiql/imohash
Fast hashing for large files
OrensteinLab/DecyclingSetBasedMinimizerOrder
Code and software for minimum-decycling-set-based minimizer orders
Shao-Group/SubseqHash2
Fast, SIMD-accelerated, multi-set seeding algorithm for error-prone sequences
KoslickiLab/prokrustean
bluenote-1577/basic_seed_chainer
dyxstat/ImputeCC
ImputeCC enhances integrative Hi-C-based metagenomic binning through constrained random-walk-based imputation
natir/dynseq
A dynamic representation of DNA sequence.
pegesund/pgvectorscale
A complement to pgvector for high performance, cost efficient vector search on large workloads.