ferdiko

ferdi.kossmann@gmail.com

Cambridge, MA

Pinned Repositories

exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
Language: Python | 00
fastmoe
A fast MoE impl for PyTorch
Language: Python | 00
FlexFlow
A distributed deep learning framework that supports flexible parallelization strategies.
Language: C++ | 0 0 00
MI-for-NAS
Mutual information for fine-grained network analysis in neural architecture search
0 1 00
MMST-spelling-correction
A novel, context-sensitive spelling corrector that uses clusterings in GloVe embeddings as a learned notion of context.
Language: Python | 0 1 00
video-etl
Supplementary material to paper "Extract-Transform-Load for Video Streams"
Language: Python | 61
FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Language: C++ | 1.7k 33 653223
lab0
Go tutorial for the MIT DB class
Language: Go | 2 14 09
brad
A virtualization layer for cloud data infrastructures.
Language: Python | 4 7 1701
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python | 27.7k 227 4.7k 4.1k
