Pinned Repositories
DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
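The core mechanic of a Mixture-of-Experts model like DeepSeek-V2 is routing each token to a small subset of experts. A minimal sketch of top-k gating, assuming per-token expert logits are already computed (function and variable names are illustrative, not DeepSeek's actual routing code):

```python
import numpy as np

def topk_route(logits, k):
    """Toy top-k MoE router: pick the k highest-scoring experts
    for one token and renormalize their gate weights with a
    softmax over just those k logits."""
    idx = np.argsort(logits)[::-1][:k]          # indices of the k best experts
    g = np.exp(logits[idx] - logits[idx].max()) # numerically stable softmax
    return idx, g / g.sum()
```

The token's output is then the gate-weighted sum of the selected experts' outputs; the other experts are never evaluated, which is what makes MoE economical at inference time.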
diffq
DiffQ performs differentiable quantization using pseudo quantization noise. It automatically tunes the number of bits used per weight or group of weights to achieve a given trade-off between model size and accuracy.
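The idea behind pseudo quantization noise can be sketched in a few lines: rounding weights to discrete levels is non-differentiable, so during training one instead adds uniform noise with the same scale as the rounding error, keeping the bit width a quantity gradients can flow through. A toy sketch, assuming a symmetric uniform quantizer (names are illustrative, not DiffQ's API):

```python
import numpy as np

def pseudo_quantize(w, bits, rng):
    """Simulate b-bit uniform quantization of weights `w` by adding
    noise drawn from U(-delta/2, delta/2), where delta is the
    quantization step of a symmetric quantizer over [-max|w|, max|w|]."""
    s = np.abs(w).max()
    delta = 2 * s / (2 ** bits - 1)   # step size of the b-bit grid
    noise = rng.uniform(-delta / 2, delta / 2, size=w.shape)
    return w + noise                  # differentiable w.r.t. w (and bits, if relaxed)
```

Fewer bits means a larger `delta`, hence more noise and more accuracy loss, which is exactly the size/accuracy trade-off the description refers to.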
involution
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
onnxruntime-inference-examples
Examples for using ONNX Runtime for machine learning inferencing.
vision
Datasets, Transforms and Models specific to Computer Vision
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
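Much of vLLM's memory efficiency comes from paging the KV cache into fixed-size blocks, so a sequence claims physical memory one block at a time instead of reserving a contiguous maximum-length buffer up front. A toy allocator sketch of that idea (class and method names are hypothetical, not vLLM's API):

```python
class BlockAllocator:
    """Toy paged KV-cache allocator: each sequence maps to a list of
    physical block ids, claimed lazily as tokens arrive."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free = list(range(num_blocks))  # pool of physical blocks
        self.tables = {}                     # seq_id -> list of block ids

    def append_token(self, seq_id, pos):
        # Claim a new physical block only when the sequence crosses a
        # block boundary; otherwise the token lands in the last block.
        table = self.tables.setdefault(seq_id, [])
        if pos % self.block_size == 0:
            table.append(self.free.pop())
        return table[-1], pos % self.block_size  # (block id, offset in block)

    def release(self, seq_id):
        # Finished sequences return their blocks to the pool immediately.
        self.free.extend(self.tables.pop(seq_id, []))
```

Because blocks are recycled as soon as a sequence finishes, many requests can share the GPU's KV memory with little fragmentation, which is what enables high-throughput batching.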
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
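The attention-sink idea is a cache-keeping policy: retain the KV entries of the first few tokens (the "sinks", which attention scores concentrate on) plus a sliding window of the most recent tokens, and evict everything in between. A minimal sketch of which cache positions survive (function name is hypothetical):

```python
def kept_positions(cache_len, n_sink, window):
    """Positions retained in the KV cache under a sink + sliding-window
    policy: the first n_sink tokens and the last `window` tokens."""
    if cache_len <= n_sink + window:
        return list(range(cache_len))  # nothing to evict yet
    return list(range(n_sink)) + list(range(cache_len - window, cache_len))
```

The cache therefore stays at a fixed size of `n_sink + window` entries no matter how long the stream runs, which is what makes the decoding cost constant per token.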
SWE-bench
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
mamba
Mamba: a selective state-space model (SSM) architecture for efficient sequence modeling
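At its core, an SSM layer runs a linear recurrence over the sequence: h_t = A·h_{t-1} + B·x_t, y_t = C·h_t. A minimal sequential scan for a diagonal SSM (illustrative only; it omits Mamba's input-dependent selectivity and hardware-aware parallel scan):

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Sequential scan of a diagonal linear state-space model.
    x: (T,) input sequence; A, B, C: (N,) per-state parameters.
    h_t = A * h_{t-1} + B * x_t ;  y_t = C . h_t
    """
    h = np.zeros(A.shape[0])
    y = np.empty(x.shape[0])
    for t in range(x.shape[0]):
        h = A * h + B * x[t]   # elementwise: A is a diagonal transition
        y[t] = C @ h           # readout
    return y
```

With A = 1, B = C = e_1 this recurrence reduces to a running sum, which shows how the state carries information across arbitrarily long contexts in O(1) memory per step.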
hsm1997's Repositories
hsm1997/diffq
DiffQ performs differentiable quantization using pseudo quantization noise. It automatically tunes the number of bits used per weight or group of weights to achieve a given trade-off between model size and accuracy.
hsm1997/involution
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
hsm1997/onnxruntime-inference-examples
Examples for using ONNX Runtime for machine learning inferencing.
hsm1997/vision
Datasets, Transforms and Models specific to Computer Vision
hsm1997/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs