Pinned Repositories
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
CursorCore
CursorCore: Assist Programming through Aligning Anything
CursorWeb
CursorWeb: Implement popular features of Cursor in the browser.
Deepseek-Coder-MoE
Sparse Deepseek-Coder.
Jamba.c
LightBinPack
A lightweight library for solving packing problems in LLM training
Mixtral.c
Typst-Coder
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
TechxGenus's Repositories
TechxGenus/CursorCore
CursorCore: Assist Programming through Aligning Anything
TechxGenus/Typst-Coder
TechxGenus/CursorWeb
CursorWeb: Implement popular features of Cursor in the browser.
TechxGenus/LightBinPack
A lightweight library for solving packing problems in LLM training
TechxGenus/LightDPO
TechxGenus/aider
aider is AI pair programming in your terminal
TechxGenus/continue
⏩ Continue is the leading open-source AI code assistant. Connect any model and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains.
TechxGenus/DeepSeek-R1
TechxGenus/Janus
TechxGenus/RadixFlexAttention
TechxGenus/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
TechxGenus/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
TechxGenus/DeepSeek-V2-Utils
TechxGenus/DeepSeek-V3
TechxGenus/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
TechxGenus/flash-attention
Fast and memory-efficient exact attention
TechxGenus/Liger-Kernel
Efficient Triton Kernels for LLM Training
TechxGenus/llama.vscode
VS Code extension for local LLM-assisted code/text completion
TechxGenus/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
TechxGenus/numba
NumPy-aware dynamic Python compiler using LLVM
TechxGenus/Pattention
TechxGenus/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
TechxGenus/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
TechxGenus/ring-flash-attention
Ring attention implementation with flash attention
TechxGenus/sglang
SGLang is a fast serving framework for large language models and vision language models.
TechxGenus/TinyZero
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
TechxGenus/torchtitan
A PyTorch native library for large model training
TechxGenus/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
TechxGenus/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
TechxGenus/vscode
Visual Studio Code